PREFACE


The importance of discrete and combinatorial mathematics has increased dramatically 
within the last few years. The purpose of the Handbook of Discrete and Combinatorial 
Mathematics is to provide a comprehensive reference volume for computer scientists, 
engineers, mathematicians, and others, such as students, physical and social scientists, 
and reference librarians, who need information about discrete and combinatorial math- 
ematics. 


This book is the first resource that presents such information in a ready-reference form 
designed for use by all those who use aspects of this subject in their work or studies. 
The scope of this book includes the many areas generally considered to be parts of 
discrete mathematics, focusing on the information considered essential to its application 
in computer science and engineering. Some of the fundamental topic areas covered 
include: 


logic and set theory 
enumeration 
integer sequences 
recurrence relations 
generating functions 
number theory 
abstract algebra 
linear algebra 
discrete probability theory
graph theory
trees
network sequences 
combinatorial designs 
computational geometry 
coding theory and cryptography 
discrete optimization 
automata theory 
data structures and algorithms. 


Format 

The material in the Handbook is presented so that key information can be located 
and used quickly and easily. Each chapter includes a glossary that provides succinct 
definitions of the most important terms from that chapter. Individual topics are cov- 
ered in sections and subsections within chapters, each of which is organized into clearly 
identifiable parts: definitions, facts, and examples. The definitions included are care- 
fully crafted to help readers quickly grasp new concepts. Important notation is also 
highlighted in the definitions. Lists of facts include: 

• information about how material is used and why it is important 

• historical information 

• key theorems 

• the latest results 

• the status of open questions 

• tables of numerical values, generally not easily computed 

• summary tables 

• key algorithms in an easily understood pseudocode 

• information about algorithms, such as their complexity 

• major applications 

• pointers to additional resources, including websites and printed material. 

Facts are presented concisely and are listed so that they can be easily found and
understood. Extensive cross-references linking parts of the Handbook are also provided.
Readers who want to study a topic further can consult the resources listed. 

The material in the Handbook has been chosen for inclusion primarily because it is 
important and useful. Additional material has been added to ensure comprehensiveness 
so that readers encountering new terminology and concepts from discrete mathematics 
in their explorations will be able to get help from this book. 

Examples are provided to illustrate some of the key definitions, facts, and algorithms. 
Some curious and entertaining facts and puzzles that some readers may find intriguing 
are also included. 

Each chapter of the book includes a list of references divided into a list of printed 
resources and a list of relevant websites. 

How This Book Was Developed 

The organization and structure of the Handbook were developed by a team which in- 
cluded the chief editor, three associate editors, the project editor, and the editor from 
CRC Press. This team put together a proposed table of contents which was then ana- 
lyzed by members of a group of advisory editors, each an expert in one or more aspects 
of discrete mathematics. These advisory editors suggested changes, including the cover- 
age of additional important topics. Once the table of contents was fully developed, the 
individual sections of the book were prepared by a group of more than 70 contributors 
from industry and academia who understand how this material is used and why it is 
important. Contributors worked under the direction of the associate editors and chief 
editor, with these editors ensuring consistency of style and clarity and comprehensive- 
ness in the presentation of material. Material was carefully reviewed by authors and 
our team of editors to ensure accuracy and consistency of style. 

The CRC Press Series on Discrete Mathematics and Its Applications 

This Handbook is designed to be a ready reference that covers many important distinct 
topics. People needing information in multiple areas of discrete and combinatorial 
mathematics need only have this one volume to obtain what they need or for pointers 
to where they can find out more information. Among the most valuable sources of 
additional information are the volumes in the CRC Press Series on Discrete Mathematics 
and Its Applications. This series includes both Handbooks, which are ready references, 
and advanced Textbooks/Monographs. More detailed and comprehensive coverage in 
particular topic areas can be found in these individual volumes: 

Handbooks 

• The CRC Handbook of Combinatorial Designs 

• Handbook of Discrete and Computational Geometry 

• Handbook of Applied Cryptography 

Textbooks/Monographs

• Graph Theory and its Applications 

• Algebraic Number Theory 

• Quadratics 

• Design Theory

• Frames and Resolvable Designs: Uses, Constructions, and Existence

• Network Reliability: Experiments with a Symbolic Algebra Environment 

• Fundamental Number Theory with Applications 

• Cryptography: Theory and Practice 

• Introduction to Information Theory and Data Compression 

• Combinatorial Algorithms: Generation, Enumeration, and Search 


Feedback 

To see updates and to provide feedback and errata reports, please consult the Web page 
for this book. This page can be accessed by first going to the CRC website at 

http://www.crcpress.com

and then following the links to the Web page for this book. 

BIOGRAPHIES


Victor J. Katz 


Niels Henrik Abel (1802-1829), born in Norway, was self-taught and studied the 
works of many mathematicians. When he was nineteen years old, he proved that 
there is no closed formula for solving the general fifth degree equation. He also 
worked in the areas of infinite series and elliptic functions and integrals. The term 
abelian group was coined in Abel’s honor in 1870 by Camille Jordan. 

Abraham ibn Ezra (1089-1164) was a Spanish-Jewish poet, philosopher, astrologer, 
and biblical commentator who was born in Tudela, but spent the latter part of 
his life as a wandering scholar in Italy, France, England, and Palestine. It was in 
an astrological text that ibn Ezra developed a method for calculating numbers of 
combinations, in connection with determining the number of possible conjunctions of 
the seven “planets” (including the sun and the moon). He gave a detailed argument 
for the cases n = 7, k = 2 to 7, of a rule which can easily be generalized to the modern
formula $C(n, k) = \sum_{i=k-1}^{n-1} C(i, k-1)$. Ibn Ezra also wrote a work on arithmetic in
which he introduced the Hebrew-speaking community to the decimal place-value 
system. He used the first nine letters of the Hebrew alphabet to represent the first 
nine numbers, used a circle to represent zero, and demonstrated various algorithms 
for calculation in this system. 
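
The summation rule attributed to ibn Ezra above, which writes C(n, k) as the sum of C(i, k-1) for i from k-1 to n-1, can be checked numerically. The following is a minimal sketch in Python; the function name is hypothetical and chosen only for illustration, and math.comb serves as the reference value.

```python
from math import comb


def combinations_by_summation(n: int, k: int) -> int:
    """Compute C(n, k) as the sum of C(i, k-1) for i = k-1, ..., n-1."""
    return sum(comb(i, k - 1) for i in range(k - 1, n))


# The cases n = 7, k = 2 to 7 mentioned in the text.
for k in range(2, 8):
    assert combinations_by_summation(7, k) == comb(7, k)
```

For example, with n = 7 and k = 3 the sum is 1 + 3 + 6 + 10 + 15 = 35, which agrees with C(7, 3).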

Aristotle (384-322 B.C.E.) was the most famous student at Plato’s academy in Athens. 
After Plato’s death in 347 B.C.E., he was invited to the court of Philip II of Macedon
to educate Philip’s son Alexander, who soon thereafter began his successful
conquest of the Mediterranean world. Aristotle himself returned to Athens, where 
he founded his own school, the Lyceum, and spent the remainder of his life writing 
and lecturing. He wrote on numerous subjects, but is perhaps best known for his 
works on logic, including the Prior Analytics and the Posterior Analytics. In these 
works, Aristotle developed the notion of logical argument, based on several explicit 
principles. In particular, he built his arguments out of syllogisms and concluded that 
demonstrations using his procedures were the only certain way of attaining scientific 
knowledge. 

Emil Artin (1898-1962) was born in Vienna and in 1921 received a Ph.D. from the Uni- 
versity of Leipzig. He held a professorship at the University of Hamburg until 1937, 
when he came to the United States. In the U.S. he taught at the University of Notre 
Dame, Indiana University, and Princeton. In 1958 he returned to the University 
of Hamburg. Artin’s mathematical contributions were in number theory, algebraic 
topology, linear algebra, and especially in many areas of abstract algebra. 

Charles Babbage (1792-1871) was an English mathematician best known for his invention
of two of the earliest computing machines, the Difference Engine, designed
to calculate polynomial functions, and the Analytical Engine, a general purpose calculating
machine. The Difference Engine was designed to use the idea that the nth
order differences in nth degree polynomials were always constant and then to work
backwards from those differences to the original polynomial values. Although Babbage
received a grant from the British government to help in building the Engine, he
never was able to complete one because of various difficulties in developing machine 
parts of sufficient accuracy. In addition, Babbage became interested in his more 
advanced Analytical Engine. This latter device was to consist of a store, in which
the numerical variables were kept, and a mill, in which the operations were performed.
The entire machine was to be controlled by instructions on punched cards.
Unfortunately, although Babbage made numerous engineering drawings of sections 
of the Analytical Engine and gave a series of seminars in 1840 on its workings, he 
was never able to build a working model. 
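
The principle behind the Difference Engine, that a polynomial can be tabulated by additions alone once its leading differences are known, can be sketched as follows. This is an illustrative Python sketch and not a model of Babbage's mechanism; the function names and the sample polynomial x^2 + x + 41 are assumptions made for the example.

```python
def leading_differences(values):
    """Return [f(0), Δf(0), Δ²f(0), ...] from consecutive values f(0), f(1), ..."""
    leading = []
    row = list(values)
    while row:
        leading.append(row[0])
        # Next row holds the differences of adjacent entries.
        row = [b - a for a, b in zip(row, row[1:])]
    return leading


def tabulate(leading, count):
    """Produce `count` consecutive values using only additions."""
    state = list(leading)
    out = []
    for _ in range(count):
        out.append(state[0])
        # Each entry absorbs the difference one order higher;
        # the last (constant) difference never changes.
        for i in range(len(state) - 1):
            state[i] += state[i + 1]
    return out


# Seed with three values of the quadratic x^2 + x + 41 (a degree-n
# polynomial needs n + 1 seed values).
seed = [x * x + x + 41 for x in range(3)]
table = tabulate(leading_differences(seed), 6)
```

Here the seed values 41, 43, 47 give leading differences 41, 2, 2, and the additions reproduce 41, 43, 47, 53, 61, 71 without any multiplication.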

Paul Gustav Heinrich Bachmann (1837-1920) studied mathematics at the University
of Berlin and at Göttingen. In 1862 he received a doctorate in group theory and
held positions at the universities at Breslau and Münster. He wrote several volumes
on number theory, introducing the big-O notation in his 1892 book. 

John Backus (born 1924) received bachelor’s and master’s degrees in mathematics 
from Columbia University. He led the group at IBM that developed FORTRAN. 
He was a developer of ALGOL, using the Backus-Naur form for the syntax of the 
language. He received the National Medal of Science in 1974 and the Turing Award 
in 1977. 

Abu-l-’Abbas Ahmad ibn Muhammad ibn al-Banna al-Marrakushi (1256- 
1321) was an Islamic mathematician who lived in Marrakech in what is now Morocco. 
Ibn al-Banna developed the first known proof of the basic combinatorial formulas, 
beginning by showing that the number of permutations of a set of n elements was n! 
and then developing in a careful manner the multiplicative formula to compute the 
values for the number of combinations of k objects in a set of n. Using these two 
results, he also showed how to calculate the number of permutations of k objects from 
a set of n. The formulas themselves had been known in the Islamic world for many 
years, in connection with specific problems like calculating the number of words of 
a given length which could be formed from the letters of the Arabic alphabet. Ibn 
al-Banna’s main contribution, then, was to abstract the general idea of permutations 
and combinations out of the various specific problem situations considered earlier. 
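
The relationships described above (n! permutations of n elements, a multiplicative formula for combinations, and permutations of k objects obtained from the two) can be illustrated with a short Python sketch. The function names are hypothetical, and the step-by-step product is one common way of writing the multiplicative formula, not a reconstruction of ibn al-Banna's own presentation.

```python
from math import factorial


def combinations(n: int, k: int) -> int:
    """Multiplicative formula: C(n, k) = n(n-1)...(n-k+1) / (k(k-1)...1)."""
    result = 1
    for j in range(1, k + 1):
        # After step j, result equals C(n - k + j, j), so the division is exact.
        result = result * (n - k + j) // j
    return result


def permutations(n: int, k: int) -> int:
    """Arrangements of k objects from n: choose the k objects, then order them."""
    return combinations(n, k) * factorial(k)


# Three-letter "words" with distinct letters from a 28-letter alphabet.
three_letter_sets = combinations(28, 3)
three_letter_words = permutations(28, 3)
```

With 28 letters, there are 3276 ways to choose three distinct letters and 19656 ways to arrange three distinct letters in order.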

Thomas Bayes (1702-1761), an English Nonconformist, wrote an Introduction to the
Doctrine of Fluxions in 1736 as a response to Berkeley’s Analyst with its severe crit- 
icism of the foundations of the calculus. He is best known, however, for attempting 
to answer the basic question of statistical inference in his An Essay Towards Solving 
a Problem in the Doctrine of Chances, published three years after his death. That 
basic question is to determine the probability of an event, given empirical evidence 
that it has occurred a certain number of times in a certain number of trials. To do 
this, Bayes gave a straightforward definition of probability and then proved that for 
two events E and F, the probability of E given that F has happened is the quo- 
tient of the probability of both E and F happening divided by the probability of F 
alone. By using areas to model probability, he was then able to show that, if x is the 
probability of an event happening in a single trial, if the event has happened p times 
in n trials, and if 0 < r < s < 1, then the probability that x is between r and s is
given by the quotient of two integrals. Although in principle these integrals can be 
calculated, there has been a great debate since Bayes’ time about the circumstances 
under which his formula gives an appropriate answer. 

James Bernoulli (Jakob I) (1654-1705) was one of eight mathematicians in three
generations of his family. He was born in Basel, Switzerland, studied theology in
addition to mathematics and astronomy, and entered the ministry. In 1682 he began
to lecture at the University of Basel in natural philosophy and mechanics. He became
professor at the University of Basel in 1687, and remained there until his death. His 
research included the areas of the calculus of variations, probability, and analytic 
geometry. His most well-known work is Ars Conjectandi, in which he described 
results in combinatorics and probability, including applications to gambling and the 
law of large numbers; this work also contained a reprint of the first formal treatise 
in probability, written in 1657 by Christiaan Huygens. 

Bhaskara (1114-1185), the most famous of medieval Indian mathematicians, gave a 
complete algorithmic solution to the Pell equation $Dx^2 \pm 1 = y^2$. That equation had
been studied by several earlier Indian mathematicians as well. Bhaskara served much 
of his adult life as the head of the astronomical observatory at Ujjain, some 300 miles 
northeast of Bombay, and became widely respected for his skills in astronomy and the 
mechanical arts, as well as mathematics. Bhaskara’s mathematical contributions are 
chiefly found in two chapters, the Lilavati and the Bijaganita, of a major astronomical 
work, the Siddhantasiromani. These include techniques of solving systems of linear
equations with more unknowns than equations as well as the basic combinatorial 
formulas, although without any proofs. 

George Boole (1815-1864) was an English mathematician most famous for his work 
in logic. Born the son of a cobbler, he had to struggle to educate himself while 
supporting his family. But he was so successful in his self-education that he was able 
to set up his own school before he was 20 and was asked to give lectures on the work 
of Isaac Newton. In 1849 he applied for and was appointed to the professorship in 
mathematics at Queen’s College, Cork, despite having no university degree. In 1847, 
Boole published a small book, The Mathematical Analysis of Logic, and seven years 
later expanded it into An Investigation of the Laws of Thought. In these books, Boole 
introduced what is now called Boolean algebra as part of his aim to “investigate the 
fundamental laws of those operations of the mind by which reasoning is performed; 
to give expression to them in the symbolical language of a Calculus, and upon this 
foundation to establish the science of Logic and construct its method.” In addition 
to his work on logic, Boole wrote texts on differential equations and on difference 
equations that were used in Great Britain until the end of the nineteenth century. 

William Burnside (1852-1927), born in London, graduated from Cambridge in 1875, 
and remained there as lecturer until 1885. He then went to the Royal Naval College 
at Greenwich, where he stayed until he retired. Although he published much in 
applied mathematics, probability, and elliptic functions, he is best known for his 
extensive work in group theory (including the classic book Theory of Groups). His 
conjecture that groups of odd order are solvable was proved by Walter Feit and John 
Thompson and published in 1963. 

Georg Ferdinand Ludwig Philip Cantor (1845-1918) was born in Russia to Danish 
parents, received a Ph.D. in number theory in 1867 at the University of Berlin, and 
in 1869 took a position at Halle University, where he remained until his retirement. 
He is regarded as a founder of set theory. He was interested in theology and the 
nature of the infinite. His work on the convergence of Fourier series led to his study 
of certain types of infinite sets of real numbers, and ultimately to an investigation 
of transfinite numbers. 

Augustin-Louis Cauchy (1789-1857), the most prolific mathematician of the nineteenth
century, is most famous for his textbooks in analysis written in the 1820s for
use at the École Polytechnique, textbooks which became the model for calculus texts
for the next hundred years. Although born in the year the French Revolution began,
Cauchy was a staunch conservative. When the July Revolution of 1830 led to the
overthrow of the last Bourbon king, Cauchy refused to take the oath of allegiance to 
the new king and went into a self-imposed exile in Italy and then in Prague. He did 
not return to his teaching posts until the Revolution of 1848 led to the removal of 
the requirement of an oath of allegiance. Among the many mathematical subjects 
to which he contributed besides calculus were the theory of matrices, in which he 
demonstrated that every symmetric matrix can be diagonalized by use of an orthog- 
onal substitution, and the theory of permutations, in which he was the earliest to 
consider these from a functional point of view. In fact, he used a single letter, say S, 
to denote a permutation and $S^{-1}$ to denote its inverse and then noted that the
powers $S, S^2, S^3, \ldots$ of a given permutation on a finite set must ultimately result
in the identity. He also introduced the current notation $(a_1 a_2 \ldots a_n)$ to denote the
cyclic permutation on the letters $a_1, a_2, \ldots, a_n$.
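
Cauchy's observations about permutations treated as functions can be illustrated with a small Python sketch. Permutations are represented here as tuples on {0, ..., n-1}; this representation and all function names are assumptions made for the example.

```python
def compose(p, q):
    """Return the permutation that applies q first, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    """Return the permutation undoing p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def order(p):
    """Smallest k >= 1 such that the k-th power of p is the identity."""
    identity = tuple(range(len(p)))
    power, k = p, 1
    while power != identity:
        power = compose(p, power)
        k += 1
    return k


def cycle(letters, n):
    """The cyclic permutation (a_1 a_2 ... a_m) acting on {0, ..., n-1}."""
    p = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        p[a] = b
    return tuple(p)


s = cycle([0, 1, 2], 5)                # the 3-cycle (0 1 2) on five letters
t = compose(s, cycle([3, 4], 5))       # product of (0 1 2) and (3 4)
```

The loop in order always terminates on a finite set, which is the content of Cauchy's remark: here the powers of s return to the identity after 3 steps and those of t after 6.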

Arthur Cayley (1821-1895), although graduating from Trinity College, Cambridge 
as Senior Wrangler, became a lawyer because there were no suitable mathematics 
positions available at that time in England. He produced nearly 300 mathematical 
papers during his fourteen years as a lawyer, and in 1863 was named Sadlerian profes- 
sor of mathematics at Cambridge. Among his numerous mathematical achievements 
are the earliest abstract definition of a group in 1854, out of which he was able to 
calculate all possible groups of order up to eight, and the basic rules for operating 
with matrices, including a statement (without proof) of the Cayley-Hamilton theo- 
rem that every matrix satisfies its characteristic equation. Cayley also developed the 
mathematical theory of trees in an article in 1857. In particular, he dealt with the 
notion of a rooted tree, a tree with a designated vertex called a root, and developed
a recursive formula for determining the number of different rooted trees in terms of 
its branches (edges). In 1874, Cayley applied his results on trees to the study of 
chemical isomers. 

Pafnuty Lvovich Chebyshev (1821-1894) was a Russian who received his master’s 
degree in 1846 from Moscow University. From 1860 until 1882 he was a professor at 
the University of St. Petersburg. His mathematical research in number theory dealt 
with congruences and the distribution of primes; he also studied the approximation 
of functions by polynomials. 

Avram Noam Chomsky (born 1928) received a Ph.D. in linguistics at the University 
of Pennsylvania. For many years he has been a professor of foreign languages and 
linguistics at M.I.T. He has made many contributions to the study of linguistics 
and the study of grammars. 

Chrysippus (280-206 B.C.E.) was a Stoic philosopher who developed some of the ba- 
sic principles of the propositional logic, which ultimately replaced Aristotle’s logic of 
syllogisms. He was born in Cilicia, in what is now Turkey, but spent most of his life 
in Athens, and is said to have authored more than 700 treatises. Among his other 
achievements, Chrysippus analyzed the rules of inference in the propositional calcu- 
lus, including the rules of modus ponens, modus tollens, the hypothetical syllogism, 
and the alternative syllogism. 

Alonzo Church (1903-1995) studied under Hilbert at Göttingen, was on the faculty
at Princeton from 1927 until 1967, and then held a faculty position at UCLA. He
was a founding member of the Association for Symbolic Logic. He made many contributions
in various areas of logic and the theory of algorithms, and stated the
Church-Turing thesis (if a problem can be solved with an effective algorithm, then
the problem can be solved by a Turing machine).

George Dantzig (born 1914) is an American mathematician who formulated the general
linear programming problem of maximizing a linear objective function subject
to several linear constraints and developed the simplex method of solution in 1947. 
His study of linear programming grew out of his World War II service as a mem- 
ber of Air Force Project SCOOP (Scientific Computation of Optimum Programs), 
a project chiefly concerned with resource allocation problems. After the war, linear 
programming was applied to numerous problems, especially military and economic 
ones, but it was not until such problems could be solved on a computer that the real 
impact of their solution could be felt. The first successful solution of a major linear 
programming problem on a computer took place in 1952 at the National Bureau of 
Standards. After he left the Air Force, Dantzig worked for the Rand Corporation 
and then served as a professor of operations research at Stanford University. 

Richard Dedekind (1831-1916) was born in Brunswick, in northern Germany, and 
received a doctorate in mathematics at Göttingen under Gauss. He held positions
at Göttingen and in Zürich before returning to the Polytechnikum in Brunswick.
Although at various times he could have received an appointment to a major Ger- 
man university, he chose to remain in his home town where he felt he had sufficient 
freedom to pursue his mathematical research. Among his many contributions was 
his invention of the concept of ideals to resolve the problem of the lack of unique 
factorization in rings of algebraic integers. Even though the rings of integers them- 
selves did not possess unique factorization, Dedekind showed that every ideal is either 
prime or uniquely expressible as the product of prime ideals. Dedekind published 
this theory as a supplement to the second edition (1871) of Dirichlet’s Vorlesungen
über Zahlentheorie, of which he was the editor. In the supplement, he also gave one
of the first definitions of a field, confining this concept to subsets of the complex 
numbers. 

Abraham deMoivre (1667-1754) was born into a Protestant family in Vitry, France,
a town about 100 miles east of Paris, and studied in Protestant schools up to the age 
of 14. Soon after the revocation of the Edict of Nantes in 1685 made life very difficult 
for Protestants in France, however, he was imprisoned for two years. He then left 
France for England, never to return. Although he was elected to the Royal Society 
in 1697, in recognition of a paper on “A method of raising an infinite Multinomial 
to any given Power or extracting any given Root of the same”, he never achieved a 
university position. He made his living by tutoring and by solving problems arising 
from games of chance and annuities for gamblers and speculators. DeMoivre’s major
mathematical work was The Doctrine of Chances (1718, 1736, 1756), in which he 
devised methods for calculating probabilities by use of binomial coefficients. In 
particular, he derived the normal approximation to the binomial distribution and, 
in essence, invented the notion of the standard deviation. 

Augustus DeMorgan (1806-1871) graduated from Trinity College, Cambridge in 
1827. He was the first mathematics professor at University College in London, where 
he remained on the faculty for 30 years. He founded the London Mathematical Soci- 
ety. He wrote over 1000 articles and textbooks in probability, calculus, algebra, set 
theory, and logic (including DeMorgan’s laws, an abstraction of the duality principle 
for sets). He gave a precise definition of limit, developed tests for convergence of 
infinite series, and gave a clear explanation of the Principle of Mathematical Induc- 
tion. 

René Descartes (1596-1650) left school at 16 and went to Paris, where he studied
mathematics for two years. In 1616 he earned a law degree at the University of 
Poitiers. In 1617 he enlisted in the army and traveled through Europe until 1629, 
when he settled in Holland for the next 20 years. During this productive period of
his life he wrote on mathematics and philosophy, attempting to reduce the sciences 
to mathematics. In 1637 his Discours was published; this book contained the development
of analytic geometry. In 1649 he was invited to tutor Queen Christina
of Sweden in philosophy. There he soon died of pneumonia.

Leonard Eugene Dickson (1874-1954) was born in Iowa and in 1896 received the 
first Ph.D. in mathematics given by the University of Chicago, where he spent much 
of his faculty career. His research interests included abstract algebra (including the 
study of matrix groups and finite fields) and number theory. 

Diophantus (c. 250) was an Alexandrian mathematician about whose life little is 
known except what is reported in an epigram of the Greek Anthology (c. 500), from 
which it can be calculated that he lived to the age of 84. His major work, however,
the Arithmetica, has been extremely influential. Despite its title, this is a book on
algebra, consisting mostly of an organized collection of problems translatable into 
what are today called indeterminate equations, all to be solved in rational numbers. 
Diophantus introduced the use of symbolism into algebra and outlined the basic rules 
for operating with algebraic expressions, including those involving subtraction. It 
was in a note appended to Problem II-8 of the 1621 Latin edition of the Arithmetica
(to divide a given square number into two squares) that Pierre de Fermat first
asserted the impossibility of dividing an nth power (n > 2) into the sum of two nth 
powers. This result, now known as Fermat’s Last Theorem, was finally proved in 
1994 by Andrew Wiles. 

Charles Lutwidge Dodgson (1832-1898) is more familiarly known as Lewis Carroll, 
the pseudonym he used in writing his famous children’s works Alice in Wonderland 
and Through the Looking Glass. Dodgson graduated from Oxford University in 1854 
and the next year was appointed a lecturer in mathematics at Christ Church College, 
Oxford. Although he was not successful as a lecturer, he did contribute to four 
areas of mathematics: determinants, geometry, the mathematics of tournaments and 
elections, and recreational logic. In geometry, he wrote a five-act comedy, “Euclid 
and His Modern Rivals”, about a mathematics lecturer Minos in whose dreams Euclid
debates his Elements with various modernizers but always manages to demolish the 
opposition. He is better known, however, for his two books on logic, Symbolic 
Logic and The Game of Logic. In the first, he developed a symbolical calculus for 
analyzing logical arguments and wrote many humorous exercises designed to teach 
his methods, while in the second, he demonstrated a game which featured various 
forms of the syllogism. 

Eratosthenes (276-194 B.C.E.) was born in Cyrene (North Africa) and studied at
Plato’s Academy in Athens. He was tutor of the son of King Ptolemy III Euergetes 
in Alexandria and became chief librarian at Alexandria. He is recognized as the 
foremost scholar of his time and wrote in many areas, including number theory (his 
sieve for obtaining primes) and geometry. He introduced the concepts of meridians 
of longitude and parallels of latitude and used these to measure distances, including 
an estimation of the circumference of the earth. 
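
The sieve for obtaining primes mentioned above can be sketched in a few lines of Python. This is a standard modern formulation of the sieve of Eratosthenes, with a hypothetical function name, and is not drawn from any surviving text of Eratosthenes.

```python
def primes_up_to(limit: int) -> list[int]:
    """Return all primes <= limit by crossing out multiples of each prime."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]
```

Calling primes_up_to(30) returns 2, 3, 5, 7, 11, 13, 17, 19, 23, 29.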

Paul Erdős (1913-1996) was born in Budapest. At 21 he received a Ph.D. in mathematics
from Eötvös University. After leaving Hungary in 1934, he traveled extensively
throughout the world, with very few possessions and no permanent home,
working with other mathematicians in combinatorics, graph theory, number theory, 
and many other areas. He was author or coauthor of approximately 1500 papers 
with 500 coauthors. 

Euclid (c. 300 B.C.E.) is responsible for the most famous mathematics text of all time,
the Elements. Not only does this work deal with the standard results of plane 
geometry, but it also contains three chapters on number theory, one long chapter 
on irrational quantities, and three chapters on solid geometry, culminating with the 
construction of the five regular solids. The axiom-definition-theorem-proof style of 
Euclid’s work has become the standard for formal mathematical writing up to the 
present day. But about Euclid’s life virtually nothing is known. It is, however, 
generally assumed that he was among the first mathematicians at the Museum and 
Library of Alexandria, which was founded around 300 B.C.E. by Ptolemy I Soter,
the Macedonian general of Alexander the Great who became ruler of Egypt after 
Alexander’s death in 323 B.C.E. 

Leonhard Euler (1707-1783) was born in Basel, Switzerland and became one of the 
earliest members of the St. Petersburg Academy of Sciences. He was the most pro- 
lific mathematician of all time, making contributions to virtually every area of the 
subject. His series of analysis texts established many of the notations and methods 
still in use today. He created the calculus of variations and established the theory of 
surfaces in differential geometry. His study of the Königsberg bridge problem led to
the formulation and solution of one of the first problems in graph theory. He made 
numerous discoveries in number theory, including a detailed study of the properties 
of residues of powers and the first statement of the quadratic reciprocity theorem. 
He developed an algebraic formula for determining the number of partitions of an 
integer n into m distinct parts, each of which is in a given set A of distinct positive
integers. And in a paper of 1782, he even posed the problem of the existence of a 
pair of orthogonal latin squares: If there are 36 officers, one of each of six ranks from 
each of six different regiments, can they be arranged in a square in such a way that 
each row and column contains exactly one officer of each rank and one from each 
regiment? 

Kamal al-Din al-Farisi (died 1320) was a Persian mathematician most famous for his 
work in optics. In fact, he wrote a detailed commentary on the great optical work of 
Ibn al-Haytham. But al-Farisi also made major contributions to number theory. He 
produced a detailed study of the properties of amicable numbers (pairs of numbers 
in which the sum of the proper divisors of each is equal to the other). As part of this 
study, al-Farisi developed and applied various combinatorial principles. He showed 
that the classical figurate numbers (triangular, pyramidal, etc.) could be interpreted 
as numbers of combinations and thus helped to found the theory of combinatorics 
on a more abstract basis. 
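
The definition of amicable numbers given above can be checked mechanically. The following is a minimal sketch in Python; the function names are illustrative assumptions and are not taken from al-Farisi's work or from this Handbook.

```python
def proper_divisor_sum(n):
    """Return the sum of the divisors of n that are smaller than n."""
    if n < 2:
        return 0
    total = 1  # 1 is a proper divisor of every n >= 2
    d = 2
    while d * d <= n:
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
        d += 1
    return total


def are_amicable(a, b):
    """True if a and b are distinct and each equals the other's proper divisor sum."""
    return a != b and proper_divisor_sum(a) == b and proper_divisor_sum(b) == a


print(are_amicable(220, 284))  # True
```

Here 220 and 284 form the smallest amicable pair: the proper divisors of 220 sum to 284, and those of 284 sum to 220.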

Pierre de Fermat (1601-1665) was a lawyer and magistrate for whom mathematics 
was a pastime that led to contributions in many areas: calculus, number theory, 
analytic geometry, and probability theory. He received a bachelor’s degree in civil 
law in 1631, and from 1648 until 1665 was King’s Counsellor. He suffered an attack 
of the plague in 1652, and from then on he began to devote time to the study 
of mathematics. He helped give a mathematical basis to probability theory when, 
together with Blaise Pascal, he solved de Méré’s paradox: why is it more likely to roll a 6
at least once in four tosses of one die than to roll a double 6 at least once in 24 tosses of two dice?
He was a discoverer of analytic geometry and used infinitesimals to find tangent 
lines and determine maximum and minimum values of curves. In 1657 he published 
a series of mathematical challenges, including the conjecture that x^n + y^n = z^n has
no solution in positive integers if n is an integer greater than 2. He wrote in the 
margin of a book that he had a proof, but the proof would not fit in the margin. His 
conjecture was finally proved by Andrew Wiles in 1994. 
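
The two probabilities compared in de Méré's paradox follow from the complement rule: the chance of at least one success in k independent trials is 1 - (1 - p)^k. The following is a minimal sketch in Python, assuming fair dice; the helper name at_least_once is an illustrative assumption.

```python
from fractions import Fraction


def at_least_once(p_success, trials):
    """Probability of at least one success in independent trials."""
    return 1 - (1 - p_success) ** trials


# At least one 6 in four tosses of one fair die.
one_six_in_4 = at_least_once(Fraction(1, 6), 4)

# At least one double 6 in 24 tosses of two fair dice.
double_six_in_24 = at_least_once(Fraction(1, 36), 24)

print(float(one_six_in_4))      # about 0.5177
print(float(double_six_in_24))  # about 0.4914
```

The first probability is slightly above 1/2 and the second slightly below it, which is why the two bets behave differently.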

Fibonacci (Leonardo of Pisa) (c. 1175-c. 1250) was the son of a Mediterranean merchant
and government worker named Bonaccio (hence his name Filius Bonaccio, “son
of Bonaccio”). Fibonacci, born in Pisa and educated in Bougie (on the north coast 
of Africa where his father was administrator of Pisa’s trading post), traveled exten- 
sively around the Mediterranean. He is regarded as the greatest mathematician of 
the Middle Ages. In 1202 he wrote the book Liber Abaci, an extensive treatment 
of topics in arithmetic and algebra, and emphasized the benefits of Arabic numerals 
(which he knew about as a result of his travels around the Mediterranean). In this
book he also discussed the rabbit problem that led to the sequence that bears his 
name: 1, 1, 2, 3, 5, 8, 13, .... In 1225 he wrote the book Liber Quadratorum, studying 
second degree diophantine equations. 

Joseph Fourier (1768-1830), orphaned at the age of 9, was educated in the military 
school of his home town of Auxerre, 90 miles southeast of Paris. Although he hoped 
to become an army engineer, such a career was not available to him at the time 
because he was not of noble birth. He therefore took up a teaching position. Dur- 
ing the Revolution, he was outspoken in defense of victims of the Terror of 1794. 
Although he was arrested, he was released after the death of Robespierre and was 
appointed in 1795 to a position at the École Polytechnique. After serving in various
administrative posts under Napoleon, he was elected to the Académie des Sciences
and from 1822 until his death served as its perpetual secretary. It was in connection 
with his work on heat diffusion, detailed in his Analytic Theory of Heat of 1822, 
and, in particular, with his solution of the heat equation, that he
developed the concept of a Fourier series. Fourier also analyzed the relationship
between the series solution of a partial differential equation and an appropriate inte- 
gral representation and thereby initiated the study of Fourier integrals and Fourier 
transforms. 

Georg Frobenius (1849-1917) organized and analyzed the central ideas of the theory of 
matrices in his 1878 memoir “On linear substitutions and bilinear forms”. Frobenius
there defined the general notion of equivalent matrices. He also dealt with the 
special cases of congruent and similar matrices. Frobenius showed that when two 
symmetric matrices were similar, the transforming matrix could be taken to be 
orthogonal, one whose inverse equaled its transpose. He then made a detailed study 
of orthogonal matrices and showed that their eigenvalues were complex numbers 
of absolute value 1. He also gave the first complete proof of the Cayley-Hamilton 
theorem that a matrix satisfies its characteristic equation. Frobenius, a full professor 
in Zurich and later in Berlin, made his major mathematical contribution in the area 
of group theory. He was instrumental in developing the concept of an abstract group, 
as well as in investigating the theory of finite matrix groups and group characters. 

Évariste Galois (1811-1832) led a brief, tragic life which ended in a duel fought under
mysterious circumstances. He was born in Bourg-la-Reine, a town near Paris. He 
developed his mathematical talents early and submitted a memoir on the solvabil- 
ity of equations of prime degree to the French Academy in 1829. Unfortunately, 
the referees were never able to understand this memoir nor his revised version sub- 
mitted in 1831. Meanwhile, Galois became involved in the revolutionary activities 
surrounding the July revolution of 1830 and was arrested for threatening the life 
of King Louis-Philippe and then for wearing the uniform of a National Guard division
which had been dissolved because of its perceived threat to the throne. His
mathematics was not fully understood until fifteen years after his death when his 
manuscripts were finally published by Liouville in the Journal de mathématiques.
But Galois had in fact shown the relationship between subgroups of the group of
permutations of the roots of a polynomial equation and the various extension fields
generated by these roots, the relationship at the basis of what is now known as Galois 
theory. Galois also developed the notion of a finite field in connection with solving 
the problem of finding solutions to congruences F(x) ≡ 0 (mod p), where F(x) is a
polynomial of degree n and no residue modulo the prime p is itself a solution. 

Carl Friedrich Gauss (1777-1855), often referred to as the greatest mathematician
who ever lived, was born in Brunswick, Germany. He received a Ph.D. from the 
University of Helmstedt in 1799, proving the Fundamental Theorem of Algebra as 
part of his dissertation. At age 24 Gauss published his important work on number 
theory, the Disquisitiones Arithmeticae, a work containing not only an extensive
discussion of the theory of congruences, culminating in the quadratic reciprocity 
theorem, but also a detailed treatment of cyclotomic equations in which he showed 
how to construct regular n-gons by Euclidean techniques whenever n is prime and 
n - 1 is a power of 2. Gauss also made fundamental contributions to the differential
geometry of surfaces as well as to complex analysis, astronomy, geodesy, and statistics
during his long tenure as a professor at the University of Göttingen. It was in
connection with using the method of least squares to solve an astronomical problem 
that Gauss devised the systematic procedure for solving a system of linear equations 
today known as Gaussian elimination. (Unknown to Gauss, the method appeared in 
Chinese mathematics texts 1800 years earlier.) Gauss’ notebooks, discovered after 
his death, contained investigations in numerous areas of mathematics in which he 
did not publish, including the basics of non-Euclidean geometry. 

Sophie Germain (1776-1831) was forced to study in private due to the turmoil of 
the French Revolution and the opposition of her parents. She nevertheless mastered
mathematics through calculus and wanted to continue her study in the École
Polytechnique when it opened in 1794. But because women were not admitted as
students, she diligently collected and studied the lecture notes from various mathe- 
matics classes and, a few years later, began a correspondence with Gauss (under the 
pseudonym Monsieur LeBlanc, fearing that Gauss would not be willing to recognize 
the work of a woman) on ideas in number theory. She was, in fact, responsible for 
suggesting to the French general leading the army occupying Brunswick in 1807 that 
he ensure Gauss’ safety. Germain’s chief mathematical contribution was in connection
with Fermat’s Last Theorem. She showed that x^n + y^n = z^n has no positive
integer solution where xyz is not divisible by n for any odd prime n less than 100. 
She also made contributions in the theory of elasticity and won a prize from the 
French Academy in 1815 for an essay in this field. 

Kurt Gödel (1906-1978) was an Austrian mathematician who spent most of his life at
the Institute for Advanced Study in Princeton. He made several surprising contribu- 
tions to set theory, demonstrating that Hilbert’s goal of showing that a reasonable 
axiomatic system for set theory could be proven to be complete and consistent was in 
fact impossible. In several seminal papers published in the 1930s, Gödel proved that
it was impossible to prove internally the consistency of the axioms of any reasonable 
system of set theory containing the axioms for the natural numbers. Furthermore, 
he showed that any such system was inherently incomplete, that is, that there are 
propositions expressible in the system for which neither they nor their negations are 
provable. Gödel’s investigations were stimulated by the problems surrounding the
axiom of choice, the axiom that for any set S of nonempty disjoint sets, there is 
a subset T of the union of S that has exactly one element in common with each 
member of S. Since that axiom led to many counterintuitive results, it was impor- 
tant to show that the axiom could not lead to contradictions. But given his initial 
results, the best Gödel could do was to show that the axiom of choice was relatively
consistent, that its addition to the Zermelo-Fraenkel axiom set did not lead to any 
contradictions that would not already have been implied without it. 

William Rowan Hamilton (1805-1865), born in Dublin, was a child prodigy who 
became the Astronomer Royal of Ireland in 1827 in recognition of original work 
in optics accomplished during his undergraduate years at Trinity College, Dublin. 
In 1837, he showed how to introduce complex numbers into algebra axiomatically 
by considering a + ib as a pair (a, b) of real numbers with appropriate computational
rules. After many years of seeking an appropriate definition for multiplication rules 
for triples of numbers which could be applied to vector analysis in 3-dimensional 
space, he discovered that it was in fact necessary to consider quadruplets of numbers, 
which Hamilton named quaternions. Although quaternions never had the influence 
Hamilton forecast for them in physics, their noncommutative multiplication provided 
the first significant example of a mathematical system which did not obey one of the 
standard arithmetical laws of operation and thus opened the way for more “freedom” 
in the creation of mathematical systems. Among Hamilton’s other contributions was 
the development of the Icosian game, a graph with 20 vertices on which pieces were 
to be placed in accordance with various conditions, the overriding one being that a 
piece was always placed at the second vertex of an edge on which the previous piece 
had been placed. One of the problems Hamilton set for the game was, in essence, to 
discover a cyclic path on his game board which passed through each vertex exactly 
once. Such a path in a more general setting is today called a Hamilton circuit. 

Richard W. Hamming (1915-1998) was born in Chicago and received a Ph.D. in 
mathematics from the University of Illinois in 1942. He was the author of the first 
major paper on error correcting and detecting codes (1950). His work on this problem 
had been stimulated in 1947 when he was using an early Bell System relay computer 
on weekends only. During the weekends the machine was unattended and would 
dump any work in which it discovered an error and proceed to the next problem. 
Hamming realized that it would be worthwhile for the machine to be able not only 
to detect an error but also to correct it, so that his jobs would in fact be completed. 
In his paper, Hamming used a geometric model by considering an n-digit code word
to be a vertex in the unit cube in the n-dimensional vector space over the field of
two elements. He was then able to show that the relationship between the word
length n and the number m of digits which carry the information was 2^m ≤ 2^n/(n + 1).
(The remaining k = n - m digits are check digits which enable errors to be detected
and corrected.) Hamming also presented a specific type of code, today
known as a Hamming code, with n = 7 and m = 4. In this code, the set of actual
code words, each carrying 4 information digits, was a 4-dimensional vector subspace of the 7-dimensional
space of all 7-digit binary strings. 
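
The code with n = 7 and m = 4 can be illustrated with a short sketch in Python. The bit layout used here (check digits at positions 1, 2, and 4, information digits at positions 3, 5, 6, and 7) is a common textbook arrangement and is an assumption of this sketch, not necessarily the layout of Hamming's 1950 paper; the function names are likewise illustrative.

```python
def hamming74_encode(data):
    """Encode 4 information bits as a 7-bit code word (positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # checks positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # checks positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # checks positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word):
    """Correct at most one flipped bit and return the 4 information bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    error_position = s1 + 2 * s2 + 4 * s4  # 0 means no error was detected
    if error_position:
        w[error_position - 1] ^= 1  # flip the offending bit back
    return [w[2], w[4], w[5], w[6]]


code_word = hamming74_encode([1, 0, 1, 1])
corrupted = list(code_word)
corrupted[4] ^= 1  # simulate a single-bit error at position 5
print(hamming74_decode(corrupted))  # [1, 0, 1, 1]
```

With n = 7 and m = 4 the bound holds with equality, since 2^4 = 16 = 2^7/(7 + 1).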

Godfrey Harold Hardy (1877-1947) graduated from Trinity College, Cambridge in 
1899. From 1906 until 1919 he was lecturer at Trinity College, and, recognizing the 
genius of Ramanujan, invited Ramanujan to Cambridge in 1914. Hardy held the 
Savilian chair of geometry at Oxford from 1919 until 1931, when he returned to
Cambridge, where he was Sadleirian professor of pure mathematics until 1942. He
developed the Hardy-Weinberg law which predicts patterns of inheritance. His main
areas of mathematical research were analysis and number theory, and he published 
over 100 joint papers with Cambridge colleague John Littlewood. Hardy’s book A 
Course in Pure Mathematics revolutionized mathematics teaching, and his book A 
Mathematician’s Apology gives his view of what mathematics is and the value of its
study.

Abu ’Ali al-Hasan ibn al-Haytham (Alhazen) (965-1039) was one of the most
influential of Islamic scientists. He was born in Basra (now in Iraq) but spent most 
of his life in Egypt, after he was invited to work on a Nile control project. Although 
the project, an early version of the Aswan dam project, never came to fruition, ibn 
al-Haytham did produce in Egypt his most important scientific work, the Optics. 
This work was translated into Latin in the early thirteenth century and was studied 
and commented on in Europe for several centuries thereafter. Although there was 
much mathematics in the Optics, ibn al-Haytham’s most interesting mathematical 
work was the development of a recursive procedure for producing formulas for the 
sum of any integral powers of the integers. Formulas for the sums of the integers, 
squares, and cubes had long been known, but ibn al-Haytham gave a consistent 
method for deriving these and used this to develop the formula for the sum of fourth 
powers. Although his method was easily generalizable to the discovery of formulas 
for fifth and higher powers, he gave none, probably because he only needed the fourth 
power rule in his computation of the volume of a paraboloid of revolution. 

Hypatia (c. 370-415), the first woman mathematician on record, lived in Alexandria. 
She was given a very thorough education in mathematics and philosophy by her 
father Theon and became a popular and respected teacher. She was responsible for 
detailed commentaries on several important Greek works, including Ptolemy’s Almagest,
Apollonius’ Conics, and Diophantus’ Arithmetica. Unfortunately, Hypatia
was caught up in the pagan-Christian turmoil of her times and was murdered by an 
enraged mob. 

Leonid Kantorovich (1912-1986) was a Soviet economist responsible for the develop- 
ment of linear optimization techniques in relation to planning in the Soviet economy. 
The starting point of this development was a set of problems posed by the Leningrad 
timber trust at the beginning of 1938 to the Mathematics Faculty at the University 
of Leningrad. Kantorovich explored these problems in his 1939 book Mathematical 
Methods in the Organization and Planning of Production. He believed that one 
way to increase productivity in a factory or an entire industrial organization was 
to improve the distribution of the work among individual machines, the orders to 
various suppliers, the different kinds of raw materials, the different types of fuels, 
and so on. He was the first to recognize that these problems could all be put into the 
same mathematical language and that the resulting mathematical problems could 
be solved numerically, but for various reasons his work was not pursued by Soviet 
economists or mathematicians. 

Abu Bakr al-Karaji (died 1019) was an Islamic mathematician who worked in Bagh- 
dad. In the first decade of the eleventh century he composed a major work on 
algebra entitled al-Fakhri (The Marvelous), in which he developed many algebraic
techniques, including the laws of exponents and the algebra of polynomials, with the 
aim of systematizing methods for solving equations. He was also one of the early 
originators of a form of mathematical induction, which was best expressed in his 
proof of the formula for the sum of integral cubes. 

Stephen Cole Kleene (1909-1994) studied under Alonzo Church and received his 
Ph.D. from Princeton in 1934. His research has included the study of recursive func- 
tions, computability, decidability, and automata theory. In 1956 he proved Kleene’s 
Theorem, in which he characterized the sets that can be recognized by finite-state 
automata. 

Felix Klein (1849-1925) received his doctorate at the University of Bonn in 1868. 
In 1872 he was appointed to a position at the University of Erlangen, and in his
opening address laid out the Erlanger Programm for the study of geometry based on
the structure of groups. He described different geometries in terms of the properties 
of a set that are invariant under a group of transformations on the set and gave 
a program of study using this definition. From 1875 until 1880 he taught at the 
Technische Hochschule in Munich, and from 1880 until 1886 in Leipzig. In 1886 
Klein became head of the mathematics department at Göttingen and during his
tenure raised the prestige of the institution greatly. 

Donald E. Knuth (born 1938) received a Ph.D. in 1963 from the California Institute 
of Technology and held faculty positions at the California Institute of Technology 
(1963-1968) and Stanford (1968-1992). He has made contributions in many areas, 
including the study of compilers and computational complexity. He is the designer 
of the mathematical typesetting system TeX. He received the Turing Award in 1974
and the National Medal of Technology in 1979. 

Kazimierz Kuratowski (1896-1980), the son of a famous Warsaw lawyer, became
an active member of the Warsaw School of Mathematics after World War I. He
taught both at Lwów Polytechnical University and at Warsaw University until the
outbreak of World War II. During that war, because of the persecution of educated 
Poles, he went into hiding under an assumed name and taught at the clandestine 
Warsaw University. After the war, he helped to revive Polish mathematics, serving 
as director of the Polish National Mathematics Institute. His major mathemati- 
cal contributions were in topology; he formulated a version of a maximal principle 
equivalent to the axiom of choice. This principle is today known as Zorn’s lemma. 
Kuratowski also contributed to the theory of graphs by proving in 1930 that any 
non-planar graph must contain a copy of one of two particularly simple non-planar 
graphs. 

Joseph Louis Lagrange (1736-1813) was born in Turin into a family of French de- 
scent. He was attracted to mathematics in school and at the age of 19 became a 
mathematics professor at the Royal Artillery School in Turin. At about the same 
time, having read a paper of Euler’s on the calculus of variations, he wrote to Euler
explaining a better method he had recently discovered. Euler praised Lagrange
and arranged to present his paper to the Berlin Academy, to which Lagrange was later
appointed when Euler returned to Russia. Although most famous for his Analytical
Mechanics, a work which demonstrated how problems in mechanics can generally be
reduced to solutions of ordinary or partial differential equations, and for his Theory 
of Analytic Functions, which attempted to reduce the ideas of calculus to those of 
algebraic analysis, he also made contributions in other areas. For example, he un- 
dertook a detailed review of solutions to quadratic, cubic, and quartic polynomials 
to see how these methods might generalize to higher degree polynomials. He was led 
to consider permutations on the roots of the equations and functions on the roots 
left unchanged by such permutations. As part of this work, he discovered a version 
of Lagrange’s theorem to the effect that the order of any subgroup of a group divides 
the order of the group. Although he did not complete his program and produce a 
method of solving higher degree polynomial equations, his methods were applied by 
others early in the nineteenth century to show that such solutions were impossible. 

Gabriel Lamé (1795-1870) was educated at the École Polytechnique and the École
des Mines before going to Russia to direct the School of Highways and Transportation
in St. Petersburg. After his return to France in 1832, he taught at the École
Polytechnique while also working as an engineering consultant. Lamé contributed
original work to number theory, applied mathematics, and thermodynamics. His 
best-known work is his proof of the case n = 7 of Fermat’s Last Theorem in 1839.
Eight years later, he announced that he had found a general proof of the theorem,
which began with the factorization of the expression x^n + y^n over the complex numbers
as (x + y)(x + ay)(x + a^2 y) ... (x + a^(n-1) y), where a is a primitive root of
x^n - 1 = 0. He planned to show that the factors in this expression are all relatively
prime and therefore that if x^n + y^n = z^n, then each of the factors would itself be an
nth power. He would then use the technique of infinite descent to find a solution in 
smaller numbers. Unfortunately Lamé’s idea required that the ring of integers in the
cyclotomic field of the nth roots of unity be a unique factorization domain. And, as 
Kummer had already proved three years earlier, unique factorization in fact fails in 
many such domains. 

Edmund Landau (1877-1938) received a doctorate under Frobenius and taught at 
the University of Berlin and at Göttingen. His research areas were analysis and
analytic number theory, including the distribution of primes. He used the big-O 
notation (also called a Landau symbol) in his work to estimate the growth of various 
functions. 

Pierre-Simon de Laplace (1749-1827) entered the University of Caen in 1766 to 
begin preparation for a career in the church. He soon discovered his mathematical 
talents, however, and in 1768 left for Paris to continue his studies. He later taught 
mathematics at the École Militaire to aspiring cadets. Legend has it that he examined,
and passed, Napoleon there in 1785. He was later honored by both Napoleon
and King Louis XVIII. Laplace is best known for his contributions to celestial me- 
chanics, but he was also one of the founders of probability theory and made many 
contributions to mathematical statistics. In fact, he was one of the first to apply his 
theoretical results in statistics to a genuine problem in statistical inference, when 
he showed from the surplus of male to female births in Paris over a 25-year period 
that it was “morally certain” that the probability of a male birth was in fact greater 
than 1/2.

Gottfried Wilhelm Leibniz (1646-1716), born in Leipzig, developed his version of 
the calculus some ten years after Isaac Newton, but published it much earlier. He 
based his calculus on the inverse relationship of sums and differences, generalized 
to infinitesimal quantities called differentials. Leibniz hoped that his most origi- 
nal contribution to philosophy would be the development of an alphabet of human 
thought, a way of representing all fundamental concepts symbolically and a method 
of combining these symbols to represent more complex thoughts. Although he never 
completed this project, his interest in finding appropriate symbols ultimately led 
him to the d and ∫ symbols for the calculus that are used today. Leibniz spent much
of his life in the diplomatic service of the Elector of Mainz and later was a Counsel- 
lor to the Duke of Hanover. But he always found time to pursue his mathematical 
ideas and to carry on a lively correspondence on the subject with colleagues all over 
Europe. 

Levi ben Gerson (1288-1344) was a rabbi as well as an astronomer, philosopher, 
biblical commentator, and mathematician. He lived in Orange, in southern France, 
but little is known of his life. His most famous mathematical work is the Maasei 
Hoshev (The Art of the Calculator) (1321), which contains detailed proofs of the 
standard combinatorial formulas, some of which use the principle of mathematical 
induction. About a dozen copies of this medieval manuscript are extant, but it is 
not known whether the work had any direct influence elsewhere in Europe. 

Augusta Ada Byron King Lovelace (1815-1852) was the child of the famous poet 
George Gordon, the sixth Lord Byron, who left England five weeks after his daughter’s
birth and never saw her again. She was raised by her mother, Anna Isabella
Milbanke, a student of mathematics herself, so she received considerably more mathematics
education than was usual for girls of her time. She was tutored privately by
well-known mathematicians, including William Frend and Augustus De Morgan. Her
husband, the Earl of Lovelace, was made a Fellow of the Royal Society in 1840, and 
through this connection, Ada was able to gain access to the books and papers she 
needed to continue her mathematical studies and, in particular, to understand the 
workings of Babbage’s Analytical Engine. Her major mathematical work is a heav- 
ily annotated translation of a paper by the Italian mathematician L. F. Menabrea 
dealing with the Engine, in which she gave explicit descriptions of how it would 
solve specific problems and described, for the first time in print, what would today 
be called a computer program, in this case a program for computing the Bernoulli 
numbers. Interestingly, only her initials, A.A.L., were used in the published ver- 
sion of the paper. It was evidently not considered proper in mid-nineteenth century 
England for a woman of her class to publish a mathematical work. 

Jan Łukasiewicz (1878-1956) studied at the University of Lwów and taught at the
University of Lwów, the University of Warsaw, and the Royal Irish Academy. A
logician, he worked in the area of many-valued logic, writing papers on three-valued
and m-valued logics. He is best known for the parenthesis-free notation he developed
for propositions, called Polish notation. 

Percy Alexander MacMahon (1854-1929) was born into a British army family and 
joined the army himself in 1871, reaching the rank of major in 1889. Much of 
his army service was spent as an instructor at the Royal Military Academy. His 
early mathematical work dealt with invariants, following on the work of Cayley 
and Sylvester, but a study of symmetric functions eventually led to his interest 
in partitions and to his extension of the idea of a partition to higher dimensions. 
MacMahon’s two-volume treatise Combinatorial Analysis (1915-16) is a classic in
the field. It identified and clarified the basic results of combinatorics and showed 
the way toward numerous applications. 

Mahavira (ninth century) was an Indian mathematician of the medieval period whose 
major work, the Ganitasarasangraha, was a compilation of problems solvable by var- 
ious algebraic techniques. For example, the work included a version of the hundred 
fowls problem: “Doves are sold at the rate of 5 for 3 coins, cranes at the rate of 7 
for 5, swans at the rate of 9 for 7, and peacocks at the rate of 3 for 9. A certain man 
was told to bring at these rates 100 birds for 100 coins for the amusement of the 
king’s son and was sent to do so. What amount does he give for each?” Mahavira 
also presented, without proof and in words, the rule for calculating the number of 
combinations of r objects out of a set of n. His algorithm can be easily translated into 
the standard formula. Mahavira then applied the rule to two problems, one about 
combinations of tastes and another about combinations of jewels on a necklace. 
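
The hundred fowls problem quoted above amounts to two linear equations in nonnegative integers, one for the birds and one for the coins. The following is a minimal brute-force sketch in Python. It assumes that birds are bought only in whole bundles at the stated rates and that a kind of bird may be skipped entirely; both assumptions, and the function name, are illustrative and not taken from Mahavira's text.

```python
def hundred_fowls(total_birds=100, total_coins=100):
    """Return all bundle counts (doves, cranes, swans, peacocks) meeting both totals.

    Bundles: 5 doves for 3 coins, 7 cranes for 5 coins,
    9 swans for 7 coins, 3 peacocks for 9 coins.
    """
    solutions = []
    for doves in range(total_birds // 5 + 1):
        for cranes in range(total_birds // 7 + 1):
            for swans in range(total_birds // 9 + 1):
                remaining = total_birds - (5 * doves + 7 * cranes + 9 * swans)
                if remaining < 0 or remaining % 3:
                    continue  # the rest must be whole bundles of 3 peacocks
                peacocks = remaining // 3
                coins = 3 * doves + 5 * cranes + 7 * swans + 9 * peacocks
                if coins == total_coins:
                    solutions.append((doves, cranes, swans, peacocks))
    return solutions


solutions = hundred_fowls()
print((3, 4, 5, 4) in solutions)  # True
```

The bundle counts (3, 4, 5, 4) correspond to 15 doves for 9 coins, 28 cranes for 20 coins, 45 swans for 35 coins, and 12 peacocks for 36 coins. Under these assumptions the problem has more than one solution.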

Andrei Markov (1856-1922) was a Russian mathematician who first defined what 
are now called Markov chains in a paper of 1906 dealing with the Law of Large 
Numbers and subsequently proved many of the standard results about them. His 
interest in these chains stemmed from the needs of probability theory. Markov never 
dealt with their application to the sciences, only considering examples from literary 
texts, where the two possible states in the chain were vowels and consonants. Markov 
taught at St. Petersburg University from 1880 to 1905 and contributed to such fields 
as number theory, continued fractions, and approximation theory. He was an active 
participant in the liberal movement in pre-World War I Russia and often criticized
publicly the actions of state authorities. In 1913, when as a member of the Academy
of Sciences he was asked to participate in the pompous ceremonies celebrating the
300th anniversary of the Romanov dynasty, he instead organized a celebration of the 
200th anniversary of Jacob Bernoulli’s publication of the Law of Large Numbers. 

Marin Mersenne (1588-1648) was educated in Jesuit schools and in 1611 joined the 
Order of Minims. From 1619 he lived in the Minim Convent de l’Annonciade near the 
Place Royale in Paris and there held regular meetings of a group of mathematicians 
and scientists to discuss the latest ideas. Mersenne also served as the unofficial 
“secretary” of the republic of scientific letters in Europe. As such, he received 
material from various sources, copied it, and distributed it widely, thus serving as 
a “walking scientific journal”. His own contributions were primarily in the area
of music theory as detailed in his two great works on the subject, the Harmonie 
universelle and the Harmonicorum libri, both of which appeared in 1636. As part of 
his study of music, he developed the basic combinatorial formulas by considering the 
possible tunes one could create out of a given number of notes. Mersenne was also 
greatly interested in the relationship of theology to science. He was quite concerned 
when he learned that Galileo could not publish one of his works because of the 
Inquisition and, in fact, offered his assistance in this matter. 

Hermann Minkowski (1864-1909) was a German Jewish mathematician who received 
his doctorate at the University of Königsberg. He became a lifelong friend of David
Hilbert and, on Hilbert’s suggestion, was called to Göttingen in 1902. In 1883, he
shared the prize of the Paris Academy of Sciences for his essay on the topic of the 
representations of an integer as a sum of squares. In his essay, he reconstructed 
the entire theory of quadratic forms in n variables with integral coefficients. In 
further work on number theory, he brought to bear geometric ideas beginning with 
the realization that a symmetric convex body in n-space defines a notion of distance
and hence a geometry in that space. The connection with number theory depends 
on the representation of forms by lattice points in space. 

Muhammad ibn Muhammad al-Fullani al-Kishnawi (died 1741) was a native 
of northern Nigeria and one of the few African black scholars known to have made 
contributions to “pure” mathematics before the modern era. Muhammad’s most 
important work, available in an incomplete manuscript in the library of the School 
of Oriental and African Studies in London, deals with the theory of magic squares. 
He gave a clear treatment of the “standard” construction of magic squares and also 
studied several other constructions — using knight’s moves, borders added to a magic 
square of lower order, and the formation of a square from a square number of smaller 
magic squares. 

Peter Naur (born 1928) was originally an astronomer, using computers to calculate 
planetary motion. In 1959 he became a full-time computer scientist; he was a de- 
veloper of the programming language ALGOL and worked on compilers for ALGOL 
and COBOL. In 1969 he took a computer science faculty position at the University 
of Copenhagen. 

Amalie Emmy Noether (1882-1935) received her doctorate from the University of 
Erlangen in 1908 and a few years later moved to Göttingen to assist Hilbert in
the study of general relativity. During her eighteen years there, she was extremely 
influential in stimulating a new style of thinking in algebra by always emphasizing 
its structural rather than computational aspects. In 1934 she became a professor 
at Bryn Mawr College and a member of the Institute for Advanced Study. She is
most famous for her work on Noetherian rings, and her influence is still evident in 
today’s textbooks in abstract algebra. 

Blaise Pascal (1623-1662) showed his mathematical precocity with his Essay on Conics
of 1640, in which he stated his theorem that the opposite sides of a hexagon
inscribed in a conic section always intersect in three collinear points. Pascal is bet- 
ter known, however, for his detailed study of what is now called Pascal’s triangle 
of binomial coefficients. In that study Pascal gave an explicit description of math- 
ematical induction and used that method, although not quite in the modern sense, 
to prove various properties of the numbers in the triangle, including a method of 
determining the appropriate division of stakes in a game interrupted before its con- 
clusion. Pascal had earlier discussed this matter, along with various other ideas in 
the theory of probability, in correspondence with Fermat in the 1650s. These letters, 
in fact, can be considered the beginning of the mathematization of probability. 

Giuseppe Peano (1858-1932) studied at the University of Turin and then spent the 
remainder of his life there as a professor of mathematics. He was originally known as 
an inspiring teacher, but as his studies turned to symbolic logic and the foundations 
of mathematics and he attempted to introduce some of these notions in his elemen- 
tary classes, his teaching reputation changed for the worse. Peano is best known 
for his axioms for the natural numbers, first proposed in the Arithmetices prin- 
cipia, nova methodo exposita of 1889. One of these axioms describes the principle 
of mathematical induction. Peano was also among the first to present an axiomatic 
description of a (finite-dimensional) vector space. In his Calcolo geometrico of 1888, 
Peano described what he called a linear system, a set of quantities provided with 
the operations of addition and scalar multiplication which satisfy the standard prop- 
erties. He was then able to give a coherent definition of the dimension of a linear 
system as the maximum number of linearly independent quantities in the system. 

Charles Sanders Peirce (1839-1914) was born in Massachusetts, the son of a Harvard 
mathematics professor. He received a master’s degree from Harvard in 1862 and an 
advanced degree in chemistry from the Lawrence Scientific School in 1863. He made 
contributions to many areas of the foundations and philosophy of mathematics. He 
was a prolific writer, leaving over 100,000 pages of unpublished manuscript at his 
death. 

George Polya (1887-1985) was a Hungarian mathematician who received his doctor- 
ate at Budapest in 1912. From 1914 to 1940 he taught in Zurich, then emigrated to 
the United States where he spent most of the rest of his professional life at Stanford 
University. Polya developed some influential enumeration ideas in several papers in 
the 1930s, in particular dealing with the counting of certain configurations that are 
not equivalent under the action of a particular permutation group. For example, 
there are 16 ways in which one can color the vertices of a square using two colors, 
but only six are non-equivalent under the various symmetries of the square. In 1937, 
Polya published a major article in the field, “Combinatorial Enumeration of Groups, 
Graphs and Chemical Compounds”, in which he discussed many mathematical as- 
pects of the theory of enumeration and applied it to various problems. Polya’s work 
on problem solving and heuristics, summarized in his two-volume work Mathematics
and Plausible Reasoning, ensured his fame as a mathematics educator; his ideas are
at the forefront of recent reforms in mathematics education at all levels. 
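
The square-coloring count quoted in the Polya entry above (16 colorings, 6 of them non-equivalent) can be checked by brute force. The sketch below is only an illustration, not Polya's own method: it labels the vertices 0 to 3 in cyclic order, builds the eight symmetries of the square from one rotation and one reflection, and counts orbits of colorings directly. The names ROTATION, REFLECTION, generate_group, and count_inequivalent_colorings are assumptions introduced here for the example.

```python
from itertools import product

# Vertices of the square are labelled 0, 1, 2, 3 in cyclic order.
# A symmetry is a tuple perm where perm[i] is the vertex that vertex i moves to.
ROTATION = (1, 2, 3, 0)    # quarter turn
REFLECTION = (0, 3, 2, 1)  # reflection in the diagonal through vertices 0 and 2


def compose(p, q):
    """Return the permutation that applies q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))


def generate_group(generators):
    """Return the set of all permutations generated by the given ones."""
    identity = tuple(range(len(generators[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def count_inequivalent_colorings(group, num_colors):
    """Count colorings of the vertices up to the action of the group."""
    n = len(next(iter(group)))
    seen = set()
    classes = 0
    for coloring in product(range(num_colors), repeat=n):
        if coloring in seen:
            continue
        classes += 1  # a coloring not seen before starts a new class
        for g in group:
            image = [None] * n
            for i in range(n):
                image[g[i]] = coloring[i]  # vertex g[i] receives the color of vertex i
            seen.add(tuple(image))
    return classes


square_symmetries = generate_group([ROTATION, REFLECTION])
print(len(square_symmetries))                               # 8
print(count_inequivalent_colorings(square_symmetries, 2))   # 6
```

The same function can be called with a different number of colors or a different permutation group; for large cases a counting formula based on fixed points would be preferable to enumeration.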

Qin Jiushao (1202-1261), born in Sichuan, published a general procedure for solving 
systems of linear congruences — the Chinese remainder theorem — in his Shushu 
jiuzhang ( Mathematical Treatise in Nine Sections ) in 1247, a procedure which makes 
essential use of the Euclidean algorithm. He also gave a complete description of a 
method for numerically solving polynomial equations of any degree. Qin’s method 
had been developed in China over a period of more than a thousand years; it is 
similar to a method used in the Islamic world and is closely related to what is now
called the Horner method of solution, published by William Horner in 1819. Qin 
studied mathematics at the Board of Astronomy, the Chinese agency responsible 
for calendrical computations. He later served the government in several offices, but 
because he was “extravagant and boastful”, he was several times relieved of his duties
because of corruption. These firings notwithstanding, Qin became a wealthy man 
and developed an impressive reputation in love affairs. 

Srinivasa Ramanujan (1887-1920) was born near Madras into the family of a bookkeeper.
He studied mathematics on his own and soon began producing results in
combinatorial analysis, some already known and others previously unknown. At the 
urging of friends, he sent some of his results to G. H. Hardy in England, who quickly 
recognized Ramanujan’s genius and invited him to England to develop his untrained 
mathematical talent. During the war years from 1914 to 1917, Hardy and Ramanu- 
jan collaborated on a number of papers, including several dealing with the theory 
of partitions. Unfortunately, Ramanujan fell ill during his years in the unfamiliar 
climate of England and died at age 32 soon after returning to India. Ramanujan 
left behind several notebooks containing statements of thousands of results, enough 
work to keep many mathematicians occupied for years in understanding and proving 
them. 

Frank Ramsey (1903-1930), son of the president of Magdalene College, Cambridge, 
was educated at Winchester and Trinity Colleges. He was then elected a fellow of 
King’s College, where he spent the remainder of his life. Ramsey made important 
contributions to mathematical logic. What is now called Ramsey theory began with 
his clever combinatorial arguments to prove a generalization of the pigeonhole prin- 
ciple, published in the paper “On a Problem of Formal Logic”. The problem of that 
paper was the Entscheidungsproblem (the decision problem), the problem of search- 
ing for a general method of determining the consistency of a logical formula. Ramsey 
also made contributions to the mathematical theory of economics and introduced the 
subjective interpretation of probability. In that interpretation, Ramsey argues that
different people, when presented with the same evidence, will have different degrees
of belief, and that the way to measure a person’s belief is to propose a bet and see
what the lowest odds are that the person will accept. Ramsey’s death at the age of 26 deprived
the mathematical community of a brilliant young scholar. 

Bertrand Arthur William Russell (1872-1970) was born in Wales and studied at 
Trinity College, Cambridge. A philosopher/mathematician, he is one of the founders 
of modern logic and wrote over 40 books in different areas. In his most famous 
work, Principia Mathematica, published in 1910-13 with Alfred North Whitehead, 
he attempted to deduce the entire body of mathematics from a single set of primitive 
axioms. A pacifist, he fought for progressive causes, including women’s suffrage in 
Great Britain and nuclear disarmament. In 1950 he won a Nobel Prize for literature. 

al-Samaw’al ibn Yahya ibn Yahuda al-Maghribi (1125-1180) was born in Bagh- 
dad to well-educated Jewish parents. Besides giving him a religious education, they 
encouraged him to study medicine and mathematics. He wrote his major mathemat- 
ical work, Al-Bahir ( The Shining ), an algebra text that dealt extensively with the 
algebra of polynomials. In it, al-Samaw’al worked out the laws of exponents, both 
positive and negative, and showed how to divide polynomials even when the division 
was not exact. He also used a form of mathematical induction to prove the binomial 
theorem, that $(a + b)^n = \sum_{k=0}^{n} C(n, k)\, a^{n-k} b^k$, where the $C(n, k)$ are the entries in
the Pascal triangle, for $n \le 12$. In fact, he showed why each entry in the triangle
can be formed by adding two numbers in the previous row. When al-Samaw’al was 
about 40, he decided to convert to Islam. To justify his conversion to the world,
he wrote an autobiography in 1167 stating his arguments against Judaism, a work 
which became famous as a source of Islamic polemics against the Jews. 

Claude Elwood Shannon (born 1916) applied Boolean algebra to switching circuits 
in his master’s thesis at M.I.T. in 1938. Shannon realized that a circuit can be
represented by a set of equations and that the calculus necessary for manipulating 
these equations is precisely the Boolean algebra of logic. Simplifying these equations 
for a circuit would yield a simpler, equivalent circuit. Switches in Shannon’s calculus 
were either open (represented by 1) or closed (represented by 0); placing switches
in series was represented by the Boolean operation “+”, while placing them in
parallel was represented by “•”. Using the basic rules of Boolean algebra, Shannon
was, for example, able to construct a circuit which would add two numbers given in 
binary representation. He received his Ph.D. in mathematics from M.I.T. in 1940 
and spent much of his professional life at Bell Laboratories, where he worked on 
methods of transmitting data efficiently and made many fundamental contributions 
to information theory. 
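
The binary-addition circuit mentioned in the Shannon entry can be sketched with Boolean operations alone. The example below is a minimal illustration and not a reconstruction of Shannon's circuit: it assumes the modern convention in which 1 means “true” (the reverse of the open/closed coding described above), stores numbers as lists of bits with the least significant bit first, and uses hypothetical helper names (bit_not, bit_xor, full_adder, add_binary).

```python
def bit_not(a):
    """Boolean negation on a single bit (0 or 1)."""
    return 1 - a


def bit_xor(a, b):
    """Exclusive or, written using only AND, OR, and NOT."""
    return (a | b) & bit_not(a & b)


def full_adder(a, b, carry_in):
    """Add three bits; return (sum_bit, carry_out)."""
    partial = bit_xor(a, b)
    sum_bit = bit_xor(partial, carry_in)
    carry_out = (a & b) | (partial & carry_in)
    return sum_bit, carry_out


def add_binary(x_bits, y_bits):
    """Add two equal-length bit lists, least significant bit first."""
    if len(x_bits) != len(y_bits):
        raise ValueError("bit lists must have the same length")
    result = []
    carry = 0
    for a, b in zip(x_bits, y_bits):
        s, carry = full_adder(a, b, carry)
        result.append(s)
    result.append(carry)  # the final carry becomes the top bit
    return result


# 6 + 3 = 9: 110 + 011 = 1001, written here least significant bit first
print(add_binary([0, 1, 1], [1, 1, 0]))  # [1, 0, 0, 1]
```

Chaining one full adder per bit position in this way is usually called a ripple-carry adder.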

James Stirling (1692-1770) studied at Glasgow University and at Balliol College,
Oxford and spent much of his life as a successful administrator of a mining company 
in Scotland. His mathematical work included an exposition of Newton’s theory of 
cubic curves and a 1730 book entitled Methodus Differentialis which dealt with 
summation and interpolation formulas. In dealing with the convergence of series, 
Stirling found it useful to convert factorials into powers. By considering tables of 
factorials, he was able to derive the formula for $\log n!$, which leads to what is now
known as Stirling’s approximation: $n! \approx \left(\frac{n}{e}\right)^n \sqrt{2\pi n}$. Stirling also developed the
Stirling numbers of the first and second kinds, sequences of numbers important in 
enumeration. 
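
As a quick numerical check of Stirling's approximation from the entry above, the following sketch compares $(n/e)^n \sqrt{2\pi n}$ with the exact factorial for a few values of n. The function name stirling is an assumption made for this example; the exact values come from Python's standard math.factorial.

```python
import math


def stirling(n):
    """Stirling's approximation (n/e)^n * sqrt(2*pi*n) to n!."""
    return (n / math.e) ** n * math.sqrt(2 * math.pi * n)


for n in (1, 5, 10, 20):
    exact = math.factorial(n)
    approx = stirling(n)
    # The ratio approaches 1 as n grows.
    print(n, exact, approx, approx / exact)
```

The approximation slightly underestimates n!, and the relative error shrinks roughly like 1/(12n).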

Sun Zi (4th century) is the author of Sunzi suanjing ( Master Sun’s Mathematical 
Manual ), a manual on arithmetical operations which eventually became part of the 
required course of study for Chinese civil servants. The most famous problem in 
the work is one of the first examples of what is today called the Chinese remainder 
problem: “We have things of which we do not know the number; if we count them by 
threes, the remainder is 2; if we count them by fives, the remainder is 3; if we count 
them by sevens, the remainder is 2. How many things are there?” Sun Zi gives the 
answer, 23, along with some explanation of how the problem should be solved. But 
since this is the only problem of its type in the book, it is not known whether Sun 
Zi had developed a general method of solving simultaneous linear congruences. 
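
Sun Zi's problem can be solved mechanically with the extended Euclidean algorithm, which is the general approach the Qin Jiushao entry attributes to later work. The sketch below is a modern formulation under stated assumptions, not a transcription of either author's procedure; it requires the moduli to be pairwise relatively prime, and the names extended_gcd and solve_congruences are introduced here for illustration.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def solve_congruences(remainders, moduli):
    """Smallest nonnegative x with x = r_i (mod m_i) for every pair (r_i, m_i)."""
    x, m = 0, 1  # x is a solution modulo m of the congruences handled so far
    for r_i, m_i in zip(remainders, moduli):
        g, s, _ = extended_gcd(m, m_i)
        if g != 1:
            raise ValueError("moduli must be pairwise relatively prime")
        # s*m = 1 (mod m_i), so the update keeps x fixed modulo m
        # and makes it congruent to r_i modulo m_i.
        x = (x + m * s * (r_i - x)) % (m * m_i)
        m *= m_i
    return x


# Remainders 2, 3, 2 when counting by threes, fives, and sevens.
print(solve_congruences([2, 3, 2], [3, 5, 7]))  # 23
```

Every other solution differs from 23 by a multiple of 3 · 5 · 7 = 105.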

James Joseph Sylvester (1814-1897), who was born into a Jewish family in London 
and studied for several years at Cambridge, was not permitted to take his degree 
there for religious reasons. Therefore, he received his degree from Trinity College, 
Dublin and soon thereafter accepted a professorship at the University of Virginia. His 
horror of slavery, however, and an altercation with a student who did not show him 
the respect he felt he deserved led to his resignation after only a brief tenure. After 
his return to England, he spent 10 years as an attorney and 15 years as professor 
of mathematics at the Royal Military Academy at Woolwich. Sylvester returned to 
the United States in 1876 to accept the chair of mathematics at the newly opened
Johns Hopkins University in Baltimore, where he founded the American Journal of
Mathematics and helped initiate a tradition of graduate education in mathematics in
the United States. Sylvester’s primary mathematical contributions are in the fields 
of invariant theory and the theory of partitions. 

John Wilder Tukey (born 1915) received a Ph.D. in topology from Princeton in
1939. After World War II he returned to Princeton as professor of statistics, where 
he founded the Department of Statistics in 1966. His work in statistics included 
the areas of spectra of time series and analysis of variance. He invented (with J. W.
Cooley) the fast Fourier transform. He was awarded the National Medal of Science 
and served on the President’s Science Advisory Committee. He also coined the word 
“bit” for a binary digit. 

Alan Turing (1912-1954) studied mathematics at King’s College, Cambridge and in 
1936 invented the concept of a Turing machine to answer the questions of what a 
computation is and whether a given computation can in fact be carried out. This 
notion today lies at the basis of the modern all-purpose computer, a machine which 
can be programmed to do any desired computation. At the outbreak of World 
War II, Turing was called to serve at the Government Code and Cypher School in 
Bletchley Park in Buckinghamshire. It was there, during the next few years, that 
he led the successful effort to crack the German “Enigma” code, an effort which 
turned out to be central to the defeat of Nazi Germany. After the war, Turing 
continued his interest in automatic computing machines and so joined the National 
Physical Laboratory to work on the design of a computer, continuing this work after 
1948 at the University of Manchester. Turing’s promising career came to a grinding 
halt, however, when he was arrested in 1952 for homosexual acts. The penalty for 
this “crime” was submission to psychoanalysis and hormone treatments to “cure” 
the disease. Unfortunately, the cure proved worse than the disease, and, in a fit of 
depression, Turing committed suicide in June, 1954. 

Alexandre-Théophile Vandermonde (1735-1796) was directed by his physician father
to a career in music. However, he later developed a brief but intense interest in
mathematics and wrote four important papers published in 1771 and 1772. These 
papers include fundamental contributions to the theory of the roots of equations, 
the theory of determinants, and the knight’s tour problem. In the first paper, he 
showed that any symmetric function of the roots of a polynomial equation can be 
expressed in terms of the coefficients of the equation. His paper on determinants 
was the first logical, connected exposition of the subject, so he can be thought of 
as the founder of the theory. Toward the end of his life, he joined the cause of the 
French revolution and held several different positions in government. 

François Viète (1540-1603), a lawyer and advisor to two kings of France, was one
of the earliest cryptanalysts and successfully decoded intercepted messages for his 
patrons. In fact, he was so successful in this endeavor that he was denounced by 
some who thought that the decipherment could only have been made by sorcery. Al- 
though a mathematician only by avocation, he made important contributions to the 
development of algebra. In particular, he introduced letters to stand for numerical 
constants, thus enabling him to break away from the style of verbal algorithms of 
his predecessors and treat general examples by formulas rather than by giving rules 
for specific problems. 

Edward Waring (1734-1798) graduated from Magdalene College, Cambridge in 1757
with highest honors and shortly thereafter was named a Fellow of the University. 
In 1760, despite opposition because of his youth, he was named Lucasian Professor 
of Mathematics at Cambridge, a position he held until his death. To help solidify 
his position, he then published the first chapter of his major work, Miscellanea
analytica, which in later editions was renamed Meditationes algebraicae. Waring is
best remembered for his conjecture that every integer is the sum of at most four 
squares, at most nine cubes, at most 19 fourth powers, and, in general, at most r 
kth powers, where r depends on k. The general theorem that there is a finite r for
each k was proved by Hilbert in 1909. Although the result for squares was proved 
by Lagrange, the specific results for cubes and fourth powers were not proved until 
the twentieth century. 

Hassler Whitney (1907-1989) received bachelor’s degrees in both physics and music 
from Yale; in 1932 he received a doctorate in mathematics from Harvard. After a 
brief stay in Princeton, he returned to Harvard, where he taught until 1952, when he 
moved to the Institute for Advanced Study. Whitney produced more than a dozen 
papers on graph theory in the 1930s, after his interest was aroused by the four color 
problem. In particular, he defined the notion of the dual graph of a map. It was 
then possible to apply many of the results of the theory of graphs to gain insight into 
the four color problem. During the last twenty years of his life, Whitney devoted his 
energy to improving mathematical education, particularly at the elementary school 
level. He emphasized that young children should be encouraged to solve problems 
using their intuition, rather than only be taught techniques and results which have 
no connection to their experience. 

INTRODUCTION


This chapter covers material usually referred to as the foundations of mathematics, in- 
cluding logic, sets, and functions. In addition to covering these foundational areas, this 
chapter includes material that shows how these topics are applied to discrete mathe- 
matics, computer science, and electrical engineering. For example, this chapter covers 
methods of proof, program verification, and fuzzy reasoning. 


GLOSSARY 

action: a literal or a print command in a production system. 

aleph-null: the cardinality, $\aleph_0$, of the set $\mathcal{N}$ of natural numbers.

AND: the logical operator for conjunction, also written $\wedge$.

antecedent: in a conditional proposition $p \to q$ (“if p then q”), the proposition p
(“if-clause”) that precedes the arrow.

antichain: a subset of a poset in which no two elements are comparable. 

antisymmetric: the property of a binary relation R that if aRb and bRa, then a = b. 

argument form: a sequence of statement forms each called a premise of the argument 
followed by a statement form called a conclusion of the argument. 

assertion (or program assertion ): a program comment specifying some conditions 
on the values of the computational variables; these conditions are supposed to hold 
whenever program flow reaches the location of the assertion. 

asymmetric: the property of a binary relation R that if $aRb$, then $b \not R a$.

asymptotic: A function f is asymptotic to a function g, written $f(x) \sim g(x)$, if
$f(x) \neq 0$ for sufficiently large x and $\lim_{x \to \infty} \frac{g(x)}{f(x)} = 1$.

atom (or atomic formula): simplest formula of predicate logic. 

atomic formula: See atom. 

atomic proposition: a proposition that cannot be analyzed into smaller parts and 
logical operations. 

automated reasoning: the process of proving theorems using a computer program 
that can draw conclusions that follow logically from a set of given facts. 

axiom: a statement that is assumed to be true; a postulate.

axiom of choice: the assertion that given any nonempty collection A of pairwise 
disjoint sets, there is a set that consists of exactly one element from each of the sets 
in A. 

axiom (or semantic axiom): a rule for a programming language construct prescribing
the change of values of computational variables when an instruction of that construct- 
type is executed. 

basis step: a proof of the basis premise (first case) in a proof by mathematical induc- 
tion. 

big-oh notation: f is O(g), written $f = O(g)$, if there are constants C and k such
that $|f(x)| \le C|g(x)|$ for all $x > k$.

bijection (or bijective function): a function that is one-to-one and onto.

bijective function: See bijection. 

binary relation from a set A to a set B: any subset of $A \times B$.

binary relation on a set A: a binary relation from A to A; i.e., a subset of $A \times A$.

body of a clause $A_1, \ldots, A_n \leftarrow B_1, \ldots, B_m$ in a logic program: the literals $B_1, \ldots, B_m$
after $\leftarrow$.

cardinal number (or cardinality ) of a set: for a finite set, the number of elements; 
for an infinite set, the order of infinity. The cardinal number of S is written $|S|$.

cardinality: See cardinal number. 

Cartesian product (of sets A and B): the set $A \times B$ of ordered pairs $(a, b)$ with $a \in A$
and $b \in B$ (more generally, the iterated Cartesian product $A_1 \times A_2 \times \cdots \times A_n$
is the set of ordered n-tuples $(a_1, a_2, \ldots, a_n)$, with $a_i \in A_i$ for each i).

ceiling (of x): the smallest integer that is greater than or equal to x, written $\lceil x \rceil$.

chain: a subset of a poset in which every pair of elements are comparable. 

characteristic function (of a set S): the function from the universal domain U (containing S)
to $\{0, 1\}$ whose value at x is 1 if $x \in S$ and 0 if $x \notin S$.

clause (in a logic program): closed formula of the form
$\forall x_1 \ldots \forall x_s (A_1 \vee \cdots \vee A_n \leftarrow B_1 \wedge \cdots \wedge B_m)$.

closed formula: for a function value f(x), an algebraic expression in x. 

closure (of a relation R with respect to a property $\mathcal{P}$): the relation S, if it exists, that
has property $\mathcal{P}$ and contains R, such that S is a subset of every relation that has
property $\mathcal{P}$ and contains R.

codomain (of a function): the set in which the function values occur. 

comparable: Two elements in a poset are comparable if they are related by the partial 
order relation. 

complement (of a relation): given a relation R, the relation $\overline{R}$ where $a\overline{R}b$ if and only
if $a \not R b$.

complement (of a set): given a set A in a “universal” domain U, the set $\overline{A}$ of objects
in U that are not in A.

complement operator: a function $[0, 1] \to [0, 1]$ used for complementing fuzzy sets.

complete: property of a set of axioms that it is possible to prove all true statements. 

complex number: a number of the form $a + bi$, where a and b are real numbers, and
$i^2 = -1$; the set of all complex numbers is denoted $\mathcal{C}$.

composite key: given an n-ary relation R on $A_1 \times A_2 \times \cdots \times A_n$, a product of domains
$A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m}$ such that for each m-tuple $(a_{i_1}, a_{i_2}, \ldots, a_{i_m}) \in A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m}$,
there is at most one n-tuple in R that matches $(a_{i_1}, a_{i_2}, \ldots, a_{i_m})$ in coordinates
$i_1, i_2, \ldots, i_m$.

composition (of relations): for R a relation from A to B and S a relation from B to
C, the relation $S \circ R$ from A to C such that $a(S \circ R)c$ if and only if there exists
$b \in B$ such that $aRb$ and $bSc$.

composition (of functions): the function $f \circ g$ whose value at x is $f(g(x))$.

compound proposition: a proposition built up from atomic propositions and logical 
connectives. 

computer-assisted proof: a proof that relies on checking the validity of a large
number of cases using a special purpose computer program. 

conclusion (of an argument form): the last statement of an argument form.

conclusion (of a proof): the last proposition of a proof; the objective of the proof is 
demonstrating that the conclusion follows from the premises. 

condition: the disjunction $A_1 \vee \cdots \vee A_n$ of atomic formulas.

conditional statement: the compound proposition $p \to q$ (“if p then q”) that is true
except when p is true and q is false.

conjunction: the compound proposition $p \wedge q$ (“p and q”) that is true only when p
and q are both true.

conjunctive normal form: for a proposition in the variables $p_1, p_2, \ldots, p_n$, an equivalent
proposition that is the conjunction of disjunctions, with each disjunction of the
form $x_{k_1} \vee x_{k_2} \vee \cdots \vee x_{k_m}$, where $x_{k_j}$ is either $p_{k_j}$ or $\neg p_{k_j}$.

consequent: in a conditional proposition $p \to q$ (“if p then q”), the proposition q
(“then-clause”) that follows the arrow.

consistent: property of a set of axioms that no contradiction can be deduced from the 
axioms. 

construct (or program construct ): the general form of a programming instruction 
such as an assignment, a conditional, or a while-loop. 

continuum hypothesis: the assertion that the cardinal number of the real numbers 
is the smallest cardinal number greater than the cardinal number of the natural 
numbers. 

contradiction: a self-contradictory proposition, one that is always false.

contradiction (in an indirect proof): the negation of a premise.

contrapositive (of the conditional proposition $p \to q$): the conditional proposition
$\neg q \to \neg p$.

converse (of the conditional proposition $p \to q$): the conditional proposition $q \to p$.

converse relation: another name for the inverse relation.

corollary: a theorem that is derived as an easy consequence of another theorem.

correct conclusion: the conclusion of a valid proof, when all the premises are true.

countable set: a set that is finite or denumerable.

counterexample: a case that makes a statement false.

definite clause: clause with at most one atom in its head.

denumerable set: a set that can be placed in one-to-one correspondence with the 
natural numbers. 

diagonalization proof: any proof that involves something analogous to the diagonal
of a list of sequences. 

difference: a binary relation $R - S$ such that $a(R - S)b$ if and only if $aRb$ is true
and $aSb$ is false.

difference (of sets): the set $A - B$ of objects in A that are not in B.

direct proof: a proof of $p \to q$ that assumes p and shows that q must follow.

disjoint (pair of sets): two sets with no members in common.

disjunction: the statement $p \vee q$ (“p or q”) that is true when at least one of the two
propositions p and q is true; also called inclusive or.

disjunctive normal form: for a proposition in the variables $p_1, p_2, \ldots, p_n$, an equivalent
proposition that is the disjunction of conjunctions, with each conjunction of
the form $x_{k_1} \wedge x_{k_2} \wedge \cdots \wedge x_{k_m}$, where $x_{k_j}$ is either $p_{k_j}$ or $\neg p_{k_j}$.

disproof: a proof that a statement is false.

divisibility lattice: the lattice consisting of the positive integers under the relation 
of divisibility. 

domain (of a function): the set on which a function acts. 

element (of a set): member of the set; the notation $a \in A$ means that a is an element
of A.

elementary projection function: the function $\pi_i\colon X_1 \times \cdots \times X_n \to X_i$ such that
$\pi_i(x_1, \ldots, x_n) = x_i$.

empty set: the set with no elements, written $\emptyset$ or $\{\ \}$.

epimorphism: an onto function. 

equality (of sets): property that two sets have the same elements. 

equivalence class: given an equivalence relation on a set A and $a \in A$, the subset
of A consisting of all elements related to a.

equivalence relation: a binary relation that is reflexive, symmetric, and transitive. 

equivalent propositions: two compound propositions (on the same simple variables) 
with the same truth table. 

existential quantifier: the quantifier $\exists x$, read “there is an x”.

existentially quantified predicate: a statement $(\exists x)P(x)$ that there exists a value
of x such that P(x) is true.

exponential function: any function of the form $b^x$, b a positive constant, $b \neq 1$.

fact set: set of ground atomic formulas. 

factorial (function): the function $n!$ whose value on the argument n is the product
$1 \cdot 2 \cdot 3 \cdots n$; that is, $n! = 1 \cdot 2 \cdot 3 \cdots n$.

finite: property of a set that it is either empty or else can be put in a one-to-one 
correspondence with a set {1,2,3,..., n} for some positive integer n. 

first-order logic: See predicate calculus. 

floor (of x): the greatest integer less than or equal to x, written $\lfloor x \rfloor$.

formula: a logical expression constructed from atoms with conjunctions, disjunctions, 
and negations, possibly with some logical quantifiers. 

full conjunctive normal form: conjunctive normal form where each disjunction is a 
disjunction of all variables or their negations. 

full disjunctive normal form: disjunctive normal form where each conjunction is a 
conjunction of all variables or their negations. 

fully parenthesized proposition: any proposition that can be obtained using the
following recursive definition: each variable is fully parenthesized; if P and Q are
fully parenthesized, so are $(\neg P)$, $(P \wedge Q)$, $(P \vee Q)$, $(P \to Q)$, and $(P \leftrightarrow Q)$.

function $f\colon A \to B$: a rule that assigns to every object a in the domain set A exactly
one object $f(a)$ in the codomain set B.

functionally complete set: a set of logical connectives from which all other connec- 
tives can be derived by composition. 

fuzzy logic: a system of logic in which each statement has a truth value in the interval
[0, 1].

fuzzy set: a set in which each element is associated with a number in the interval [0, 1] 
that measures its degree of membership. 

generalized continuum hypothesis: the assertion that for every infinite set S there
is no cardinal number greater than $|S|$ and less than $|\mathcal{P}(S)|$.

goal: a clause with an empty head.

graph (of a function): given a function $f\colon A \to B$, the set $\{\, (a, b) \mid b = f(a) \,\} \subseteq A \times B$.

greatest lower bound (of a subset of a poset): an element of the poset that is a lower 
bound of the subset and is greater than or equal to every other lower bound of the 
subset. 

ground formula: a formula without any variables. 

halting function: the function that maps computer programs to the set {0,1}, with 
value 1 if the program always halts, regardless of input, and 0 otherwise. 

Hasse diagram: a directed graph that represents a poset. 

head (of a clause $A_1, \ldots, A_n \leftarrow B_1, \ldots, B_m$): the literals $A_1, \ldots, A_n$ before $\leftarrow$.

identity function (on a set): given a set A, the function from A to itself whose value 
at x is x. 

image set (of a function) : the set of function values as x ranges over all objects of the 
domain. 

implication: formally, the relation $P \Rightarrow Q$ that a proposition Q is true whenever
proposition P is true; informally, a synonym for the conditional statement $p \to q$.

incomparable: two elements in a poset that are not related by the partial order 
relation. 

induced partition (on a set under an equivalence relation): the set of equivalence 
classes under the relation. 

independent: property of a set of axioms that none of the axioms can be deduced 
from the other axioms. 

indirect proof: a proof of $p \to q$ that assumes $\neg q$ is true and proves that $\neg p$ is true.

induction: See mathematical induction. 

induction hypothesis: in a mathematical induction proof, the statement P(x_k) in
the induction step.

induction step: in a mathematical induction proof, a proof of the induction premise
“if P(x_k) is true, then P(x_{k+1}) is true”.

inductive proof : See mathematical induction. 

infinite (set): a set that is not finite. 

injection (or injective function): a one-to-one function. 

instance (of a formula): formula obtained using a substitution. 

instantiation: substitution of concrete values for the free variables of a statement or 
sequence of statements; an instance of a production rule. 

integer: a whole number, possibly zero or negative; i.e., one of the elements in the set
Z = {..., -2, -1, 0, 1, 2, ...}.


© 2000 by CRC Press LLC 



intersection: the set A ∩ B of objects common to both sets A and B.

intersection relation: for binary relations R and S on A, the relation R ∩ S where
a(R ∩ S)b if and only if aRb and aSb.

interval (in a poset): given a ≤ b in a poset, a subset of the poset consisting of all
elements x such that a ≤ x ≤ b.

inverse function: for a one-to-one, onto function f: X → Y, the function f^{-1}: Y → X
whose value at y ∈ Y is the unique x ∈ X such that f(x) = y.

inverse image (under f: X → Y of a subset T ⊆ Y): the subset { x ∈ X | f(x) ∈ T },
written f^{-1}(T).

inverse relation: for a binary relation R from A to B, the relation R^{-1} from B to A
where bR^{-1}a if and only if aRb.

invertible (function): a one-to-one and onto function; a function that has an inverse. 

irrational number: a real number that is not rational. 

irreflexive: property of a binary relation R on A that aRa does not hold, for all a ∈ A.

lattice: a poset in which every pair of elements has both a least upper bound and a 
greatest lower bound. 

least upper bound (of a subset of a poset): an element of the poset that is an upper 
bound of the subset and is less than or equal to every other upper bound of the 
subset. 

lemma: a theorem that is an intermediate step in the proof of a more important 
theorem. 

linearly ordered: the property of a poset that every pair of elements are comparable, 
also called totally ordered. 

literal: an atom or its negation. 

little-oh notation: f is o(g) if lim_{x→∞} |f(x)/g(x)| = 0.

logarithmic function: a function log_b x (b a positive constant, b ≠ 1) defined by the
rule log_b x = y if and only if b^y = x.

logic program: a finite sequence of definite clauses. 

logically equivalent propositions: compound propositions that involve the same 
variables and have the same truth table. 

logically implies: A compound proposition P logically implies a compound proposi- 
tion Q if Q is true whenever P is true. 

loop invariant: an expression that specifies the circumstance under which the loop 
body will be executed again. 

lower bound (for a subset of a poset): an element of the poset that is less than or 
equal to every element of the subset. 

mathematical induction: a method of proving that every item of a sequence of
propositions such as P(n_0), P(n_0 + 1), P(n_0 + 2), ... is true by showing: (1) P(n_0)
is true, and (2) for all n ≥ n_0, P(n) → P(n + 1) is true.

maximal element: in a poset an element that has no element greater than it. 

maximum element: in a poset an element greater than or equal to every element. 

membership function (in fuzzy logic): a function from elements of a set to [0,1].

membership table (for a set expression): a table used to calculate whether an object
lies in the set described by the expression, based on its membership in the sets
mentioned by the expression.

minimal element : in a poset an element that has no element smaller than it. 
minimum element : in a poset an element less than or equal to every element. 
monomorphism: a one-to-one function. 

multi-valued logic: a logic system with a set of more than two truth values. 

multiset: an extension of the set concept, in which each element may occur arbitrarily 
many times. 

mutually disjoint (family of sets): (See pairwise disjoint.) 
n-ary predicate: a statement involving n variables. 
n-ary relation: any subset of A_1 × A_2 × ··· × A_n.

naive set theory : set theory where any collection of objects can be considered to be 
a valid set, with paradoxes ignored. 

NAND: the logical connective “not and”. 

natural number: a nonnegative integer (or “counting” number); i.e., an element of
N = {0, 1, 2, 3, ...}. Note: Sometimes 0 is not regarded as a natural number.

negation: the statement ¬p (“not p”) that is true if and only if p is not true.

NOP: pronounced “no-op” , a program instruction that does nothing to alter the values 
of computational variables or the order of execution. 

NOR: the logical connective “not or”. 

NOT: the logical connective meaning “not”, used in place of ¬.

null set: the set with no elements, written ∅ or { }.

omega notation: f is Ω(g) if there are constants C and k such that |g(x)| ≤ C|f(x)|
for all x > k.

one-to-one (function): a function f: X → Y that assigns distinct elements of the
codomain to distinct elements of the domain; thus, if x_1 ≠ x_2, then f(x_1) ≠ f(x_2).

onto (function): a function f: X → Y whose image equals its codomain; i.e., for every
y ∈ Y, there is an x ∈ X such that f(x) = y.

OR: the logical operator for disjunction, also written ∨.

pairwise disjoint: property of a family of sets that each two distinct sets in the family 
have empty intersection; also called mutually disjoint. 

paradox: a statement that contradicts itself. 

partial function: a function f: X → Y that assigns a well-defined object in Y to some
(but not necessarily all) of the elements of its domain X.
partial order: a binary relation that is reflexive, antisymmetric, and transitive. 
partially ordered set: a set with a partial order relation defined on it. 

partition (of a set): given a set S, a pairwise disjoint family P = {A_j} of nonempty
subsets of S whose union is S.

Peano definition: a recursive description of the natural numbers that uses the concept 
of successor. 

Polish prefix notation: the style of writing compound propositions in prefix notation
where sometimes the usual operator symbols are replaced as follows: N for ¬, K for ∧,
A for ∨, C for →, E for ↔.

poset : a partially ordered set. 

postcondition: an assertion that appears immediately after the executable portion of 
a program fragment or of a subprogram. 

postfix notation: the style of writing compound logical propositions where operators 
are written to the right of the operands. 

power (of a relation): for a relation R on A, the relation R^n on A where R^0 = I,
R^1 = R and R^n = R^{n-1} ∘ R for all n > 1.

power set: given a set A, the set P(A) of all subsets of A.

precondition: an assertion that appears immediately before the executable portion of 
a program fragment or of a subprogram. 

predicate: a statement involving one or more variables that range over various
domains.

predicate calculus: the symbolic study of quantified predicate statements. 

prefix notation: the style of writing compound logical propositions where operators 
are written to the left of the operands. 

premise: a proposition taken as the foundation of a proof, from which the conclusion 
is to be derived. 

prenex normal form: the form of a well-formed formula in which every quantifier 
occurs at the beginning and the scope is whatever follows the quantifiers. 

preorder: a binary relation that is reflexive and transitive. 

primary key: for an n-ary relation on A_1, A_2, ..., A_n, a coordinate domain A_j such
that for each x ∈ A_j there is at most one n-tuple in the relation whose jth coordinate
is x.

production rule: a formula of the form C_1, ..., C_n → A_1, ..., A_m where each C_i is a
condition and each A_i is an action.

production system: a set of production rules and a fact set. 

program construct: See construct. 

program fragment: any sequence of program code, from a single instruction to an 
entire program. 

program semantics (or semantics) : the meaning of an instruction or of a program 
fragment; i.e., the effect of its execution on the computational variables. 

projection function: a function defined on a set of n-tuples that selects the elements 
in certain coordinate positions. 

proof (of a conclusion from a set of premises) : a sequence of statements (called steps) 
terminating in the conclusion, such that each step is either a premise or follows from 
previous steps by a valid argument. 

proof by contradiction: a proof that assumes the negation of the statement to be 
proved and shows that this leads to a contradiction. 

proof done by hand: a proof done by a human without the use of a computer. 

proper subset: given a set S, a subset T of S such that S contains at least one element
not in T.

proposition: a declarative sentence or statement that is unambiguously either true or
false.

propositional calculus : the symbolic study of propositions. 

range (of a function): the image set of a function; sometimes used as synonym for 
codomain. 

rational number: the ratio a/b of two integers such that b ≠ 0; the set of all rational
numbers is denoted Q.

real number: a number expressible as a finite (i.e., terminating) or infinite decimal;
the set of all real numbers is denoted R.

recursive definition (of a function with domain N): a set of initial values and a rule
for computing f(n) in terms of values f(k) for k < n.

recursive definition (of a set S): a form of specification of membership of S, in which 
some basis elements are named individually, and in which a computable rule is given 
to construct each other element in a finite number of steps. 

refinement of a partition: given a partition P_1 = {A_j} on a set S, a partition
P_2 = {B_i} on the same set S such that every B_i ∈ P_2 is a subset of some A_j ∈ P_1.

reflexive: the property of a binary relation R that aRa.
relation (from set A to set B): a binary relation from A to B.
relation (on a set A): a binary relation from A to A.

restriction (of a function): given f: X → Y and a subset S ⊆ X, the function f|S
with domain S and codomain Y whose rule is the same as that of f.

reverse Polish notation: postfix notation. 
rule of inference: a valid argument form. 

scope (of a quantifier): the predicate to which the quantifier applies. 

semantic axiom: See axiom. 

semantics: See program semantics. 

sentence: a well-formed formula with no free variables. 

sequence (in a set): a list of objects from a set S, with repetitions allowed; that is, a
function f: N → S (an infinite sequence, often written a_0, a_1, a_2, ...) or a function
f: {1, 2, ..., n} → S (a finite sequence, often written a_1, a_2, ..., a_n).

set: a well-defined collection of objects. 
singleton: a set with one element. 

specification: in program correctness, a precondition and a postcondition. 

statement form: a declarative sentence containing some variables and logical symbols 
which becomes a proposition if concrete values are substituted for all free variables. 

string: a finite sequence in a set S, usually written so that consecutive entries are 
juxtaposed (i.e., written with no punctuation or extra space between them). 

strongly correct code: code whose execution terminates in a computational state 
satisfying the postcondition, whenever the precondition holds before execution. 

subset of a set S: any set T of objects that are also elements of S, written T ⊆ S.
substitution: a set of pairs of variables and terms.
surjection (or surjective function): an onto function.

symmetric: the property of a binary relation R that if aRb then bRa.

symmetric difference (of relations): for relations R and S on A, the relation R ⊕ S
where a(R ⊕ S)b if and only if exactly one of the following is true: aRb, aSb.

symmetric difference (of sets): for sets A and B, the set A ⊕ B containing each
object that is an element of A or an element of B, but not an element of both.

system of distinct representatives: given sets A_1, A_2, ..., A_n (some of which may
be equal), a set {a_1, a_2, ..., a_n} of n distinct elements with a_i ∈ A_i for i = 1, 2, ..., n.

tautology: a compound proposition whose form makes it always true, regardless of 
the truth values of its atomic parts. 

term (in a domain): either a fixed element of a domain S or an S-valued variable.

theorem: a statement derived as the conclusion of a valid proof from axioms and 
definitions. 

theta notation: f is Θ(g), written f = Θ(g), if there are positive constants C_1, C_2,
and k such that C_1|g(x)| ≤ |f(x)| ≤ C_2|g(x)| for all x > k.

totally ordered: the property of a poset that every pair of elements are comparable; 
also called linearly ordered. 

transitive: the property of a binary relation R that if aRb and bRc , then aRc. 

transitive closure: for a relation R on A, the smallest transitive relation containing R. 

transitive reduction (of a relation): a relation with the same transitive closure as 
the original relation and with a minimum number of ordered pairs. 

truth table: for a compound proposition, a table that gives the truth value of the 
proposition for each possible combination of truth values of the atomic variables in 
the proposition. 

two-valued logic: a logic system where each statement has exactly one of the two 
values: true or false. 

union: the set A ∪ B of objects in one or both of the sets A and B.

union relation: for R and S binary relations on A, the relation R ∪ S where a(R ∪ S)b
if and only if aRb or aSb.

universal domain: the collection of all possible objects in the context of the
immediate discussion.

universal quantifier: the quantifier ∀x, read “for all x” or “for every x”.

universally quantified predicate: a statement (∀x)P(x) that P(x) is true for
every x in its universe of discourse.

universe of discourse: the range of possible values of a variable, within the context 
of the immediate discussion. 

upper bound (for a subset of a poset): an element of the poset that is greater than 
or equal to every element of the subset. 

valid argument form: an argument form such that in any instantiation where all the 
premises are true, the conclusion is also true. 

Venn diagram: a figure composed of possibly overlapping circles or ellipses, used to 
picture membership in various combinations of the sets. 

verification (of a program): a formal argument for the correctness of a program with
respect to its specifications.

weakly correct code: code whose execution results in a computational state satisfying
the postcondition, whenever the precondition holds before execution and the
execution terminates.

well-formed formula (wff): a proposition or predicate with quantifiers that bind
one or more of its variables.

well-ordered: property of a set that every nonempty subset has a minimum element. 

well-ordering principle: the axiom that every nonempty subset of integers, each 
greater than a fixed integer, contains a smallest element. 

XOR: the logical connective “exclusive or”.

Zermelo-Fraenkel axioms : a set of axioms for set theory. 

zero-order logic: propositional calculus. 


1.1 PROPOSITIONAL AND PREDICATE LOGIC

Logic is the basis for distinguishing what may be correctly inferred from a given
collection of facts. Propositional logic, where there are no quantifiers (so quantifiers
range over nothing), is called zero-order logic. Predicate logic, where quantifiers range
over members of a universe, is called first-order logic. Higher-order logic includes
second-order logic (where quantifiers can range over relations over the universe),
third-order logic (where quantifiers can range over relations over relations), and so on.
Logic has many applications in computer science, including circuit design (§5.8.3) and
verification of computer program correctness (§1.6). This section defines the meaning of
the symbolism and various logical properties that are usually used without explicit mention.
[FlPa88], [Me79], [Mo76]

In this section, only two-valued logic is studied; i.e., each statement is either true
or false. Multi-valued logic, in which statements have one of more than two values, is
discussed in §1.7.2.


1.1.1 PROPOSITIONS AND LOGICAL OPERATIONS
Definitions: 

A truth value is either true or false, abbreviated T and F, respectively. 

A proposition (in a natural language such as English) is a declarative sentence that 
has a well-defined truth value. 

A propositional variable is a mathematical variable, often denoted by p, q, or r, that 
represents a proposition. 

Propositional logic (or propositional calculus or zero-order logic ) is the study 
of logical propositions and their combinations using logical connectives. 

A logical connective is an operation used to build more complicated logical expressions
out of simpler propositions, whose truth values depend only on the truth values of the
simpler propositions.

A proposition is atomic or simple if it cannot be syntactically analyzed into smaller
parts; it is usually represented by a single logical variable.

A proposition is compound if it contains one or more logical connectives. 

A truth table is a table that prescribes the defining rule for a logical operation. That 
is, for each combination of truth values of the operands, the table gives the truth value 
of the expression formed by the operation and operands. 

The unary connective negation (denoted by ¬) is defined by the following truth table:

    p    ¬p
    T    F
    F    T

Note: The negation is also written p', p̄, or ~p.
The common binary connectives are: 


    p ∧ q               conjunction      p and q
    p ∨ q               disjunction      p or q
    p → q               conditional      if p then q
    p ↔ q               biconditional    p if and only if q
    p ⊕ q               exclusive or     p xor q
    p ↓ q               not or           p nor q
    p | q or p ↑ q      not and          p nand q

The connective | is called the Sheffer stroke. The connective ↓ is called the Peirce arrow.
The values of the compound propositions obtained by using the binary connectives are
given in the following table:

    p   q   p ∨ q   p ∧ q   p → q   p ↔ q   p ⊕ q   p ↓ q   p | q
    T   T     T       T       T       T       F       F       F
    T   F     T       F       F       F       T       F       T
    F   T     T       F       T       F       T       F       T
    F   F     F       F       T       T       F       T       T
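
The table above can be regenerated mechanically. The following is a minimal sketch, assuming Python's built-in booleans stand in for T and F; the names CONNECTIVES, truth_table, and show are illustrative choices, not notation from this section.

```python
from itertools import product

# Each binary connective is modeled as a function from two truth values
# to one truth value. Key order matches the column order of the table.
CONNECTIVES = {
    "or":   lambda p, q: p or q,
    "and":  lambda p, q: p and q,
    "cond": lambda p, q: (not p) or q,   # p -> q
    "iff":  lambda p, q: p == q,         # p <-> q
    "xor":  lambda p, q: p != q,
    "nor":  lambda p, q: not (p or q),   # Peirce arrow
    "nand": lambda p, q: not (p and q),  # Sheffer stroke
}

def truth_table():
    """Return one row (p, q, value of each connective) per assignment,
    in the order TT, TF, FT, FF."""
    rows = []
    for p, q in product([True, False], repeat=2):
        rows.append((p, q) + tuple(f(p, q) for f in CONNECTIVES.values()))
    return rows

def show(value):
    """Render a truth value as T or F."""
    return "T" if value else "F"

if __name__ == "__main__":
    print("p q " + " ".join(CONNECTIVES))
    for row in truth_table():
        print(" ".join(show(v) for v in row))
```

Encoding the conditional as (not p) or q relies on the identity "conditional as disjunction", which is stated in §1.1.2.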


In the conditional p → q, p is the antecedent and q is the consequent. The conditional
p → q is often read informally as “p implies q”.

Infix notation is the style of writing compound propositions where binary operators 
are written between the operands and negation is written to the left of its operand. 

Prefix notation is the style of writing compound propositions where operators are 
written to the left of the operands. 

Postfix notation (or reverse Polish notation) is the style of writing compound 
propositions where operators are written to the right of the operands. 

Polish notation is the style of writing compound propositions where operators are
written using prefix notation and where the usual operator symbols are replaced as
follows: N for ¬, K for ∧, A for ∨, C for →, E for ↔. (Jan Łukasiewicz, 1878-1956)

A fully parenthesized proposition is any proposition that can be obtained using the
following recursive definition: each variable is fully parenthesized; if P and Q are fully
parenthesized, so are (¬P), (P ∧ Q), (P ∨ Q), (P → Q), and (P ↔ Q).
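
The recursive definition of a fully parenthesized proposition translates directly into a recursive recognizer. The sketch below is illustrative only: it assumes variables are single letters and that the input uses the symbols ¬, ∧, ∨, →, ↔; the function names are hypothetical.

```python
BINARY = {"∧", "∨", "→", "↔"}

def _parse(tokens, i):
    """Try to read one fully parenthesized proposition starting at index i.
    Return the index just past it, or -1 if none starts there."""
    if i >= len(tokens):
        return -1
    if tokens[i].isalpha():              # base case: a variable
        return i + 1
    if tokens[i] != "(":
        return -1
    if i + 1 < len(tokens) and tokens[i + 1] == "¬":
        j = _parse(tokens, i + 2)        # form (¬P)
    else:
        j = _parse(tokens, i + 1)        # form (P op Q): read P
        if j == -1 or j >= len(tokens) or tokens[j] not in BINARY:
            return -1
        j = _parse(tokens, j + 1)        # read Q
    if j == -1 or j >= len(tokens) or tokens[j] != ")":
        return -1
    return j + 1

def is_fully_parenthesized(text):
    """True if text is exactly one fully parenthesized proposition."""
    tokens = [c for c in text if not c.isspace()]
    return _parse(tokens, 0) == len(tokens)
```

For instance, is_fully_parenthesized("((¬p) ∧ q)") holds, while "p ∧ q" is rejected because the outer parentheses required by the definition are missing.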

Facts:

1. The conditional connective p → q represents the following English constructs:

• if p then q • q if p 

• p only if q • p implies q 

• q follows from p • q whenever p 

• p is a sufficient condition for q • q is a necessary condition for p. 

2. The biconditional connective p ↔ q represents the following English constructs:

• p if and only if q (often written p iff q) 

• p and q imply each other 

• p is a necessary and sufficient condition for q 

• p and q are equivalent. 

3. In computer programming and circuit design, the following notation for logical
operators is used: p AND q for p ∧ q, p OR q for p ∨ q, NOT p for ¬p, p XOR q for p ⊕ q,
p NOR q for p ↓ q, p NAND q for p | q.

4. Order of operations: In an unparenthesized compound proposition using only the
five standard operators ¬, ∧, ∨, →, and ↔, the following order of precedence is typically
used when evaluating a logical expression, at each level of precedence moving from left to
right: first ¬, then ∧ and ∨, then →, finally ↔. Parenthesized expressions are evaluated
proceeding from the innermost pair of parentheses outward, analogous to the evaluation
of an arithmetic expression.

5. It is often preferable to use parentheses to show precedence, except for negation 
operators, rather than to rely on precedence rules. 

6. No parentheses are needed when a compound proposition is written in either prefix or 
postfix notation. However, parentheses may be necessary when a compound proposition 
is written in infix notation. 

7. The number of nonequivalent logical statements with two variables is 16, because 
each of the four lines of the truth table has two possible entries, T or F. Here are 
examples of compound propositions that yield each possible combination of truth values. 
(T represents a tautology and F a contradiction. See §1.1.2.) 



8. The number of different possible logical connectives on n variables is 2^(2^n), because
there are 2^n rows in the truth table.
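
The counts in Facts 7 and 8 can be checked by brute-force enumeration. This is a small sketch under the assumption that a connective on n variables is identified with its column of 2^n outputs; the function names are illustrative.

```python
from itertools import product

def all_truth_functions(n):
    """Enumerate every truth function of n variables. Each function is
    represented by its output column: one truth value per row of the
    2**n-row truth table."""
    rows = 2 ** n
    return list(product([True, False], repeat=rows))

def count_truth_functions(n):
    """Closed form from Fact 8: two choices for each of the 2**n rows."""
    return 2 ** (2 ** n)
```

For n = 2 the enumeration yields 16 output columns, in agreement with Fact 7.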

Examples: 

1. “1+1 = 3” and “Romulus and Remus founded New York City” are false propositions. 

2. “1 + 1 = 2” and “The year 1996 was a leap year” are true propositions. 

3. “Go directly to jail” is not a proposition, because it is imperative, not declarative.

4. “x > 5” is not a proposition, because its truth value cannot be determined unless
the value of x is known.

5. “This sentence is false” is not a proposition, because it cannot be given a truth value 
without creating a contradiction. 

6. In a truth table evaluation of the compound proposition p ∨ (¬p ∧ q) from the
innermost parenthetic expression outward, the steps are to evaluate ¬p, next (¬p ∧ q),
and then p ∨ (¬p ∧ q):

    p   q   ¬p   (¬p ∧ q)   p ∨ (¬p ∧ q)
    T   T   F       F            T
    T   F   F       F            T
    F   T   T       T            T
    F   F   T       F            F


7. The statements in the left column are evaluated using the order of precedence
indicated in the fully parenthesized form in the right column:

    p ∨ q ∧ r               ((p ∨ q) ∧ r)
    p ↔ q → r               (p ↔ (q → r))
    ¬q ∨ ¬r → s ∧ t         (((¬q) ∨ (¬r)) → (s ∧ t))

8. The infix statement p ∧ q in prefix notation is ∧ p q, in postfix notation is p q ∧, and
in Polish notation is Kpq.

9. The infix statement p → ¬(q ∨ r) in prefix notation is → p ¬ ∨ q r, in postfix notation
is p q r ∨ ¬ →, and in Polish notation is CpNAqr.
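
Examples 8 and 9 can be reproduced by walking an expression tree. The sketch below assumes a proposition is stored as nested tuples, with the operator first; this representation and the function names are illustrative choices and are not taken from the handbook.

```python
# Letters used in Polish notation for each operator symbol.
POLISH = {"¬": "N", "∧": "K", "∨": "A", "→": "C", "↔": "E"}

# A proposition is either a variable name (a string) or a tuple:
#   ("¬", operand)   or   (binary_operator, left, right)

def to_prefix(tree):
    """Operator first, then the operands."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    return op + "".join(to_prefix(a) for a in args)

def to_postfix(tree):
    """Operands first, then the operator."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    return "".join(to_postfix(a) for a in args) + op

def to_polish(tree):
    """Prefix notation with the operator symbols replaced by letters."""
    if isinstance(tree, str):
        return tree
    op, *args = tree
    return POLISH[op] + "".join(to_polish(a) for a in args)

# Example 9: p → ¬(q ∨ r)
EXAMPLE_9 = ("→", "p", ("¬", ("∨", "q", "r")))
```

Because every operator has a fixed number of operands, neither output needs parentheses, which is the point of Fact 6.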


1.1.2 EQUIVALENCES, IDENTITIES, AND NORMAL FORMS
Definitions: 

A tautology is a compound proposition that is always true, regardless of the truth 
values of its underlying atomic propositions. 

A contradiction (or self-contradiction) is a compound proposition that is always 
false, regardless of the truth values of its underlying atomic propositions. (The term 
self-contradiction is used for such a proposition when discussing indirect mathematical 
arguments, because “contradiction” has another meaning in that context. See §1.5.) 

A compound proposition P logically implies a compound proposition Q, written
P ⇒ Q, if Q is true whenever P is true. In this case, P is stronger than Q, and Q is
weaker than P.

Compound propositions P and Q are logically equivalent, written P ≡ Q, P ⇔ Q, or
P iff Q, if they have the same truth values for all possible truth values of their variables.

A logical equivalence that is frequently used is sometimes called a logical identity. 

A collection C of connectives is functionally complete if every compound proposition 
is equivalent to a compound proposition constructed using only connectives in C. 

A disjunctive normal expression in the propositions p_1, p_2, ..., p_n is a disjunction of
one or more propositions, each of the form x_{k_1} ∧ x_{k_2} ∧ ··· ∧ x_{k_m}, where x_{k_j} is
either p_{k_j} or ¬p_{k_j}.

A disjunctive normal form (DNF) for a proposition P is a disjunctive normal
expression that is logically equivalent to P.

A conjunctive normal expression in the propositions p_1, p_2, ..., p_n is a conjunction
of one or more compound propositions, each of the form x_{k_1} ∨ x_{k_2} ∨ ··· ∨ x_{k_m}, where
x_{k_j} is either p_{k_j} or ¬p_{k_j}.

A conjunctive normal form (CNF) for a proposition P is a conjunctive normal 
expression that is logically equivalent to P. 

A compound proposition P using only the connectives ¬, ∧, and ∨ has a logical dual
(denoted P′ or P^d), obtained by interchanging ∧ and ∨ and interchanging the constant
T (true) and the constant F (false).

The converse of the conditional proposition p → q is the proposition q → p.

The contrapositive of the conditional proposition p → q is the proposition ¬q → ¬p.

The inverse of the conditional proposition p → q is the proposition ¬p → ¬q.
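
A four-row truth table is enough to compare a conditional with its converse, contrapositive, and inverse. The sketch below is illustrative; the helper names implies and equivalent are assumptions, and Python booleans stand in for T and F.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p → q."""
    return (not p) or q

def equivalent(f, g):
    """True if two propositions in p and q agree on all four assignments."""
    return all(f(p, q) == g(p, q)
               for p, q in product([True, False], repeat=2))

def conditional(p, q):
    return implies(p, q)

def converse(p, q):
    return implies(q, p)

def contrapositive(p, q):
    return implies(not q, not p)

def inverse(p, q):
    return implies(not p, not q)
```

Running the comparison shows that the conditional agrees with its contrapositive on every row and the converse agrees with the inverse, while the conditional and its converse differ, for example at p = T, q = F.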

Facts: 

1. P ⇔ Q is true if and only if P ⇒ Q and Q ⇒ P.

2. P ⇔ Q is true if and only if P ↔ Q is a tautology.

3. Table 1 lists several logical identities. 

4. There are different ways to establish logical identities (equivalences): 

• truth tables (showing that both expressions have the same truth values); 

• using known logical identities and equivalence to establish new ones; 

• taking the dual of a known identity (Fact 7). 

5. Logical identities are used in circuit design to simplify circuits. See §5.8.4. 

6. Each of the following sets of connectives is functionally complete: 

{∧, ∨, ¬}, {∧, ¬}, {∨, ¬}, { | }, { ↓ }.

However, these sets of connectives are not functionally complete:

{∧}, {∨}, {∧, ∨}.

7. If P ⇔ Q is a logical identity, then so is P′ ⇔ Q′, where P′ and Q′ are the duals
of P and Q, respectively.

8. Every proposition has a disjunctive normal form and a conjunctive normal form, 
which can be obtained by Algorithms 1 and 2. 


Algorithm 1: Disjunctive normal form of proposition P.

write the truth table for P
for each line of the truth table on which P is true, form a “line term”
    x_1 ∧ x_2 ∧ ··· ∧ x_n, where x_i := p_i if p_i is true on that line of the truth table
    and x_i := ¬p_i if p_i is false on that line
form the disjunction of all these line terms


Algorithm 2: Conjunctive normal form of proposition P.

write the truth table for P
for each line of the truth table on which P is false, form a “line term”
    x_1 ∨ x_2 ∨ ··· ∨ x_n, where x_i := p_i if p_i is false on that line of the truth table
    and x_i := ¬p_i if p_i is true on that line
form the conjunction of all these line terms
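
Algorithms 1 and 2 can be sketched in a few lines. The version below assumes the proposition is supplied as a Python function of boolean arguments together with a list of variable names; the names line_terms, dnf, and cnf are illustrative. A tautology produces no CNF line terms and a contradiction produces no DNF line terms, so the corresponding string is empty in those cases.

```python
from itertools import product

def line_terms(f, names):
    """Return (dnf_terms, cnf_terms) for the truth function f.
    Rows are visited in the order TT...T down to FF...F."""
    dnf_terms, cnf_terms = [], []
    for values in product([True, False], repeat=len(names)):
        if f(*values):
            # Algorithm 1: use p_i where it is true, ¬p_i where it is false.
            lits = [n if v else "¬" + n for n, v in zip(names, values)]
            dnf_terms.append("(" + " ∧ ".join(lits) + ")")
        else:
            # Algorithm 2: use p_i where it is false, ¬p_i where it is true.
            lits = ["¬" + n if v else n for n, v in zip(names, values)]
            cnf_terms.append("(" + " ∨ ".join(lits) + ")")
    return dnf_terms, cnf_terms

def dnf(f, names):
    """Disjunction of the line terms for the rows where f is true."""
    return " ∨ ".join(line_terms(f, names)[0])

def cnf(f, names):
    """Conjunction of the line terms for the rows where f is false."""
    return " ∧ ".join(line_terms(f, names)[1])

def nand(p, q):
    return not (p and q)
```

Applied to nand with names ["p", "q"], the sketch yields three DNF line terms and the single CNF line term (¬p ∨ ¬q).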

Table 1 Logical identities.

    name                            rule
    Commutative laws                p ∧ q ⇔ q ∧ p                        p ∨ q ⇔ q ∨ p
    Associative laws                p ∧ (q ∧ r) ⇔ (p ∧ q) ∧ r            p ∨ (q ∨ r) ⇔ (p ∨ q) ∨ r
    Distributive laws               p ∧ (q ∨ r) ⇔ (p ∧ q) ∨ (p ∧ r)      p ∨ (q ∧ r) ⇔ (p ∨ q) ∧ (p ∨ r)
    DeMorgan's laws                 ¬(p ∧ q) ⇔ (¬p) ∨ (¬q)               ¬(p ∨ q) ⇔ (¬p) ∧ (¬q)
    Excluded middle                 p ∨ ¬p ⇔ T
    Contradiction                   p ∧ ¬p ⇔ F
    Double negation law             ¬(¬p) ⇔ p
    Contrapositive law              p → q ⇔ ¬q → ¬p
    Conditional as disjunction      p → q ⇔ ¬p ∨ q
    Negation of conditional         ¬(p → q) ⇔ p ∧ ¬q
    Biconditional as implication    (p ↔ q) ⇔ (p → q) ∧ (q → p)
    Idempotent laws                 p ∧ p ⇔ p                            p ∨ p ⇔ p
    Absorption laws                 p ∧ (p ∨ q) ⇔ p                      p ∨ (p ∧ q) ⇔ p
    Dominance laws                  p ∨ T ⇔ T                            p ∧ F ⇔ F
    Exportation law                 p → (q → r) ⇔ (p ∧ q) → r
    Identity laws                   p ∧ T ⇔ p                            p ∨ F ⇔ p
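
Any entry of Table 1 can be confirmed by exhaustive truth-table comparison, the first method listed in Fact 4. The sketch below checks a handful of the identities; the dictionary IDENTITIES and the helper names are illustrative, and only a sample of the table is encoded.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p → q."""
    return (not p) or q

def is_identity(lhs, rhs, nvars):
    """True if lhs and rhs agree on every assignment of nvars variables."""
    return all(lhs(*v) == rhs(*v)
               for v in product([True, False], repeat=nvars))

# name -> (left side, right side, number of variables)
IDENTITIES = {
    "DeMorgan (and)": (lambda p, q: not (p and q),
                       lambda p, q: (not p) or (not q), 2),
    "DeMorgan (or)":  (lambda p, q: not (p or q),
                       lambda p, q: (not p) and (not q), 2),
    "distributive":   (lambda p, q, r: p and (q or r),
                       lambda p, q, r: (p and q) or (p and r), 3),
    "absorption":     (lambda p, q: p or (p and q),
                       lambda p, q: p, 2),
    "negation of conditional": (lambda p, q: not implies(p, q),
                                lambda p, q: p and (not q), 2),
    "exportation":    (lambda p, q, r: implies(p, implies(q, r)),
                       lambda p, q, r: implies(p and q, r), 3),
}

def check_all():
    """Map each identity name to whether the two sides agree everywhere."""
    return {name: is_identity(lhs, rhs, n)
            for name, (lhs, rhs, n) in IDENTITIES.items()}
```

The check costs 2^n evaluations per identity, which is trivial for the two- and three-variable laws in the table.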


Examples: 

1. The proposition p ∨ ¬p is a tautology (the law of the excluded middle).

2. The proposition p ∧ ¬p is a self-contradiction.

3. A proof that p ↔ q is logically equivalent to (p ∧ q) ∨ (¬p ∧ ¬q) can be carried out
using a truth table:

    p   q   p ↔ q   ¬p   ¬q   p ∧ q   ¬p ∧ ¬q   (p ∧ q) ∨ (¬p ∧ ¬q)
    T   T     T     F    F      T        F                T
    T   F     F     F    T      F        F                F
    F   T     F     T    F      F        F                F
    F   F     T     T    T      F        T                T

Since the third and eighth columns of the truth table are identical, the two statements
are equivalent.


4. A proof that p ↔ q is logically equivalent to (p ∧ q) ∨ (¬p ∧ ¬q) can be given by a
series of logical equivalences. Reasons are given at the right.

    p ↔ q  ⇔  (p → q) ∧ (q → p)                                biconditional as implication
           ⇔  (¬p ∨ q) ∧ (¬q ∨ p)                              conditional as disjunction
           ⇔  [(¬p ∨ q) ∧ ¬q] ∨ [(¬p ∨ q) ∧ p]                 distributive law
           ⇔  [(¬p ∧ ¬q) ∨ (q ∧ ¬q)] ∨ [(¬p ∧ p) ∨ (q ∧ p)]    distributive law
           ⇔  [(¬p ∧ ¬q) ∨ F] ∨ [F ∨ (q ∧ p)]                  contradiction
           ⇔  [(¬p ∧ ¬q) ∨ F] ∨ [(q ∧ p) ∨ F]                  commutative law
           ⇔  (¬p ∧ ¬q) ∨ (q ∧ p)                              identity law
           ⇔  (¬p ∧ ¬q) ∨ (p ∧ q)                              commutative law
           ⇔  (p ∧ q) ∨ (¬p ∧ ¬q)                              commutative law

5. The proposition p ↓ q is logically equivalent to ¬(p ∨ q). Its DNF is ¬p ∧ ¬q, and its
CNF is (¬p ∨ ¬q) ∧ (¬p ∨ q) ∧ (p ∨ ¬q).

6. The proposition p | q is logically equivalent to ¬(p ∧ q). Its DNF is (p ∧ ¬q) ∨ (¬p ∧
q) ∨ (¬p ∧ ¬q), and its CNF is ¬p ∨ ¬q.

7. The DNF and CNF for Examples 5 and 6 were obtained by using Algorithm 1 and
Algorithm 2 to construct the following tables of terms:

    p   q   p | q   DNF terms   CNF terms
    T   T     F                 ¬p ∨ ¬q
    T   F     T     p ∧ ¬q
    F   T     T     ¬p ∧ q
    F   F     T     ¬p ∧ ¬q

    p   q   p ↓ q   DNF terms   CNF terms
    T   T     F                 ¬p ∨ ¬q
    T   F     F                 ¬p ∨ q
    F   T     F                 p ∨ ¬q
    F   F     T     ¬p ∧ ¬q


8. The dual of p ∧ (q ∨ ¬r) is p ∨ (q ∧ ¬r).

9. Let S be the proposition in three propositional variables p, q, and r that is true
when precisely two of the variables are true. Then the disjunctive normal form for S is

(p ∧ q ∧ ¬r) ∨ (p ∧ ¬q ∧ r) ∨ (¬p ∧ q ∧ r)

and the conjunctive normal form for S is

(¬p ∨ ¬q ∨ ¬r) ∧ (¬p ∨ q ∨ r) ∧ (p ∨ ¬q ∨ r) ∧ (p ∨ q ∨ ¬r) ∧ (p ∨ q ∨ r).
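
Fact 6 states that { | } and { ↓ } are each functionally complete. One way to see this, sketched below, is to build ¬, ∧, and ∨ from each connective alone; since {∧, ∨, ¬} is functionally complete, this suffices. The particular constructions and all function names here are illustrative and are not taken from the handbook.

```python
from itertools import product

def nand(p, q):
    """Sheffer stroke p | q."""
    return not (p and q)

def nor(p, q):
    """Peirce arrow p ↓ q."""
    return not (p or q)

# ¬, ∧, ∨ built from NAND alone.
def not_from_nand(p):
    return nand(p, p)

def and_from_nand(p, q):
    return nand(nand(p, q), nand(p, q))

def or_from_nand(p, q):
    return nand(nand(p, p), nand(q, q))

# ¬, ∨, ∧ built from NOR alone.
def not_from_nor(p):
    return nor(p, p)

def or_from_nor(p, q):
    return nor(nor(p, q), nor(p, q))

def and_from_nor(p, q):
    return nor(nor(p, p), nor(q, q))

def constructions_agree():
    """Compare every construction with the built-in operator on all rows."""
    for p, q in product([True, False], repeat=2):
        if not_from_nand(p) != (not p) or not_from_nor(p) != (not p):
            return False
        if and_from_nand(p, q) != (p and q) or and_from_nor(p, q) != (p and q):
            return False
        if or_from_nand(p, q) != (p or q) or or_from_nor(p, q) != (p or q):
            return False
    return True
```

The constructions for ∨ from NAND and for ∧ from NOR are DeMorgan's laws from Table 1 written with a single connective.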


1.1.3 PREDICATE LOGIC 
Definitions: 

A predicate is a declarative statement with the symbolic form P(x) or P(x_1, ..., x_n)
about one or more variables x or x_1, ..., x_n whose values are unspecified.

Predicate logic (or predicate calculus or first-order logic) is the study of
statements whose variables have quantifiers.

The universe of discourse (or universe or domain) of a variable is the set of possible 
values of the variable in a predicate. 

An instantiation of the predicate P(x) is the result of substituting a fixed constant
value c from the domain of x for each free occurrence of x in P(x). This is denoted by
P(c).

The existential quantification of a predicate P(x) whose variable ranges over a
domain set D is the proposition (∃x ∈ D)P(x) or (∃x)P(x) that is true if there is at least
one c in D such that P(c) is true. The existential quantifier symbol, ∃, is read “there
exists” or “there is”.

The universal quantification of a predicate P(x) whose variable ranges over a domain
set D is the proposition (∀x ∈ D)P(x) or (∀x)P(x), which is true if P(c) is true for
every element c in D. The universal quantifier symbol, ∀, is read “for all”, “for each”,
or “for every”.

The unique existential quantification of a predicate P(x) whose variable ranges
over a domain set D is the proposition (∃!x)P(x) that is true if P(c) is true for exactly
one c in D. The unique existential quantifier symbol, ∃!, is read “there is exactly one”.

The scope of a quantifier is the predicate to which it applies.

A variable x in a predicate P(x) is a bound variable if it lies inside the scope of an
x-quantifier. Otherwise it is a free variable.

A well-formed formula (wff) (or statement) is either a proposition or a predicate 
with quantifiers that bind one or more of its variables. 

A sentence ( closed wff) is a well-formed formula with no free variables. 

A well-formed formula is in prenex normal form if all the quantifiers occur at the 
beginning and the scope is whatever follows the quantifiers. 

A well-formed formula is atomic if it does not contain any logical connectives; otherwise 
the well-formed formula is compound. 

Higher-order logic is the study of statements that allow quantifiers to range over 
relations over a universe (second-order logic), relations over relations over a universe 
(third-order logic), etc. 

Facts: 

1. If a predicate P(x) is atomic, then the scope of (∀x) in (∀x)P(x) is implicitly the 
entire predicate P(x). 

2. If a predicate is a compound form, such as P(x) ∧ Q(x), then (∀x)[P(x) ∧ Q(x)] 
means that the scope is P(x) ∧ Q(x), whereas (∀x)P(x) ∧ Q(x) means that the scope is 
only P(x), in which case the free variable x of the predicate Q(x) has no relationship 
to the variable x of P(x). 

3. Universal statements in predicate logic are analogues of conjunctions in propositional 
logic. If variable x has domain D = {x_1, ..., x_n}, then (∀x ∈ D)P(x) is true if and only 
if P(x_1) ∧ ... ∧ P(x_n) is true. 

4. Existential statements in predicate logic are analogues of disjunctions in 
propositional logic. If variable x has domain D = {x_1, ..., x_n}, then (∃x ∈ D)P(x) is true if 
and only if P(x_1) ∨ ... ∨ P(x_n) is true. 

5. Adjacent universal quantifiers [existential quantifiers] can be transposed without 
changing the meaning of a logical statement: 

(∀x)(∀y)P(x, y) ⇔ (∀y)(∀x)P(x, y) 

(∃x)(∃y)P(x, y) ⇔ (∃y)(∃x)P(x, y). 

6. Transposing adjacent logical quantifiers of different types can change the meaning 
of a statement. (See Example 4.) 

7. Rules for negations of quantified statements: 

¬(∀x)P(x) ⇔ (∃x)[¬P(x)] 

¬(∃x)P(x) ⇔ (∀x)[¬P(x)] 

¬(∃!x)P(x) ⇔ ¬(∃x)P(x) ∨ (∃y)(∃z)[(y ≠ z) ∧ P(y) ∧ P(z)]. 

8. Every quantified statement is logically equivalent to some statement in prenex nor- 
mal form. 

9. Every statement with a unique existential quantifier is equivalent to a statement 
that uses only existential and universal quantifiers, according to the rule: 

(∃!x)P(x) ⇔ (∃x)[P(x) ∧ (∀y)[P(y) → (x = y)]] 

where P(y) means that y has been substituted for all free occurrences of x in P(x), and 
where y is a variable that does not occur in P(x). 

10. If a statement uses only the connectives ∨, ∧, and ¬, the following equivalences 
can be used along with Fact 7 to convert the statement into prenex normal form. The 
letter A represents a wff without the variable x. 

(∀x)P(x) ∧ (∀x)Q(x)  ⇔  (∀x)[P(x) ∧ Q(x)] 
(∀x)P(x) ∨ (∀x)Q(x)  ⇔  (∀x)(∀y)[P(x) ∨ Q(y)] 
(∃x)P(x) ∧ (∃x)Q(x)  ⇔  (∃x)(∃y)[P(x) ∧ Q(y)] 
(∃x)P(x) ∨ (∃x)Q(x)  ⇔  (∃x)[P(x) ∨ Q(x)] 
(∀x)P(x) ∧ (∃x)Q(x)  ⇔  (∀x)(∃y)[P(x) ∧ Q(y)] 
(∀x)P(x) ∨ (∃x)Q(x)  ⇔  (∀x)(∃y)[P(x) ∨ Q(y)] 
A ∨ (∀x)P(x)         ⇔  (∀x)[A ∨ P(x)] 
A ∨ (∃x)P(x)         ⇔  (∃x)[A ∨ P(x)] 
A ∧ (∀x)P(x)         ⇔  (∀x)[A ∧ P(x)] 
A ∧ (∃x)P(x)         ⇔  (∃x)[A ∧ P(x)]. 
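
Over a finite domain, Facts 3 and 4 say that the quantifiers reduce to a finite conjunction and a finite disjunction, so several of the facts above can be spot-checked by direct evaluation. The sketch below assumes a small hypothetical domain `D` and illustrative predicates `P`, `Q`, and `is_zero`; it checks individual instances and is not a proof of the equivalences.

```python
D = range(-3, 4)  # hypothetical finite domain {-3, ..., 3}

def forall(domain, pred):
    # Fact 3: a universal statement is a finite conjunction.
    return all(pred(c) for c in domain)

def exists(domain, pred):
    # Fact 4: an existential statement is a finite disjunction.
    return any(pred(c) for c in domain)

def exists_unique(domain, pred):
    return sum(1 for c in domain if pred(c)) == 1

def P(x):
    return x * x == 4   # true exactly for -2 and 2

def Q(x):
    return x > 0

def is_zero(x):
    return x == 0

# Fact 7: negation rules for the universal and existential quantifiers.
neg_forall_ok = (not forall(D, P)) == exists(D, lambda x: not P(x))
neg_exists_ok = (not exists(D, P)) == forall(D, lambda x: not P(x))

# Fact 9: unique existence written with the two ordinary quantifiers.
def unique_via_fact9(domain, pred):
    return exists(domain, lambda x: pred(x) and
                  forall(domain, lambda y: (not pred(y)) or x == y))

unique_ok = all(exists_unique(D, R) == unique_via_fact9(D, R)
                for R in (P, Q, is_zero))

# Fact 10, second row: (forall x)P(x) or (forall x)Q(x) in prenex form.
prenex_ok = (forall(D, P) or forall(D, Q)) == forall(
    D, lambda x: forall(D, lambda y: P(x) or Q(y)))

print(neg_forall_ok, neg_exists_ok, unique_ok, prenex_ok)
```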


Examples: 

1. The statement (∀x ∈ R)(∀y ∈ R)[x + y = y + x] is syntactically a predicate 
preceded by two universal quantifiers. It asserts the commutative law for the addition of 
real numbers. 

2. The statement (∀x)(∃y)[xy = 1] expresses the existence of multiplicative inverses 
for all numbers in whatever domain is under discussion. Thus, it is true for the positive 
real numbers, but it is false when the domain is the entire set of reals, since zero has no 
multiplicative inverse. 

3. The statement (∀x ≠ 0)(∃y)[xy = 1] asserts the existence of multiplicative inverses 
for nonzero numbers. 

4. (∀x)(∃y)[x + y = 0] expresses the true proposition that every real number has an 
additive inverse, but (∃y)(∀x)[x + y = 0] is the false proposition that there is a “universal 
additive inverse” that when added to any number always yields the sum 0. 

5. In the statement (∀x ∈ R)[x + y = y + x], the variable x is bound and the variable y 
is free. 

6. “Not all men are mortal” is equivalent to “there exists at least one man who is not 
mortal”. Also, “there does not exist a cow that is blue” is equivalent to the statement 
“every cow is a color other than blue”. 

7. The statement (∀x)P(x) → (∀x)Q(x) is not in prenex form. An equivalent prenex 
form is (∀x)(∃y)[P(y) → Q(x)]. 

8. The following table illustrates the differences in meaning among the four different 
ways to quantify a predicate with two variables: 

statement              meaning 
(∃x)(∃y)[x + y = 0]    There is a pair of numbers whose sum is zero. 
(∀x)(∃y)[x + y = 0]    Every number has an additive inverse. 
(∃x)(∀y)[x + y = 0]    There is a universal additive inverse x. 
(∀x)(∀y)[x + y = 0]    The sum of every pair of numbers is zero. 

9. The statement (∀x)(∃!y)[x + y = 0] asserts the existence of unique additive inverses. 
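
The four quantifier patterns of Example 8 can be evaluated directly on a small domain. The sketch below uses the hypothetical finite domain {-2, ..., 2} in place of the real numbers, so it only illustrates how the order and type of the quantifiers change the truth value.

```python
D = range(-2, 3)  # hypothetical finite domain {-2, -1, 0, 1, 2}

def sums_to_zero(x, y):
    return x + y == 0

# (exists x)(exists y): some pair sums to zero.
exists_exists = any(sums_to_zero(x, y) for x in D for y in D)
# (forall x)(exists y): every element has an additive inverse in D.
forall_exists = all(any(sums_to_zero(x, y) for y in D) for x in D)
# (exists x)(forall y): a single x works for every y.
exists_forall = any(all(sums_to_zero(x, y) for y in D) for x in D)
# (forall x)(forall y): every pair sums to zero.
forall_forall = all(sums_to_zero(x, y) for x in D for y in D)

print(exists_exists, forall_exists, exists_forall, forall_forall)
```

On this domain the first two statements hold and the last two fail, matching the meanings listed in the table.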

1.2 SET THEORY 

Sets are used to group objects and to serve as the basic elements for building more 
complicated objects and structures. Counting elements in sets is an important part of 
discrete mathematics. 

Some general reference books that cover the material of this section are [FlPa88], 
[Ha60], [Ka50]. 


1.2.1 SETS 

Definitions: 

A set is any well-defined collection of objects, each of which is called a member or an 
element of the set. The notation x ∈ A means that the object x is a member of the 
set A. The notation x ∉ A means that x is not a member of A. 

A roster for a finite set specifies the membership of a set S as a list of its elements 
within braces, i.e., in the form S = {a_1, ..., a_n}. Order of the list is irrelevant, as is the 
number of occurrences of an object in the list. 

A defining predicate specifies a set in the form S = { x | P(x) }, where P(x) is a 
predicate containing the free variable x. This means that S is the set of all objects x 
(in whatever domain is under discussion) such that P(x) is true. 

A recursive description of a set S gives a roster B of basic objects of S and a set 
of operations for constructing additional objects of S from objects already known to be 
in S. That is, any object that can be constructed by a finite sequence of applications 
of the given operations to objects in B is also a member of S. There may also be a list 
of axioms that specify when two sequences of operations yield the same result. 

The set with no elements is called the null set or the empty set, denoted ∅ or { }. 

A singleton is a set with one element. 

The set N of natural numbers is the set {0, 1, 2, ...}. (Sometimes 0 is excluded from 
the set of natural numbers; when the set of natural numbers is encountered, check to 
see how it is being defined.) 

The set Z of integers is the set {..., −2, −1, 0, 1, 2, ...}. 

The set Q of rational numbers is the set of all fractions a/b where a is any integer and b 
is any nonzero integer. 

The set R of real numbers is the set of all numbers that can be written as terminating 
or nonterminating decimals. 

The set C of complex numbers is the set of all numbers of the form a + bi, where 
a, b ∈ R and i = √−1 (i^2 = −1). 

Sets A and B are equal, written A = B, if they have exactly the same elements: 

A = B ⇔ (∀x)[(x ∈ A) ↔ (x ∈ B)]. 

Set B is a subset of set A, written B ⊆ A or A ⊇ B, if each element of B is an element 
of A: 

B ⊆ A ⇔ (∀x)[(x ∈ B) → (x ∈ A)]. 

Set B is a proper subset of A if B is a subset of A and A contains at least one element 
not in B. (The notation B ⊂ A is often used to indicate that B is a proper subset of A, 
but sometimes it is used to mean an arbitrary subset. Sometimes the proper subset 
relationship is written B ⊊ A, to avoid all possible notational ambiguity.) 

A set is finite if it is either empty or else can be put in a one-to-one correspondence 
with the set {1, 2, 3, ..., n} for some positive integer n. 

A set is infinite if it is not finite. 

The cardinality |S| of a finite set S is the number of elements in S. 

A multiset is an unordered collection in which elements can occur arbitrarily often, 
not just once. The number of occurrences of an element is called its multiplicity. 

An axiom (postulate) is a statement that is assumed to be true. 

A set of axioms is consistent if no contradiction can be deduced from the axioms. 

A set of axioms is complete if it is possible to prove all true statements. 

A set of axioms is independent if none of the axioms can be deduced from the other 
axioms. 

A set paradox is a question in the language of set theory that seems to have no 
unambiguous answer. 

Naive set theory is set theory where any collection of objects can be considered to 
be a valid set, with paradoxes ignored. 

Facts: 

1. The theory of sets was first developed by Georg Cantor (1845-1918). 

2. A = B if and only if A ⊆ B and B ⊆ A. 

3. N ⊂ Z ⊂ Q ⊂ R ⊂ C. 

4. Every rational number can be written as a decimal that is either terminating or else 
repeating (i.e., the same block repeats end-to-end forever). 

5. Real numbers can be represented as the points on the number line, and include all 
rational numbers and all irrational numbers (such as √2, π, e, etc.). 

6. There is no set of axioms for set theory that is both complete and consistent. 

7. Naive set theory ignores paradoxes. To avoid such paradoxes, more axioms are 
needed. 

Examples: 

1. The set { x ∈ N | 3 ≤ x < 10 }, described by the defining predicate 3 ≤ x < 10, is 
equal to the set {3, 4, 5, 6, 7, 8, 9}, which is described by a roster. 

2. If A is the set with two objects, one of which is the number 5 and the other the set 
whose elements are the letters x, y, and z, then A = {5, {x, y, z}}. In this example, 
5 ∈ A, but x ∉ A, since x is not either member of A. 

3. The set E of even natural numbers can be described recursively as follows: 

Basic objects: 0 ∈ E, 

Recursion rule: if n ∈ E, then n + 2 ∈ E. 

4. The liar's paradox: A person says “I am lying”. Is the person lying or is the person 
telling the truth? If the person is lying, then “I am lying” is false, and hence the person 
is telling the truth. If the person is telling the truth, then “I am lying” is true, and 
the person is lying. This is also called the paradox of Epimenides. This paradox also 
results from considering the statement “This statement is false”. 


5. The barber paradox: In a small village populated only by men there is exactly one 
barber. The villagers follow the following rule: the barber shaves a man if and only 
if the man does not shave himself. Question: does the barber shave himself? If “yes” 
(i.e., the barber shaves himself), then according to the rule he does not shave himself. If 
“no” (i.e., the barber does not shave himself), then according to the rule he does shave 
himself. This paradox illustrates a danger in describing sets by defining predicates. 

6. Russell’s paradox: This paradox, named for the British logician Bertrand Russell 
(1872-1970), shows that the “set of all sets” is an ill-defined concept. If it really were a 
set, then it would be an example of a set that is a member of itself. Thus, some “sets” 
would contain themselves as elements and others would not. Let S be the “set” of “sets 
that are not elements of themselves”; i.e., S = { A | A ∉ A }. Question: is S a member 
of itself? If “yes”, then S is not a member of itself, because of the defining membership 
criterion. If “no”, then S is a member of itself, due to the defining membership criterion. 
One resolution is that the collection of all sets is not a set. (See Chapter 4 of [MiRo91].) 

7. Paradoxes such as those in Example 6 led Alfred North Whitehead (1861-1947) and 
Bertrand Russell to develop a version of set theory by categorizing sets based on set 
types: T_0, T_1, .... The lowest type, T_0, consists only of individual elements. For i > 0, 
type T_i consists of sets whose elements come from type T_{i−1}. This forces sets to belong 
to exactly one type. The expression A ∈ A is always false. In this situation Russell’s 
paradox cannot happen. 
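
Examples 1 to 3 describe sets by roster, by defining predicate, and by recursion. The sketch below mirrors them with Python sets. It assumes that the predicate of Example 1 is read as 3 ≤ x < 10 (so that 3 belongs to the set, as in the roster), uses `range(100)` as a finite stand-in for the natural numbers, and introduces the helper name `evens_up_to` purely for illustration.

```python
# Example 1: roster versus defining predicate.
roster = {3, 4, 5, 6, 7, 8, 9}
by_predicate = {x for x in range(100) if 3 <= x < 10}
same_set = roster == by_predicate

# Example 2: a set whose members are the number 5 and the set {x, y, z}.
# frozenset is used because members of a Python set must be hashable.
A = {5, frozenset({"x", "y", "z"})}
five_in_A = 5 in A
x_in_A = "x" in A          # "x" is a member of a member, not of A itself

# Example 3: recursive description of the even natural numbers,
# truncated at a bound so that the construction terminates.
def evens_up_to(limit):
    members = set()
    n = 0                  # basic object: 0 is in E
    while n <= limit:
        members.add(n)
        n += 2             # recursion rule: if n is in E, so is n + 2
    return members

print(same_set, five_in_A, x_in_A, sorted(evens_up_to(10)))
```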


1.2.2 SET OPERATIONS 
Definitions: 

The intersection of sets A and B is the set A ∩ B = { x | (x ∈ A) ∧ (x ∈ B) }. More 
generally, the intersection of any family of sets is the set of objects that are members of 
every set in the family. The notation 

⋂_{i∈I} A_i = { x | x ∈ A_i for all i ∈ I } 

is used for the intersection of the family of sets A_i indexed by the set I. 

Two sets A and B are disjoint if A ∩ B = ∅. 

A collection of sets { A_i | i ∈ I } is disjoint if ⋂_{i∈I} A_i = ∅. 

A collection of sets is pairwise disjoint (or mutually disjoint) if every pair of sets 
in the collection are disjoint. 

The union of sets A and B is the set A ∪ B = { x | (x ∈ A) ∨ (x ∈ B) }. More generally, 
the union of a family of sets is the set of objects that are members of at least one set in 
the family. The notation 

⋃_{i∈I} A_i = { x | x ∈ A_i for some i ∈ I } 

is used for the union of the family of sets A_i indexed by the set I. 

A partition of a set S is a pairwise disjoint family P = {A_i} of nonempty subsets 
whose union is S. 

The partition P_2 = {B_i} of a set S is a refinement of the partition P_1 = {A_j} of the 
same set if for every subset B_i ∈ P_2 there is a subset A_j ∈ P_1 such that B_i ⊆ A_j. 

The complement of the set A is the set Ā = U − A = { x | x ∉ A } containing every 
object not in A, where the context provides that the objects range over some specific 
universal domain U. (The notation A′ or A^c is sometimes used instead of Ā.) 

The set difference is the set A − B = A ∩ B̄ = { x | (x ∈ A) ∧ (x ∉ B) }. The set 
difference is sometimes written A \ B. 

The symmetric difference of A and B is the set A ⊕ B = { x | (x ∈ A − B) ∨ 
(x ∈ B − A) }. This is sometimes written A Δ B. 

The Cartesian product A × B of two sets A and B is the set { (a, b) | (a ∈ A) ∧ (b ∈ B) }, 
which contains all ordered pairs whose first coordinate is from A and whose second 
coordinate is from B. The Cartesian product of A_1, ..., A_n is the set A_1 × A_2 × ... × A_n = 
∏_{i=1}^{n} A_i = { (a_1, a_2, ..., a_n) | (∀i)(a_i ∈ A_i) }, which contains all ordered n-tuples whose 
ith coordinate is from A_i. The Cartesian product A × A × ... × A is also written A^n. 
If S is any set, the Cartesian product of the collection of sets A_s, where s ∈ S, is the 
set ∏_{s∈S} A_s of all functions f: S → ⋃_{s∈S} A_s such that f(s) ∈ A_s for all s ∈ S. 

The power set of A is the set P(A) of all subsets of A. The alternative notation 2^A 
for P(A) emphasizes the fact that the power set has 2^n elements if A has n elements. 

A set expression is any expression built up from sets and set operations. 

A set equation (or set identity) is an equation whose left side and right side are both 
set expressions. 

A system of distinct representatives (SDR) for a collection of sets A_1, A_2, ..., A_n 
(some of which may be equal) is a set {a_1, a_2, ..., a_n} of n distinct elements such that 
a_i ∈ A_i for i = 1, 2, ..., n. 

A Venn diagram is a family of n simple closed curves (typically circles or ellipses) 
arranged in the plane so that all possible intersections of the interiors are nonempty 
and connected. (John Venn, 1834-1923) 

A Venn diagram is simple if at most two curves intersect at any point of the plane. 

A Venn diagram is reducible if there is a sequence of curves whose iterative removal 
leaves a Venn diagram at each step. 

A membership table is a table used to calculate whether an object lies in the set 
described by a set expression, based on its membership in the sets mentioned by the 
expression. 
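
Most of the operations defined above have direct counterparts for finite sets in Python. The following sketch assumes the small sets `A`, `B`, and universal domain `U` shown; `power_set` and `is_partition` are illustrative helper names, not handbook notation.

```python
from itertools import combinations, product

A = {1, 2}
B = {2, 3}
U = {1, 2, 3, 4}            # hypothetical universal domain

intersection = A & B        # members of both A and B
union = A | B               # members of at least one of A and B
difference = A - B          # members of A that are not in B
symmetric_difference = A ^ B
complement_A = U - A        # complement relative to U
cartesian = set(product(A, B))   # ordered pairs (a, b)

def power_set(s):
    """All subsets of s, each as a frozenset."""
    items = list(s)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_partition(blocks, s):
    """True if blocks are nonempty, pairwise disjoint, and have union s."""
    blocks = [set(b) for b in blocks]
    nonempty = all(len(b) > 0 for b in blocks)
    pairwise = all(a.isdisjoint(b) for a, b in combinations(blocks, 2))
    covers = set().union(*blocks) == set(s)
    return nonempty and pairwise and covers

print(intersection, union, difference, symmetric_difference, complement_A)
print(sorted(cartesian), len(power_set(A)))
print(is_partition([{0, 3}, {1, 4}, {2, 5}], range(6)))
```

The power set of a set with n elements has 2^n members, which the `len(power_set(A))` call illustrates for n = 2.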

Facts: 

1. If a collection of sets is pairwise disjoint, then the collection is disjoint. The converse 
is false. 



2. The following figure illustrates Venn diagrams for two and three sets. 

3. The following figure gives the Venn diagrams for sets constructed using various set 
operations. 

[Figure: Venn diagrams for set operations; recoverable panel labels: A − B, (A ∩ B) − C.] 

4. Intuition regarding set identities can be gleaned from Venn diagrams, but it can be 
misleading to use Venn diagrams when proving theorems unless great care is taken to 
make sure that the diagrams are sufficiently general to illustrate all possible cases. 

5. Venn diagrams are often used as an aid to inclusion/exclusion counting. (See §2.4.) 

6. Venn gave examples of Venn diagrams with four ellipses and asserted that no Venn 
diagram could be constructed with five ellipses. 

7. Peter Hamburger and Raymond Pippert (1996) constructed a simple, reducible Venn 
diagram with five congruent ellipses. (Two ellipses are congruent if they are the exact 
same size and shape, and differ only by their placement in the plane.) 

8. Many of the logical identities given in §1.1.2 correspond to set identities, given in 
the following table. 


name                     rule 
Commutative laws         A ∩ B = B ∩ A          A ∪ B = B ∪ A 
Associative laws         A ∩ (B ∩ C) = (A ∩ B) ∩ C 
                         A ∪ (B ∪ C) = (A ∪ B) ∪ C 
Distributive laws        A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C) 
                         A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C) 
DeMorgan’s laws          (A ∩ B)^c = A^c ∪ B^c   (A ∪ B)^c = A^c ∩ B^c 
Complement laws          A ∩ A^c = ∅            A ∪ A^c = U 
Double complement law    (A^c)^c = A 
Idempotent laws          A ∩ A = A              A ∪ A = A 
Absorption laws          A ∩ (A ∪ B) = A        A ∪ (A ∩ B) = A 
Dominance laws           A ∩ ∅ = ∅              A ∪ U = U 
Identity laws            A ∪ ∅ = A              A ∩ U = A 


9. In a computer, a subset of a relatively small universal domain can be represented by 
a bit string. Each bit location corresponds to a specific object of the universal domain, 
and the bit value indicates the presence (1) or absence (0) of that object in the subset. 

10. In a computer, a subset of a relatively large ordered datatype or universal domain 
can be represented by a binary search tree. 

11. For any two finite sets A and B, |A ∪ B| = |A| + |B| − |A ∩ B| (inclusion/exclusion 
principle). (See §2.3.) 

12. Set identities can be proved by any of the following: 

• a containment proof: show that the left side is a subset of the right side and the 

right side is a subset of the left side; 

• a membership table: construct the analogue of the truth table for each side of 

the equation; 

• using other set identities. 

13. For all sets A, |A| < |P(A)|. 

14. Hall’s theorem: A collection of sets A_1, A_2, ..., A_n has a system of distinct 
representatives if and only if for all k = 1, ..., n every subcollection of k sets 
A_{i_1}, A_{i_2}, ..., A_{i_k} satisfies |A_{i_1} ∪ A_{i_2} ∪ ... ∪ A_{i_k}| ≥ k. 

15. If a collection of sets A_1, A_2, ..., A_n has a system of distinct representatives and if 
an integer m has the property that |A_i| ≥ m for each i, then: 

• if m ≥ n there are at least m!/(m − n)! systems of distinct representatives; 

• if m < n there are at least m! systems of distinct representatives. 

16. Systems of distinct representatives can be phrased in terms of 0-1 matrices and 
graphs. See §6.6.1, §8.12, and §10.4.3. 
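
The bit-string representation of Fact 9 and the counting rule of Fact 11 can be sketched with Python integers used as bit strings. The universal domain `UNIVERSE` and the helper names `to_bits`, `from_bits`, and `size` are assumptions made for this illustration.

```python
UNIVERSE = ["a", "b", "c", "d", "e"]   # hypothetical small universal domain
INDEX = {obj: i for i, obj in enumerate(UNIVERSE)}
FULL = (1 << len(UNIVERSE)) - 1        # bit string of the whole domain

def to_bits(subset):
    """Encode a subset as an integer: bit i is 1 if UNIVERSE[i] is present."""
    bits = 0
    for obj in subset:
        bits |= 1 << INDEX[obj]
    return bits

def from_bits(bits):
    """Decode an integer bit string back into a subset."""
    return {obj for obj, i in INDEX.items() if (bits >> i) & 1}

def size(bits):
    """Cardinality of the encoded subset: the number of 1 bits."""
    return bin(bits).count("1")

a = to_bits({"a", "b", "c"})
b = to_bits({"c", "d"})

union = a | b             # bitwise OR is set union
intersection = a & b      # bitwise AND is set intersection
complement_a = FULL & ~a  # complement within the universal domain

# Fact 11: |A u B| = |A| + |B| - |A n B|
inclusion_exclusion_ok = size(union) == size(a) + size(b) - size(intersection)

print(from_bits(union), from_bits(intersection), from_bits(complement_a))
print(inclusion_exclusion_ok)
```

With this encoding, union, intersection, and complement each take a single machine operation when the domain fits in a word, which is the usual reason for choosing it for small domains.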


Examples: 

1. {1, 2} ∩ {2, 3} = {2}. 

2. The collection of sets {1, 2}, {4, 5}, {6, 7, 8} is pairwise disjoint, and hence disjoint. 

3. The collection of sets {1, 2}, {2, 3}, {1, 3} is disjoint, but not pairwise disjoint. 

4. {1, 2} ∪ {2, 3} = {1, 2, 3}. 

5. Suppose that for every positive integer n, [j mod n] = { k ∈ Z | k mod n = j }, for 
j = 0, 1, ..., n − 1. (See §1.3.1.) Then { [0 mod 3], [1 mod 3], [2 mod 3] } is a partition 
of the integers. Moreover, { [0 mod 6], [1 mod 6], ..., [5 mod 6] } is a refinement of 
this partition. 

6. Within the context of Z as universal domain, the complement of the set of positive 
integers is the set consisting of the negative integers and 0. 

7. {1, 2} − {2, 3} = {1}. 

8. {1, 2} × {2, 3} = {(1, 2), (1, 3), (2, 2), (2, 3)}. 

9. P({1, 2}) = {∅, {1}, {2}, {1, 2}}. 

10. If L is a line in the plane, and if for each x ∈ L, C_x is the circle of radius 1 centered 
at point x, then ⋃_{x∈L} C_x is an infinite strip of width 2, and ⋂_{x∈L} C_x = ∅. 

11. The five-fold Cartesian product {0, 1}^5 contains 32 different 5-tuples, including, 
for instance, (0, 0, 1, 0, 1). 

12. The set identity (A ∩ B)^c = A^c ∪ B^c is verified by the following membership table. 
Begin by listing the possibilities for elements being in or not being in the sets A and B, 
using 1 to mean “is an element of” and 0 to mean “is not an element of”. Proceed to 
find the element values for each combination of sets. The two sides of the equation are 
the same since the columns for (A ∩ B)^c and A^c ∪ B^c are identical: 

A  B    A ∩ B   (A ∩ B)^c   A^c   B^c   A^c ∪ B^c 
1  1      1         0        0     0        0 
1  0      0         1        0     1        1 
0  1      0         1        1     0        1 
0  0      0         1        1     1        1 

13. The collection of sets A_1 = {1, 2}, A_2 = {2, 3}, A_3 = {1, 3, 4} has systems of 
distinct representatives, for example {1, 2, 3} and {2, 3, 4}. 

14. The collection of sets A_1 = {1, 2}, A_2 = {1, 3}, A_3 = {2, 3}, A_4 = {1, 2, 3}, A_5 = 
{2, 3, 4} does not have a system of distinct representatives since |A_1 ∪ A_2 ∪ A_3 ∪ A_4| < 4. 
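
Examples 13 and 14 can be reproduced by checking Hall's condition on every subcollection and by searching for a system of distinct representatives directly. The sketch below is a brute-force illustration suited only to small collections; the function names `satisfies_hall` and `find_sdr` are hypothetical, and the backtracking search is one simple way to look for an SDR, not a method prescribed by the handbook.

```python
from itertools import combinations

def satisfies_hall(sets):
    """Check that every k of the sets have a union with at least k elements."""
    for k in range(1, len(sets) + 1):
        for chosen in combinations(sets, k):
            if len(set().union(*chosen)) < k:
                return False
    return True

def find_sdr(sets):
    """Return representatives [a_1, ..., a_n] with a_i in sets[i], or None."""
    def extend(i, used):
        if i == len(sets):
            return []
        for x in sorted(sets[i]):
            if x not in used:
                rest = extend(i + 1, used | {x})
                if rest is not None:
                    return [x] + rest
        return None          # no unused representative works for sets[i]
    return extend(0, frozenset())

example_13 = [{1, 2}, {2, 3}, {1, 3, 4}]
example_14 = [{1, 2}, {1, 3}, {2, 3}, {1, 2, 3}, {2, 3, 4}]

print(satisfies_hall(example_13), find_sdr(example_13))
print(satisfies_hall(example_14), find_sdr(example_14))
```

For Example 14 the check fails at k = 4, because the first four sets together contain only the three elements 1, 2, and 3.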

1.2.3 INFINITE SETS 


Definitions: 

The Peano definition for the natural numbers N: 

• 0 is a natural number; 

• every natural number n has a successor s(n); 

• axioms: 

    ◦ 0 is not the successor of any natural number; 
    ◦ two different natural numbers cannot have the same successor; 
    ◦ if 0 ∈ T and if (∀n ∈ N)[(n ∈ T) → (s(n) ∈ T)], then T = N. 

(This axiomatization is named for Giuseppe Peano, 1858-1932.) 

A set is denumerable (or countably infinite) if it can be put in a one-to-one 
correspondence with the set of natural numbers {0, 1, 2, 3, ...}. (See §1.3.1.) 

A countable set is a set that is either finite or denumerable. All other sets are 
uncountable. 

The ordinal numbers (or ordinals) are defined recursively as follows: 

• the empty set is the ordinal number 0; 

• if α is an ordinal number, then so is the successor of α, written α^+ or α + 1, 
which is the set α ∪ {α}; 

• if β is any set of ordinals closed under the successor operation, then β is an 
ordinal, called a limit ordinal. 

The ordinal α is said to be less than the ordinal β, written α < β, if α ⊂ β (which is 
equivalent to α ∈ β). 

The sum of ordinals α and β, written α + β, is the ordinal corresponding to the 
well-ordered set given by all the elements of α in order, followed by all the elements of β 
(viewed as being disjoint from α) in order. (See Fact 26 and §1.4.3.) 

The product of ordinals α and β, written α · β, is the ordinal equal to the Cartesian 
product α × β with ordering (a_1, b_1) < (a_2, b_2) whenever b_1 < b_2, or b_1 = b_2 and a_1 < a_2 
(this is reverse lexicographic order). 

Two sets have the same cardinality (or are equinumerous) if they can be put into 
one-to-one correspondence (§1.3.1). When the equivalence relation “equinumerous” is 
used on all sets (see §1.4.2), the sets in each equivalence class have the same cardinal 
number. The cardinal number of a set A is written |A|. It can also be regarded as the 
smallest ordinal number among all those ordinal numbers with the same cardinality. 

An order relation can be defined on cardinal numbers of sets by the rule |A| ≤ |B| if 
there is a one-to-one function f: A → B. If |A| ≤ |B| and |A| ≠ |B|, write |A| < |B|. 

The sum of cardinal numbers a and b, written a + b, is the cardinal number of the 
union of two disjoint sets A and B such that |A| = a and |B| = b. 

The product of cardinal numbers a and b, written ab, is the cardinal number of the 
Cartesian product of two sets A and B such that |A| = a and |B| = b. 

Exponentiation of cardinal numbers, written a^b, is the cardinality of the set A^B of 
all functions from B to A, where |A| = a and |B| = b. 

Facts: 

1. Axiom 3 in the Peano definition of the natural numbers is the principle of mathe- 
matical induction. (See §1.5.6.) 

2. The finite cardinal numbers are written 0, 1, 2, 3, .... 

3. The cardinal number of any finite set with n elements is n. 

4. The first infinite cardinal numbers are written ℵ_0, ℵ_1, ℵ_2, ..., ℵ_ω, .... 

5. For each ordinal α, there is a cardinal number ℵ_α. 

6. The cardinal number of any denumerable set, such as N, Z, and Q, is ℵ_0. 

7. The cardinal number of P(N), R, and C is denoted c (standing for the continuum). 

8. The set of algebraic numbers (all solutions of polynomials with integer coefficients) 
is denumerable. 

9. The set R is uncountable (proved by Georg Cantor in the late 19th century, using a 
diagonal argument). (See §1.5.7.) 

10. Every subset of a countable set is countable. 

11. The countable union of countable sets is countable. 

12. Every set containing an uncountable subset is uncountable. 

13. The continuum problem, posed by Georg Cantor (1845-1918) and restated by 
David Hilbert (1862-1943) in 1900, is the problem of determining the cardinality, |R|, 
of the real numbers. 

14. The continuum hypothesis is the assertion that |R| = ℵ_1, the first cardinal 
number larger than ℵ_0. Equivalently, 2^{ℵ_0} = ℵ_1. (See Fact 35.) Kurt Gödel (1906-1978) 
proved in 1938 that the continuum hypothesis is consistent with various other axioms of 
set theory. Paul Cohen (1934-2007) demonstrated in 1963 that the continuum hypothesis 
cannot be proved from those other axioms; i.e., it is independent of the other axioms of 
set theory. 

15. The generalized continuum hypothesis is the assertion that 2^{ℵ_α} = ℵ_{α+1} for 
all ordinals α. That is, for infinite sets S there is no cardinal number strictly between |S| 
and |P(S)|. 

16. The generalized continuum hypothesis is consistent with and independent of the 
usual axioms of set theory. 

17. There is no largest cardinal number. 

18. |A| < |P(A)| for all sets A. 

19. Schröder-Bernstein theorem: If |A| ≤ |B| and |B| ≤ |A|, then |A| = |B|. (This is 
also called the Cantor-Schröder-Bernstein theorem.) 

20. The ordinal number 1 = 0^+ = {∅} = {0}, the ordinal number 2 = 1^+ = {0, 1}, etc. 
In general, for finite ordinals, n + 1 = n^+ = {0, 1, 2, ..., n}. 

21. The first limit ordinal is ω = {0, 1, 2, ...}. Then ω + 1 = ω^+ = ω ∪ {ω} = 
{0, 1, 2, ..., ω}, and so on. The next limit ordinal is ω + ω = {0, 1, 2, ..., ω, ω + 1, ω + 
2, ...}, also denoted ω · 2. The process never stops, because the next limit ordinal can 
always be formed as the union of the infinite process that has gone before. 

22. Limit ordinals have no immediate predecessors. 

23. The first ordinal that, viewed as a set, is not countable, is denoted ω_1. 

24. For ordinals the following are equivalent: α < β, α ∈ β, α ⊂ β. 

25. Every set of ordinal numbers has a smallest element; i.e., the ordinals are well- 
ordered. (See §1.4.3.) 

26. Ordinal numbers correspond to well-ordered sets (§1.4.3). Two well-ordered sets 
represent the same ordinal if they can be put into an order-preserving one-to-one cor- 
respondence. 

27. Addition and multiplication of ordinals are associative operations. 

28. Ordinal addition and multiplication for finite ordinals (those less than ω) are the 
same as ordinary addition and multiplication on the natural numbers. 

29. Addition of infinite ordinals is not commutative. (See Example 2.) 

30. Multiplication of infinite ordinals is not commutative. (See Example 3.) 

31. The ordinals 0 and 1 are identities for addition and multiplication, respectively. 

32. Multiplication of ordinals is distributive over addition on the left: α(β + γ) = 
αβ + αγ. It is not distributive on the right. 

33. In the definition of the cardinal number a^b, when a = 2, the set A can be taken 
to be A = {0, 1} and an element of A^B can be identified with a subset of B (namely, 
those elements of B sent to 1 by the function). Thus 2^{|B|} = |P(B)|, the cardinality of 
the power set of B. 

34. If a and b are cardinals, at least one of which is infinite, then a + b = a · b = the 
larger of a and b. 

35. c^{ℵ_0} = 2^{ℵ_0}. 

36. The usual rules for finite arithmetic continue to hold for infinite cardinal arithmetic 
(commutativity, associativity, distributivity, and rules for exponents). 
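
Fact 6 states that Z is denumerable, which means that some one-to-one correspondence with the natural numbers exists. The sketch below shows one standard choice, listing the integers as 0, −1, 1, −2, 2, ..., together with the Cantor pairing function, a well-known bijection from pairs of natural numbers to natural numbers. The function names are illustrative, and the checks only sample finitely many values.

```python
def nat_to_int(n):
    """Map 0, 1, 2, 3, 4, ... to 0, -1, 1, -2, 2, ..."""
    magnitude = (n + 1) // 2
    return -magnitude if n % 2 else magnitude

def int_to_nat(z):
    """Inverse of nat_to_int."""
    return 2 * z if z >= 0 else -2 * z - 1

def cantor_pair(x, y):
    """Cantor pairing function: a bijection from N x N to N."""
    return (x + y) * (x + y + 1) // 2 + y

first_integers = [nat_to_int(n) for n in range(7)]
round_trip_ok = all(int_to_nat(nat_to_int(n)) == n for n in range(1000))
codes = {cantor_pair(x, y) for x in range(30) for y in range(30)}
pairing_injective_on_sample = len(codes) == 30 * 30

print(first_integers, round_trip_ok, pairing_injective_on_sample)
```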


Examples: 

1. ω_1 > ω · 2, ω_1 > ω^2, ω_1 > ω^ω. 

2. 1 + ω = ω, but ω + 1 > ω. 

3. 2 · ω = ω, but ω · 2 > ω. 

4. ℵ_0 · ℵ_0 = ℵ_0 + ℵ_0 = ℵ_0. 
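
The infinite ordinals in these examples cannot be built explicitly, but the finite ordinals of Fact 20 can. The sketch below models them as nested `frozenset` values, with the successor of α formed as α ∪ {α}, and checks the three equivalent orderings of Fact 24 on small cases. The names `successor` and `ordinal` are illustrative.

```python
ZERO = frozenset()                      # the ordinal 0 is the empty set

def successor(alpha):
    """The successor ordinal: alpha together with alpha as a new element."""
    return alpha | frozenset([alpha])

def ordinal(n):
    """The finite ordinal n, built by applying successor n times to 0."""
    alpha = ZERO
    for _ in range(n):
        alpha = successor(alpha)
    return alpha

# Fact 20: n + 1 = {0, 1, ..., n}, so the ordinal n has exactly n elements.
three_is_set_of_smaller = ordinal(3) == frozenset(
    {ordinal(0), ordinal(1), ordinal(2)})
sizes_ok = all(len(ordinal(n)) == n for n in range(6))

# Fact 24: m < n, membership, and proper inclusion agree.
orders_agree = all(
    (m < n) == (ordinal(m) in ordinal(n)) == (ordinal(m) < ordinal(n))
    for m in range(6) for n in range(6))

print(three_is_set_of_smaller, sizes_ok, orders_agree)
```

For `frozenset` values the `<` operator tests proper inclusion, which is why it can stand in for α ⊂ β here.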


1.2.4 AXIOMS FOR SET THEORY 

Set theory can be viewed as an axiomatic system, with undefined terms “set” (the 
universe of discourse) and “is an element of” (a binary relation denoted ∈). 

Definitions: 

The Axiom of choice (AC) states: If A is any set whose elements are pairwise disjoint 
nonempty sets, then there exists a set X that has as its elements exactly one element 
from each set in A. 

The Zermelo-Fraenkel (ZF) axioms for set theory: (The axioms are stated 
informally.) 

• Extensionality (equality): Two sets with the same elements are equal. 

• Pairing: For every a and b, the set {a, b} exists. 

• Specification (subset): If A is a set and P(x) is a predicate with free variable x, 
then the subset of A exists that consists of those elements c ∈ A such that P(c) 
is true. (The specification axiom guarantees that the intersection of two sets 
exists.) 

• Union: The union of a set (i.e., the set of all the elements of its elements) exists. 
(The union axiom together with the pairing axiom implies the existence of the 
union of two sets.) 

• Power set: The power set (set of all subsets) of a set exists. 

• Empty set: The empty set exists. 

• Regularity (foundation): Every nonempty set contains a “foundational” element; 
that is, every nonempty set contains an element that is not an element of any 
other element in the set. (The regularity axiom prevents anomalies such as a 
set being an element of itself.) 

• Replacement: If f is a function defined on a set A, then the collection of images 
{ f(a) | a ∈ A } is a set. The replacement axiom (together with the union 
axiom) allows the formation of large sets by expanding each element of a set 
into a set. 

• Infinity: An infinite set, such as ω (§1.2.3), exists. 


Facts: 

1. The axiom of choice is consistent with and independent of the other axioms of set 
theory; it can be neither proved nor disproved from the other axioms of set theory. 

2. The axioms of ZF together with the axiom of choice are denoted ZFC. 

3. The following propositions are equivalent to the axiom of choice: 

• The well-ordering principle: Every set can be well-ordered; i.e., for every set A 
there exists a total ordering on A such that every subset of A contains a smallest 
element under this ordering. 

• Generalized axiom of choice (functional version): If A is any collection of 
nonempty sets, then there is a function f whose domain is A, such that f(X) ∈ X 
for all X ∈ A. 

• Zorn’s lemma: Every nonempty partially ordered set in which every chain (totally 
ordered subset) contains an upper bound (an element greater than all the other 
elements in the chain) has a maximal element (an element that is less than no 
other element). (§1.4.3.) 

• The Hausdorff maximal principle: Every chain in a partially ordered set is 
contained in a maximal chain (a chain that is not strictly contained in another 
chain). (§1.4.3.) 

• Trichotomy: Given any two sets A and B, either there is a one-to-one function 
from A to B, or there is a one-to-one function from B to A; i.e., either |A| ≤ |B| 
or |B| ≤ |A|. 


1.3 FUNCTIONS 

A function is a rule that associates to each object in one set an object in a second set 
(these sets are often sets of numbers). For instance, the expected population in future 
years, based on demographic models, is a function from calendar years to numbers. 
Encryption is a function from confidential information to apparent nonsense messages, 
and decryption is a function from apparent nonsense back to confidential information. 
Computer scientists and mathematicians are often concerned with developing methods 
to calculate particular functions quickly. 


1.3.1 BASIC TERMINOLOGY FOR FUNCTIONS

Definitions:

A function f from a set A to a set B, written f: A → B, is a rule that assigns to every
object a ∈ A exactly one element f(a) ∈ B. The set A is the domain of f; the set B
is the codomain of f; the element f(a) is the image of a or the value of f at a. A
function f is often identified with its graph { (a, b) | a ∈ A and b = f(a) } ⊆ A × B.

Note: The function f: A → B is sometimes represented by the “maps to” notation
x ↦ f(x) or by the variation x ↦ expr(x), where expr(x) is an expression in x. The
notation f(x) = expr(x) is a form of the “maps to” notation without the symbol ↦.

The rule defining a function f: A → B is called well-defined since to each a ∈ A there
is associated exactly one element of B.

If f: A → B and S ⊆ A, the image of the subset S under f is the set f(S) = { f(x) |
x ∈ S }.

If f: A → B and T ⊆ B, the pre-image or inverse image of the subset T under f is
the set f⁻¹(T) = { x | f(x) ∈ T }.

The image of a function f: A → B is the set f(A) = { f(x) | x ∈ A }.

The range of a function f: A → B is the image set f(A). (Some authors use “range”
as a synonym for “codomain”.)

A function f: A → B is one-to-one (1-1, injective, or a monomorphism) if distinct
elements of the domain are mapped to distinct images; i.e., f(a₁) ≠ f(a₂) whenever
a₁ ≠ a₂. An injection is an injective function.

A function f: A → B is onto (surjective, or an epimorphism) if every element of
the codomain B is the image of at least one element of A; i.e., if (∀b ∈ B)(∃a ∈ A)
[f(a) = b] is true. A surjection is a surjective function.

A function f: A → B is bijective (or a one-to-one correspondence) if it is both
injective and surjective; i.e., it is 1-1 and onto. A bijection is a bijective function.

If f: A → B and S ⊆ A, the restriction of f to S is the function f_S: S → B where
f_S(x) = f(x) for all x ∈ S. The function f is an extension of f_S. The restriction of f
to S is also written f|_S.

A partial function from a set A to a set B is a rule f that assigns to each element in a
subset of A exactly one element of B. The subset of A on which f is defined is the
domain of definition of f. In a context that includes partial functions, a rule that
applies to all of A is called a total function.

Given a 1-1 onto function f: A → B, the inverse function f⁻¹: B → A has the rule
that for each y ∈ B, f⁻¹(y) is the object x ∈ A such that f(x) = y.

If f: A → B and g: B → C, then the composition is the function g∘f: A → C defined
by the rule (g∘f)(x) = g(f(x)) for all x ∈ A. The function to the right of the raised
circle is applied first.

Note: Care must be taken since some sources define the composition (g∘f)(x) = f(g(x))
so that the order of application reads left to right.

If f: A → A, the iterated functions f^n: A → A (n ≥ 2) are defined recursively by the
rule f^n(x) = (f ∘ f^(n−1))(x), where f^1 = f.

A function f: A → A is idempotent if f ∘ f = f.

A function f: A → A is an involution if f ∘ f = ι_A, the identity function on A. (See
Example 1.)

A function whose domain is a Cartesian product A₁ × ⋯ × A_n is often regarded as
a function of n variables (also called a multivariate function), and the value of f at
(a₁, ..., a_n) is usually written f(a₁, ..., a_n).

An (n-ary) operation on a set A is a function f: A^n → A, where A^n = A × ⋯ × A
(with n factors in the product). A 1-ary operation is called monadic or unary, and a
2-ary operation is called binary.
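
The definitions above can be tried out on small finite functions. The following is a minimal sketch, assuming a function with finite domain is stored as a Python dict from domain elements to values; the helper names (is_injective, compose, and so on) are illustrative and do not come from the Handbook.

```python
def is_injective(f):
    """True if distinct domain elements have distinct images."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f, codomain):
    """True if every element of the codomain is the image of some element."""
    return set(f.values()) == set(codomain)


def is_bijective(f, codomain):
    """True if f is both 1-1 and onto the given codomain."""
    return is_injective(f) and is_surjective(f, codomain)


def compose(g, f):
    """Return g o f, i.e. x -> g(f(x)); the right-hand function f is applied first."""
    return {x: g[f[x]] for x in f}


def inverse(f):
    """Return the inverse of a 1-1 function f (defined on the image of f)."""
    if not is_injective(f):
        raise ValueError("only a 1-1 function has an inverse")
    return {y: x for x, y in f.items()}


def image(f, S):
    """f(S) = { f(x) | x in S }."""
    return {f[x] for x in S}


def preimage(f, T):
    """The pre-image of T: { x | f(x) in T }."""
    return {x for x in f if f[x] in T}


f = {1: "a", 2: "b", 3: "c"}       # bijective onto {"a", "b", "c"}
g = {"a": 10, "b": 20, "c": 10}    # not 1-1: "a" and "c" share an image
print(is_bijective(f, "abc"))      # f is 1-1 and onto
print(compose(g, f))               # g o f as a dict
print(compose(inverse(f), f))      # the identity function on {1, 2, 3}
```

Storing the graph as a dict mirrors the identification of a function with its set of ordered pairs; it only works for finite domains.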

Facts: 

1. The graph of a function f: A → B is a binary relation on A × B. (§1.4.1.)

2. The graph of a function f: A → B is a subset S of A × B such that for each a ∈ A
there is exactly one b ∈ B such that (a, b) ∈ S.

3. In general, two or more different objects in the domain of a function might be
assigned the same value in the codomain. If this occurs, the function is not 1-1.

4. If f: A → B is bijective, then: f ∘ f⁻¹ = ι_B (Example 1), f⁻¹ ∘ f = ι_A, f⁻¹ is
bijective, and (f⁻¹)⁻¹ = f.

5. Function composition is associative: (f∘g)∘h = f∘(g∘h), whenever h: A → B,
g: B → C, and f: C → D.

6. Function composition is not commutative; that is, f∘g ≠ g∘f in general. (See
Example 12.)

7. Set operations with functions: If f: A → B with S₁, S₂ ⊆ A and T₁, T₂ ⊆ B, then:

• f(S₁ ∪ S₂) = f(S₁) ∪ f(S₂);

• f(S₁ ∩ S₂) ⊆ f(S₁) ∩ f(S₂), with equality if f is injective;

• f(A − S₁) ⊇ f(A) − f(S₁), with equality if f is injective (if f is also onto, then
f(A) = B and this reads f(A − S₁) ⊇ B − f(S₁));

• f⁻¹(T₁ ∪ T₂) = f⁻¹(T₁) ∪ f⁻¹(T₂);

• f⁻¹(T₁ ∩ T₂) = f⁻¹(T₁) ∩ f⁻¹(T₂);

• the pre-image of a complement is the complement of the pre-image (i.e.,
f⁻¹(B − T₁) = A − f⁻¹(T₁));

• f⁻¹(f(S₁)) ⊇ S₁, with equality if f is injective;

• f(f⁻¹(T₁)) ⊆ T₁, with equality if f is surjective.

8. If f: A → B and g: B → C are both bijective, then (g ∘ f)⁻¹ = f⁻¹ ∘ g⁻¹.

9. If an operation * (such as addition) is defined on a set B, then that operation can be
extended to the set of all functions from a set A to B, by setting (f * g)(x) = f(x) * g(x).

10. Numbers of functions: If |A| = m and |B| = n, the numbers of different types of
functions f: A → B are given in the following list:

• all: n^m (§2.2.1)

• one-to-one: P(n, m) = n(n − 1)(n − 2)⋯(n − m + 1) if n ≥ m (§2.2.1)

• onto: Σ_{j=0}^{n} (−1)^j C(n, j) (n − j)^m if m ≥ n, where C(n, j) is a binomial
coefficient (§2.4.2)

• partial: (n + 1)^m (§2.3.2)
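
The counting formulas in Fact 10 can be checked by brute force for small m and n. The sketch below is illustrative only: it enumerates every function from an m-element set to an n-element set as a tuple of values, and the function names count_functions and onto_formula are assumptions of this example.

```python
from itertools import product
from math import comb, perm


def count_functions(m, n):
    """Count all, 1-1, onto, and partial functions from an m-set to an n-set."""
    B = list(range(n))
    total = injective = surjective = 0
    for values in product(B, repeat=m):     # one tuple per function
        total += 1
        if len(set(values)) == m:           # no two arguments share a value
            injective += 1
        if len(set(values)) == n:           # every element of B is hit
            surjective += 1
    # A partial function may also leave an argument undefined (None).
    partial = sum(1 for _ in product(B + [None], repeat=m))
    return total, injective, surjective, partial


def onto_formula(m, n):
    """Inclusion-exclusion count of onto functions from Fact 10."""
    return sum((-1) ** j * comb(n, j) * (n - j) ** m for j in range(n + 1))


m, n = 4, 3
total, injective, surjective, partial = count_functions(m, n)
print(total == n ** m)                   # all functions
print(injective == perm(n, m))           # P(n, m), which is 0 when m > n
print(surjective == onto_formula(m, n))  # onto functions
print(partial == (n + 1) ** m)           # partial functions
```

The enumeration grows like n^m, so this is only a sanity check for small sets, not a practical counting method.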

Examples: 

1. The following are some common functions:

• exponential function to base b (for b > 0, b ≠ 1): the function f: R → R⁺
where f(x) = b^x. (See the following figure.) (R⁺ is the set of positive real
numbers.)

• logarithm function with base b (for b > 0, b ≠ 1): the function log_b: R⁺ → R
that is the inverse of the exponential function to base b; that is,
log_b x = y if and only if b^y = x.

• common logarithm function: the function log₁₀: R⁺ → R (also written log)
that is the inverse of the exponential function to base 10; i.e., log₁₀ x = y when
10^y = x. (See the following figure.)

• binary logarithm function: the function log₂: R⁺ → R (also denoted log
or lg) that is the inverse of the exponential function to base 2; i.e., log₂ x = y when
2^y = x. (See the following figure.)

• natural logarithm function: the function ln: R⁺ → R that is the inverse of
the exponential function to base e; i.e., ln(x) = y when e^y = x, where e =
lim_{n→∞} (1 + 1/n)^n ≈ 2.718281828459. (See the following figure.)

• iterated logarithm function: the function log*: R⁺ → {0, 1, 2, ...} where log* x is the
smallest nonnegative integer k such that log^(k) x ≤ 1; the function log^(k) is
defined recursively by

    log^(k) x = x                     if k = 0
                log(log^(k−1) x)      if log^(k−1) x is defined and positive
                undefined             otherwise.

• mod function: for a given positive integer n, the function f: Z → N defined by
the rule f(k) = k mod n, where k mod n is the remainder when the division
algorithm is used to divide k by n. (See §4.1.2.)

• identity function on a set A: the function ι_A: A → A such that ι_A(x) = x for
all x ∈ A.

• characteristic function of S: for S ⊆ A, the function χ_S: A → {0, 1} given by
χ_S(x) = 1 if x ∈ S and χ_S(x) = 0 if x ∉ S.

• projection function: the function π_j: A₁ × ⋯ × A_n → A_j (j = 1, 2, ..., n)
such that π_j(a₁, ..., a_n) = a_j.

• permutation: a function f: A → A that is 1-1 and onto.

• floor function (sometimes referred to, especially in number theory, as the greatest
integer function): the function ⌊ ⌋: R → Z where ⌊x⌋ = the greatest
integer less than or equal to x. The floor of x is also written [x]. (See the
following figure.) Thus ⌊π⌋ = 3, ⌊6⌋ = 6, and ⌊−0.2⌋ = −1.

• ceiling function: the function ⌈ ⌉: R → Z where ⌈x⌉ = the smallest integer
greater than or equal to x. (See the following figure.) Thus ⌈π⌉ = 4, ⌈6⌉ = 6,
and ⌈−0.2⌉ = 0.




2. The floor and ceiling functions are total functions from the reals R to the integers Z.
They are onto, but not one-to-one.

3. Properties of the floor and ceiling functions (m and n represent arbitrary integers):

• ⌊x⌋ = n if and only if n ≤ x < n + 1 if and only if x − 1 < n ≤ x;

• ⌈x⌉ = n if and only if n − 1 < x ≤ n if and only if x ≤ n < x + 1;

• ⌊x⌋ < n if and only if x < n; ⌈x⌉ ≤ n if and only if x ≤ n;

• n ≤ ⌊x⌋ if and only if n ≤ x; n < ⌈x⌉ if and only if n < x;

• x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1;

• ⌊x⌋ = x if and only if x is an integer;

• ⌈x⌉ = x if and only if x is an integer;

• ⌊−x⌋ = −⌈x⌉; ⌈−x⌉ = −⌊x⌋;

• ⌊x + n⌋ = ⌊x⌋ + n; ⌈x + n⌉ = ⌈x⌉ + n;

• the interval [x₁, x₂] contains ⌊x₂⌋ − ⌈x₁⌉ + 1 integers;

• the interval [x₁, x₂) contains ⌈x₂⌉ − ⌈x₁⌉ integers;

• the interval (x₁, x₂] contains ⌊x₂⌋ − ⌊x₁⌋ integers;

• the interval (x₁, x₂) contains ⌈x₂⌉ − ⌊x₁⌋ − 1 integers;

• if f(x) is a continuous, monotonically increasing function, and whenever f(x) is
an integer, x is also an integer, then ⌊f(x)⌋ = ⌊f(⌊x⌋)⌋ and ⌈f(x)⌉ = ⌈f(⌈x⌉)⌉;

• if n > 0, then ⌊(x + m)/n⌋ = ⌊(⌊x⌋ + m)/n⌋ and ⌈(x + m)/n⌉ = ⌈(⌈x⌉ + m)/n⌉
(a special case of the preceding fact);

• if m > 0, then ⌊mx⌋ = ⌊x⌋ + ⌊x + 1/m⌋ + ⋯ + ⌊x + (m − 1)/m⌋.

4. The logarithm function log_b x is bijective from the positive reals R⁺ to the reals R.

5. The logarithm function x ↦ log_b x is the inverse of the function x ↦ b^x, if the
codomain of x ↦ b^x is the set of positive real numbers. If the domain and codomain are
considered to be R, then x ↦ log_b x is only a partial function, because the logarithm of
a nonpositive number is not defined.


6. All logarithm functions are related according to the following change of base formula:

    log_b x = (log_a x) / (log_a b).

7. log* 2 = 1, log* 4 = 2, log* 16 = 3, log* 65536 = 4, log* 2^65536 = 5.


8. The diagrams in the following figure illustrate a function that is onto but not 1-1 
and a function that is 1-1 but not onto. 



[Figure: two mapping diagrams, labeled “onto, not 1-1” and “1-1, not onto”.]


9. If the domain and codomain are considered to be the nonnegative reals, then the
function x ↦ x² is a bijection, and x ↦ √x is its inverse.

10. If the codomain is considered to be the subset of complex numbers with polar
coordinate 0 ≤ θ < π, then x ↦ √x can be regarded as a total function.

11. Division of real numbers is a multivariate function from R × (R − {0}) to R,
given by the rule f(x, y) = x/y. Similarly, addition, subtraction, and multiplication are
functions from R × R to R.

12. If f(x) = x² and g(x) = x + 1, then (f ∘ g)(x) = (x + 1)² and (g ∘ f)(x) = x² + 1.
(Therefore, composition of functions is not commutative.)

13. Collatz conjecture: If f: {1, 2, 3, ...} → {1, 2, 3, ...} is defined by the rule f(n) = n/2
if n is even and f(n) = 3n + 1 if n is odd, then for each positive integer m there is a
positive integer k such that the iterated function f^k(m) = 1. It is not known whether
this conjecture is true.
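
Two of the examples above are defined by iterating a function, and both are easy to experiment with. The following sketch assumes base-2 logarithms for log* (which matches the values listed in Example 7) and uses a step cap for the Collatz iteration, since termination is only conjectured; the names log_star and collatz_steps are illustrative.

```python
import math


def log_star(x):
    """Smallest k >= 0 such that applying log2 to x a total of k times gives a value <= 1."""
    k = 0
    while x > 1:
        x = math.log2(x)   # math.log2 also accepts very large Python integers
        k += 1
    return k


def collatz_steps(m, max_steps=10_000):
    """Smallest k >= 1 with f^k(m) = 1 for the Collatz map f, or None if the cap is hit."""
    n = m
    for k in range(1, max_steps + 1):
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        if n == 1:
            return k
    return None  # inconclusive: says nothing about whether 1 is ever reached


print([log_star(x) for x in (2, 4, 16, 65536, 2 ** 65536)])
print(collatz_steps(6), collatz_steps(27))
```

A return value of None from collatz_steps would only mean that the cap was too small; it would not be a counterexample to the conjecture.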


1.3.2 COMPUTATIONAL REPRESENTATION 

A given function may be described by several different rules. These rules can then 
be used to evaluate specific values of the function. There is often a large difference 
in the time required to compute the value of a function using different computational 
rules. The speed usually depends on the representation of the data as well as on the 
computational process. 
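
As a small illustration of how much the choice of rule can matter, the sketch below evaluates the Fibonacci function (its recurrence appears in Example 3 of this subsection) by two different rules: direct use of the recurrence, and a loop that keeps only the last two values. The function names and the call counter are assumptions of this example.

```python
def fib_recursive(n, counter):
    """Evaluate the recurrence directly; counter[0] records how many calls were made."""
    counter[0] += 1
    if n < 2:
        return n
    return fib_recursive(n - 1, counter) + fib_recursive(n - 2, counter)


def fib_iterative(n):
    """Evaluate the same function with n additions by carrying two consecutive terms."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


calls = [0]
value = fib_recursive(20, calls)
print(value == fib_iterative(20))   # both rules describe the same function
print(calls[0])                     # number of recursive calls used for n = 20
```

Both rules compute the same values, but the number of recursive calls grows exponentially with n while the loop performs n additions.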

Definitions: 

A (computational) representation of a function is a way to calculate its values.

A closed formula for a function value f(x) is an algebraic expression in the argument x.

A table of values for a function f: A → B with finite domain A is any explicit
representation of the set { (a, f(a)) ∈ A × B | a ∈ A }.

An infinite sequence in a set S is a function from the natural numbers {0, 1, 2, ...}
to the set S. It is commonly represented as a list x₀, x₁, x₂, ... such that each x_j ∈ S.
Sequences are often permitted to start at the index 1 or elsewhere, rather than 0.

A finite sequence in a set S is a function from {1, 2, ..., n} to the set S. It is commonly
represented as a list x₁, x₂, ..., x_n such that each x_j ∈ S. Finite sequences are often
permitted to start at the index 0 (or at some other value of the index), rather than at
the index 1.


A value of a sequence is also called an entry , an item, or a term. 

A string is a representation of a sequence as a list in which the successive entries are 
juxtaposed without intervening punctuation or extra spacing. 


A recursive definition of a function f with domain S is given in two parts: there is
a set of base values (or initial values) B on which the value of f is specified, and
there is a rule for calculating f(x) for every x ∈ S − B in terms of previously defined
values of f.


Ackermann’s function (Wilhelm Ackermann, 1896-1962) is defined recursively by

    A(x, y, z) = x + y                          if z = 0
                 0                              if y = 0, z = 1
                 1                              if y = 0, z = 2
                 x                              if y = 0, z > 2
                 A(x, A(x, y − 1, z), z − 1)    if y, z > 0.

An alternative version of Ackermann’s function, with two variables, is defined recursively
by

    A(m, n) = n + 1                        if m = 0
              A(m − 1, 1)                  if m > 0, n = 0
              A(m − 1, A(m, n − 1))        if m, n > 0.

Another alternative version of Ackermann’s function is defined recursively by the rule
A(n) = A_n(n), where A₁(n) = 2n and A_m(n) = A_{m−1}^(n)(1) if m ≥ 2 (the n-fold
iterate of A_{m−1} applied to 1).


The (input-independent) halting function maps computer programs to the set {0,1}, 
with value 1 if the program always halts, regardless of input, and 0 otherwise. 


Facts: 

1. If f: N → R is recursively defined, the set of base values is frequently the set
{f(0), f(1), ..., f(j)} and there is a rule for calculating f(n) for every n > j in terms
of f(i) for one or more i < n.

2. There are functions whose values cannot be computed. (See Example 5.) 

3. There are recursively defined functions that cannot be represented by a closed for- 
mula. 

4. It is possible to find closed formulas for the values of some functions defined recur- 
sively. See Chapter 3 for more information. 

5. Computer software developers often represent a table as a binary search tree (§17.2). 

6. In Ackermann’s function of three variables A(x, y, z), as the variable z ranges from 0
to 3, A(x, y, z) is the sum of x and y, the product of x and y, x raised to the exponent y,
and the iterated exponentiation of x y times. That is, A(x, y, 0) = x + y, A(x, y, 1) = xy,
A(x, y, 2) = x^y, A(x, y, 3) = x^(x^(⋯^x)) (y x’s in the exponent).

7. The version of Ackermann’s function with two variables, A(m, n), has the following
properties: A(1, n) = n + 2, A(2, n) = 2n + 3, A(3, n) = 2^(n+3) − 3.

8. A(m, n) is an example of a well-defined total function that is computable, but not 
primitive recursive. (See §16.) 
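
The two-variable recursion can be transcribed almost literally, which makes the properties in Fact 7 easy to spot-check for small arguments. This is a minimal sketch, assuming memoization and a raised recursion limit purely for convenience; values with m ≥ 4 are far out of reach of this approach.

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(20_000)  # the recursion nests deeply even for small inputs


@lru_cache(maxsize=None)
def ackermann(m, n):
    """Two-variable Ackermann function A(m, n), following the recursive definition."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


for n in range(6):
    print(n, ackermann(1, n), ackermann(2, n), ackermann(3, n))
```

The printed rows can be compared against n + 2, 2n + 3, and 2^(n+3) − 3.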

Examples: 

1. The function that maps each month to its ordinal position is represented by the 
table 

{(Jan, 1), (Feb, 2),..., (Dec, 12)}. 

2. The function defined by the recurrence relation

    f(0) = 0; f(n) = f(n − 1) + 2n − 1 for n ≥ 1

has the closed form f(x) = x².

3. The function defined by the recurrence relation

    f(0) = 0, f(1) = 1; f(n) = f(n − 1) + f(n − 2) for n ≥ 2

generates the Fibonacci sequence 0, 1, 1, 2, 3, 5, 8, ... (see §3.1.2) and has the closed form

    f(n) = [(1 + √5)^n − (1 − √5)^n] / (2^n √5).

4. The factorial function n! is recursively defined by the rules

    0! = 1; n! = n · (n − 1)! for n ≥ 1.

It has no known closed formula in terms of elementary functions.

5. It is impossible to construct an algorithm to compute the halting function. 

6. The halting function from the Cartesian product of the set of computer programs
and the set of strings to {0, 1} whose value is 1 if the program halts when given that
string as input and 0 if the program does not halt when given that string as input is
noncomputable.

7. The following is not a well-defined function f: {1, 2, 3, ...} → {1, 2, 3, ...}

    f(n) = 1               if n = 1
           1 + f(n/2)      if n is even
           f(3n − 1)       if n is odd, n > 1

since evaluating f(5) leads to the contradiction f(5) = f(5) + 3.

8. It is not known whether the following is a well-defined function f: {1, 2, 3, ...} →
{1, 2, 3, ...}

    f(n) = 1               if n = 1
           1 + f(n/2)      if n is even
           f(3n + 1)       if n is odd, n > 1.

(See §1.3.1, Example 13.) 
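
The difference between Examples 7 and 8 can be seen by unfolding the recursion mechanically. The sketch below is an illustration under stated assumptions: it follows the chain of arguments, counts the accumulated “1 +” terms, and stops if an argument repeats. The function name unfold and the returned tuples are hypothetical conventions of this example.

```python
def unfold(n, odd_rule, max_steps=10_000):
    """Unfold f(n) = 1 (n = 1), 1 + f(n/2) (n even), f(odd_rule(n)) (n odd, n > 1).

    Returns ("value", v) if the unfolding reaches n = 1,
            ("cycle", a, c) if it derives f(a) = f(a) + c for a repeated argument a,
            ("unknown",) if max_steps is exhausted.
    """
    added = 0    # number of "1 +" terms accumulated so far
    seen = {}    # argument -> value of `added` when it was first reached
    for _ in range(max_steps):
        if n == 1:
            return ("value", added + 1)
        if n in seen:
            return ("cycle", n, added - seen[n])
        seen[n] = added
        if n % 2 == 0:
            added += 1
            n //= 2
        else:
            n = odd_rule(n)
    return ("unknown",)


print(unfold(5, lambda n: 3 * n - 1))   # rule of Example 7
print(unfold(5, lambda n: 3 * n + 1))   # rule of Example 8
```

For the rule of Example 7 the chain 5, 14, 7, 20, 10, 5 returns to its starting argument after three halving steps, which is the contradiction f(5) = f(5) + 3. A finite search like this can exhibit such a cycle but cannot establish that the rule of Example 8 is well-defined for every n.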


1.3.3 ASYMPTOTIC BEHAVIOR 

The asymptotic growth of functions is commonly described with various special pieces 
of notation and is regularly used in the analysis of computer algorithms to estimate the 
length of time the algorithms take to run and the amount of computer memory they 
require.

Definitions:

A function f: R → R or f: N → R is bounded if there is a constant k such that
|f(x)| ≤ k for all x in the domain of f.

For functions f, g: R → R or f, g: N → R (sequences of real numbers) the following are
used to compare their growth rates:

• f is big-oh of g (g dominates f) if there exist constants C and k such that
|f(x)| ≤ C|g(x)| for all x > k.

Notation: f is O(g), f(x) ∈ O(g(x)), f ∈ O(g), f = O(g).

• f is little-oh of g if lim_{x→∞} |f(x)/g(x)| = 0; i.e., for every C > 0 there is a constant k
such that |f(x)| ≤ C|g(x)| for all x > k.

Notation: f is o(g), f(x) ∈ o(g(x)), f ∈ o(g), f = o(g).

• f is big omega of g if there are constants C and k such that |g(x)| ≤ C|f(x)|
for all x > k.

Notation: f is Ω(g), f(x) ∈ Ω(g(x)), f ∈ Ω(g), f = Ω(g).

• f is little omega of g if lim_{x→∞} |g(x)/f(x)| = 0.

Notation: f is ω(g), f(x) ∈ ω(g(x)), f ∈ ω(g), f = ω(g).

• f is theta of g if there are positive constants C₁, C₂, and k such that C₁|g(x)| ≤
|f(x)| ≤ C₂|g(x)| for all x > k.

Notation: f is Θ(g), f(x) ∈ Θ(g(x)), f ∈ Θ(g), f = Θ(g), f ≈ g.

• f is asymptotic to g if lim_{x→∞} f(x)/g(x) = 1. This relation is sometimes called
asymptotic equality.

Notation: f ~ g, f(x) ~ g(x).
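
The big-oh definition asks for witnesses C and k. A candidate pair can be sanity-checked numerically on a finite sample, as in the sketch below; this is only a spot check under the stated assumptions, not a proof, because the definition quantifies over all x > k. The helper name witnesses_hold is illustrative.

```python
def witnesses_hold(f, g, C, k, xs):
    """Check |f(x)| <= C * |g(x)| for every sample point x in xs with x > k."""
    return all(abs(f(x)) <= C * abs(g(x)) for x in xs if x > k)


def cube(x):
    return x ** 3


def fourth(x):
    return x ** 4


sample = range(1, 1000)
print(witnesses_hold(cube, fourth, C=1, k=1, xs=sample))      # consistent with x^3 in O(x^4)
print(witnesses_hold(fourth, cube, C=100, k=1, xs=sample))    # C = 100 breaks down past x = 100
```

For the second call no choice of C can succeed on all x, since x^4 ≤ C·x^3 forces x ≤ C; the finite check merely shows one candidate constant failing.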

Facts: 

1. The notations O( ), o( ), Ω( ), ω( ), and Θ( ) all stand for collections of functions.
Hence the equality sign, as in f = O(g), does not mean equality of functions.

2. The symbols O(g), o(g), Ω(g), ω(g), and Θ(g) are frequently used to represent
a typical element of the class of functions it represents, as in an expression such as
f(n) = n log n + o(n).

3. Growth rates:

• O(g): the set of functions that grow no more rapidly than a positive multiple
of g;

• o(g): the set of functions that grow less rapidly than a positive multiple of g;

• Ω(g): the set of functions that grow at least as rapidly as a positive multiple of g;

• ω(g): the set of functions that grow more rapidly than a positive multiple of g;

• Θ(g): the set of functions that grow at the same rate as a positive multiple of g.

4. Asymptotic notation can be used to describe the growth of infinite sequences, since
infinite sequences are functions from {0, 1, 2, ...} or {1, 2, 3, ...} to R (by considering
the term a_n as a(n), the value of the function a(n) at the integer n).

5. The big-oh notation was introduced in 1892 by Paul Bachmann (1837-1920) in the 
study of the rates of growth of various functions in number theory. 

6. The big-oh symbol is often called a Landau symbol, after Edmund Landau (1877-1938),
who popularized this notation.

7. Properties of big-oh:

• if f ∈ O(g) and c is a constant, then cf ∈ O(g);

• if f₁, f₂ ∈ O(g), then f₁ + f₂ ∈ O(g);

• if f₁ ∈ O(g₁) and f₂ ∈ O(g₂), then

    ◦ (f₁ + f₂) ∈ O(g₁ + g₂)
    ◦ (f₁ + f₂) ∈ O(max(|g₁|, |g₂|))
    ◦ (f₁f₂) ∈ O(g₁g₂);

• if f is a polynomial of degree n, then f ∈ O(x^n);

• if f is a polynomial of degree m and g a polynomial of degree n, with m ≥ n,
then f/g ∈ O(x^(m−n));

• if f is a bounded function, then f ∈ O(1);

• for all a, b > 1, O(log_a x) = O(log_b x);

• if f ∈ O(g) and |h(x)| ≥ |g(x)| for all x > k, then f ∈ O(h);

• if f ∈ O(x^m), then f ∈ O(x^n) for all n > m.

8. Some of the most commonly used benchmark big-oh classes are: O(1), O(log x),
O(x), O(x log x), O(x²), O(2^x), O(x!), and O(x^x). If f is big-oh of any function in this
list, then f is also big-oh of each of the following functions in the list:

    O(1) ⊂ O(log x) ⊂ O(x) ⊂ O(x log x) ⊂ O(x²) ⊂ O(2^x) ⊂ O(x!) ⊂ O(x^x).

The benchmark functions are drawn in the following figure.



9. Properties of little-oh:

• if f ∈ o(g), then cf ∈ o(g) for all nonzero constants c;

• if f₁ ∈ o(g) and f₂ ∈ o(g), then f₁ + f₂ ∈ o(g);

• if f₁ ∈ o(g₁) and f₂ ∈ o(g₂), then

    ◦ (f₁ + f₂) ∈ o(g₁ + g₂)
    ◦ (f₁ + f₂) ∈ o(max(|g₁|, |g₂|))
    ◦ (f₁f₂) ∈ o(g₁g₂);

• if f is a polynomial of degree m and g a polynomial of degree n with m < n, then
f/g ∈ o(1);

• the set membership f(x) ∈ L + o(1) is equivalent to f(x) → L as x → ∞, where
L is a constant.

10. If f ∈ o(g), then f ∈ O(g); the converse is not true.

11. If f ∈ O(g) and h ∈ o(f), then h ∈ o(g).

12. If f ∈ o(g) and h ∈ O(f), then h ∈ O(g).

13. If f ∈ O(g) and h ∈ O(f), then h ∈ O(g).

14. If f₁ ∈ o(g₁) and f₂ ∈ O(g₂), then f₁f₂ ∈ o(g₁g₂).

15. f ∈ O(g) if and only if g ∈ Ω(f).

16. f ∈ Θ(g) if and only if f ∈ O(g) and g ∈ O(f).

17. f ∈ Θ(g) if and only if f ∈ O(g) and f ∈ Ω(g).

18. If f(x) = a_n x^n + ⋯ + a₁x + a₀ (a_n ≠ 0), then f ~ a_n x^n.

19. f ~ g if and only if (f/g − 1) ∈ o(1) (provided g(x) = 0 only finitely often).


Examples: 

1. 5x⁸ + 10^200 x⁵ + 3x + 1 ∈ O(x⁸).

2. x³ ∈ O(x⁴), x⁴ ∉ O(x³).

3. x³ ∈ o(x⁴), x⁴ ∉ o(x³).

4. x³ ∉ o(x³).

5. x² ∈ O(5x²); x² ∉ o(5x²).

6. sin(x) ∈ O(1).

ffeff e 0(a; 4 ); §||f G 0(a© 

8. 1 + 2 + 3 + ⋯ + n ∈ O(n²).

9. 1 + 1/2 + 1/3 + ⋯ + 1/n ∈ O(log n).

10. log(n!) ∈ O(n log n).

11. 8x⁵ ∈ Θ(3x⁵).

12. x³ ∈ Ω(x²).

13. 2^n + o(n²) ~ 2^n.

14. Sometimes asymptotic equality does not behave like equality: ln n ~ ln(2n), but
n ≁ 2n and ln n − ln n ≁ ln(2n) − ln n.

15. π(n) ~ n/ln n, where π(n) is the number of primes less than or equal to n.

16. If p_n is the nth prime, then p_n ~ n ln n.

17. Stirling’s formula: n! ~ √(2πn) (n/e)^n.
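
Asymptotic equality can be observed numerically by watching a ratio approach 1. The sketch below does this for Stirling’s formula, working with logarithms (via math.lgamma, since lgamma(n + 1) = ln n!) to avoid overflow; the function name stirling_ratio is an assumption of this example.

```python
import math


def stirling_ratio(n):
    """Return n! divided by sqrt(2*pi*n) * (n/e)^n, computed through logarithms."""
    log_factorial = math.lgamma(n + 1)                                   # ln(n!)
    log_stirling = 0.5 * math.log(2 * math.pi * n) + n * (math.log(n) - 1)
    return math.exp(log_factorial - log_stirling)


for n in (1, 10, 100, 1000):
    print(n, stirling_ratio(n))
```

The printed ratios decrease toward 1 as n grows, which is what n! ~ √(2πn)(n/e)^n asserts; the approximation is a slight underestimate for every n shown.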


1.4 RELATIONS 

Relationships between two sets (or among more than two sets) occur frequently throughout
mathematics and its applications. Examples of such relationships include integers
and their divisors, real numbers and their logarithms, corporations and their customers,
cities and airlines that serve them, people and their relatives. These relationships can
be described as subsets of product sets.

Functions are a special type of relation. Equivalence relations can be used to describe 
similarity among elements of sets and partial order relations describe the relative size 
of elements of sets. 


1.4.1 BINARY RELATIONS AND THEIR PROPERTIES

Definitions:

A binary relation from set A to set B is any subset R of A × B.

An element a ∈ A is related to b ∈ B in the relation R if (a, b) ∈ R, often written
aRb. If (a, b) ∉ R, write a R̸ b.

A binary relation (relation) on a set A is a binary relation from A to A; i.e., a subset
of A × A.

A binary relation R on A can have the following properties (to have the property, the
relation must satisfy the property for all a, b, c ∈ A):

• reflexivity: aRa

• irreflexivity: a R̸ a

• symmetry: if aRb, then bRa

• asymmetry: if aRb, then b R̸ a

• antisymmetry: if aRb and bRa, then a = b

• transitivity: if aRb and bRc, then aRc

• intransitivity: if aRb and bRc, then a R̸ c

Binary relations R and S from A to B can be combined in the following ways to yield
other relations:

• complement of R: the relation R̄ from A to B where a R̄ b if and only if a R̸ b
(i.e., ¬(aRb))

• difference: the binary relation R − S from A to B such that a(R − S)b if and
only if aRb and ¬(aSb)

• intersection: the relation R ∩ S from A to B where a(R ∩ S)b if and only if
aRb and aSb

• inverse (converse): the relation R⁻¹ from B to A where bR⁻¹a if and only if
aRb

• symmetric difference: the relation R ⊕ S from A to B where a(R ⊕ S)b if and
only if exactly one of the following is true: aRb, aSb

• union: the relation R ∪ S from A to B where a(R ∪ S)b if and only if aRb or
aSb.

The closure of a relation R with respect to a property P is the relation S, if it exists,
that has property P and contains R, such that S is a subset of every relation that has
property P and contains R.

A relation R on A is connected if for all a, b ∈ A with a ≠ b, either aRb or there are
c₁, c₂, ..., c_k ∈ A such that aRc₁, c₁Rc₂, ..., c_{k−1}Rc_k, c_kRb.

If R is a relation on A, the connectivity relation associated with R is the relation R′
where aR′b if and only if aRb or there are c₁, c₂, ..., c_k ∈ A such that aRc₁, c₁Rc₂, ...,
c_{k−1}Rc_k, c_kRb.

If R is a binary relation from A to B and if S is a binary relation from B to C, then
the composition of R and S is the binary relation S ∘ R from A to C where a(S ∘ R)c
if and only if there is an element b ∈ B such that aRb and bSc.

The nth power (n a nonnegative integer) of a relation R on a set A is the relation R^n,
where R⁰ = { (a, a) | a ∈ A } = I_A (see Example 4), R¹ = R and R^n = R^(n−1) ∘ R for all
integers n > 1.

A transitive reduction of a relation, if it exists, is a relation with the same transitive 
closure as the original relation and with a minimal superset of ordered pairs. 
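
The properties and operations defined above translate directly into set computations when a relation on a finite set is stored as a set of ordered pairs. The following sketch assumes that representation; all function names are illustrative and are not taken from the Handbook.

```python
def is_reflexive(R, A):
    """aRa for every a in A."""
    return all((a, a) in R for a in A)


def is_symmetric(R):
    """aRb implies bRa."""
    return all((b, a) in R for (a, b) in R)


def is_antisymmetric(R):
    """aRb and bRa imply a = b."""
    return all(a == b for (a, b) in R if (b, a) in R)


def compose(S, R):
    """S o R: a (S o R) c iff there is some b with aRb and bSc."""
    return {(a, c) for (a, b) in R for (b2, c) in S if b == b2}


def is_transitive(R):
    """R is transitive iff R o R is a subset of R."""
    return compose(R, R) <= R


def inverse(R):
    """The converse relation: b R^-1 a iff aRb."""
    return {(b, a) for (a, b) in R}


def power(R, A, n):
    """R^n, with R^0 the identity relation on A and R^n = R^(n-1) o R."""
    result = {(a, a) for a in A}
    for _ in range(n):
        result = compose(result, R)
    return result


A = [2, 3, 4, 6, 12]
divides = {(a, b) for a in A for b in A if b % a == 0}
print(is_reflexive(divides, A), is_symmetric(divides),
      is_antisymmetric(divides), is_transitive(divides))
```

The divisibility relation used here is the one on {2, 3, 4, 6, 12} that appears in the examples of this subsection.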


Notation: 

1. If a relation R is symmetric, aRb is often written a ~ b, a ≈ b, or a ≡ b.

2. If a relation R is antisymmetric, aRb is often written a ≤ b, a < b, a ⊂ b, a ⊆ b,
a ⪯ b, a ≺ b, or a ⊑ b.

Facts: 

1. A binary relation R from A to B can be viewed as a function from the Cartesian
product A × B to the boolean domain {TRUE, FALSE} (often written {T, F}). The
truth value of the pair (a, b) determines whether a is related to b.

2. Under the infix convention for a binary relation, aRb (a is related to b) means
R(a, b) = TRUE; a R̸ b (a is not related to b) means R(a, b) = FALSE.

3. A binary relation R from A to B can be represented in any of the following ways:

• a set R ⊆ A × B, where (a, b) ∈ R if and only if aRb (this is the definition of R);

• a directed graph D_R whose vertices are the elements of A ∪ B, with an edge from
vertex a to vertex b if aRb (§8.3.1);

• a matrix (the adjacency matrix for the directed graph D_R): if A = {a₁, ..., a_m}
and B = {b₁, ..., b_n}, the matrix for the relation R is the m × n matrix M_R
with entries m_ij where m_ij = 1 if a_i R b_j and m_ij = 0 otherwise.

4. R is a reflexive relation on A if and only if { (a, a) | a ∈ A } ⊆ R; i.e., R is a reflexive
relation on A if and only if I_A ⊆ R.

5. R is symmetric if and only if R = R⁻¹.

6. R is an antisymmetric relation on A if and only if R ∩ R⁻¹ ⊆ { (a, a) | a ∈ A }.

7. R is transitive if and only if R ∘ R ⊆ R.

8. A relation R can be both symmetric and antisymmetric. See the first example in
Table 2.

9. For a relation R that is both symmetric and antisymmetric: R is reflexive if and
only if R is the equality relation on some set; R is irreflexive if and only if R = ∅.

10. The closure of a relation R with respect to a property P is the intersection of all
relations Q with property P such that R ⊆ Q, if there is at least one such relation Q.

11. The transitive closure of a relation R is the connectivity relation R′ associated
with R, which is equal to the union ⋃_{i=1}^{∞} R^i of all the positive powers of the relation.

12. A transitive reduction of a relation may contain pairs not in the original relation
(Example 8).

13. Transitive reductions are not necessarily unique (Example 9).

14. If R is a relation on A and x, y ∈ A with x ≠ y, then x is related to y in the
transitive closure of R if and only if there is a nontrivial directed path from x to y in
the directed graph D_R of the relation.

15. The following table shows how to obtain various closures of a relation and gives
the matrices for the various closures of a relation R with matrix M_R on a set A where
|A| = n.

    relation             set                        matrix
    reflexive closure    R ∪ { (a, a) | a ∈ A }     M_R ∨ I_n
    symmetric closure    R ∪ R⁻¹                    M_R ∨ M_{R⁻¹}
    transitive closure   ⋃_{i=1}^{n} R^i            M_R ∨ M_R^[2] ∨ ⋯ ∨ M_R^[n]

The matrix I_n is the n × n identity matrix, M_R^[i] is the ith boolean power of the
matrix M_R for the relation R, and ∨ is the join operator (defined by 0 ∨ 0 = 0 and
0 ∨ 1 = 1 ∨ 0 = 1 ∨ 1 = 1).

16. The following table provides formulas for the number of binary relations with
various properties on a set with n elements.

    type of relation         number of relations
    all relations            2^(n²)
    reflexive                2^(n(n−1))
    symmetric                2^(n(n+1)/2)
    transitive               no known simple closed formula (§3.1.7)
    antisymmetric            2^n · 3^(n(n−1)/2)
    asymmetric               3^(n(n−1)/2)
    irreflexive              2^(n(n−1))
    equivalence (§1.4.2)     B_n = Bell number = Σ_{k=1}^{n} {n k}, where {n k}
                             is a Stirling subset number (§2.4.2)
    partial order (§1.4.3)   no known simple closed formula (§3.1.7)

Algorithm: 

1. Warshall’s algorithm, also called the Roy-Warshall algorithm (B. Roy and S. Warshall
described the algorithm in 1959 and 1960, respectively), Algorithm 1, is an algorithm
of order n³ for finding the transitive closure of a relation on a set with n elements.
(Stephen Warshall, born 1935)


Algorithm 1: Warshall’s algorithm.

input: M = [m_ij]_{n×n} = the matrix representing the binary relation R
output: M = the matrix of the transitive closure of relation R

for k := 1 to n
    for i := 1 to n
        for j := 1 to n
            m_ij := m_ij ∨ (m_ik ∧ m_kj)
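
Algorithm 1 can be written almost line for line in Python. This is a minimal sketch, assuming the relation is given as a square list of lists with entries 0 and 1 and using 0-based indices; unlike the pseudocode, it works on a copy so that the input matrix is left unchanged.

```python
def warshall(matrix):
    """Return the 0/1 matrix of the transitive closure of the given relation matrix."""
    n = len(matrix)
    m = [row[:] for row in matrix]           # work on a copy of the input
    for k in range(n):                       # allow vertex k as an intermediate point
        for i in range(n):
            for j in range(n):
                m[i][j] = m[i][j] | (m[i][k] & m[k][j])
    return m


# The relation {(1, 3), (2, 3), (3, 2)} on {1, 2, 3}; row/column i stands for element i + 1.
M = [[0, 0, 1],
     [0, 0, 1],
     [0, 1, 0]]
for row in warshall(M):
    print(row)
```

The three nested loops over n values each give the order n³ running time stated above.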

Examples:

1. Some common relations and whether they have certain properties are given in the
following table:

    set                 relation           reflexive  symmetric  antisymmetric  transitive
    any nonempty set    =                  yes        yes        yes            yes
    any nonempty set    ≠                  no         yes        no             no
    R                   ≤ (or ≥)           yes        no         yes            yes
    R                   < (or >)           no         no         yes            yes
    positive integers   is a divisor of    yes        no         yes            yes
    nonzero integers    is a divisor of    yes        no         no             yes
    integers            congruence mod n   yes        yes        no             yes
    any set of sets     ⊆ (or ⊇)           yes        no         yes            yes
    any set of sets     ⊂ (or ⊃)           no         no         yes            yes

2. If A is any set, the universal relation is the relation R on A such that aRb for
all a, b ∈ A; i.e., R = A × A.

3. If A is any set, the empty relation is the relation R on A where aRb is never
true; i.e., R = ∅.

4. If A is any set, the relation R on A where aRb if and only if a = b is the identity
(or diagonal) relation I = I_A = { (a, a) | a ∈ A }, which is also written Δ or Δ_A.

5. Every function f: A → B induces a binary relation R_f from A to B under the
rule aR_f b if and only if f(a) = b.

6. For A = {2, 3, 4, 6, 12}, suppose that aRb means that a is a divisor of b. Then R
can be represented by the set

{(2,2), (2,4), (2,6), (2,12), (3,3), (3,6), (3,12), (4,4), (4,12), (6,6), (6,12), (12,12)}.

The relation R can also be represented by the digraph with the following adjacency
matrix (rows and columns in the order 2, 3, 4, 6, 12):

    ( 1 0 1 1 1 )
    ( 0 1 0 1 1 )
    ( 0 0 1 0 1 )
    ( 0 0 0 1 1 )
    ( 0 0 0 0 1 )

7. The transitive closure of the relation {(1,3), (2,3), (3,2)} on {1, 2, 3} is the relation
{(1,2), (1,3), (2,2), (2,3), (3,2), (3,3)}.

8. The transitive closure of the relation R = {(1,2), (2,3), (3,1)} on {1, 2, 3} is the
universal relation {1, 2, 3} $\times$ {1, 2, 3}. A transitive reduction of R is the relation given
by {(1,3), (3,2), (2,1)}. This shows that a transitive reduction may contain pairs that
are not in the original relation.

9. If $R = \{\, (a, b) \mid aRb \text{ for all } a, b \in \{1, 2, 3\} \,\}$, then the relations {(1,2), (2,3), (3,1)}
and {(1,3), (3,2), (2,1)} are both transitive reductions for R. Thus, transitive reductions
are not unique.
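
The four properties tabulated in Example 1 can be checked mechanically for a finite relation stored as a set of pairs. The sketch below does this for the divisibility relation of Example 6; the helper names (`is_reflexive` and so on) are illustrative, not notation from the Handbook.

```python
from itertools import product


def is_reflexive(rel, domain):
    return all((a, a) in rel for a in domain)


def is_symmetric(rel):
    return all((b, a) in rel for (a, b) in rel)


def is_antisymmetric(rel):
    # aRb and bRa may both hold only when a = b
    return all(a == b for (a, b) in rel if (b, a) in rel)


def is_transitive(rel):
    return all((a, d) in rel for (a, b) in rel for (c, d) in rel if b == c)


A = [2, 3, 4, 6, 12]
divides = {(a, b) for a, b in product(A, repeat=2) if b % a == 0}

# Adjacency matrix with rows and columns in the order 2, 3, 4, 6, 12.
matrix = [[1 if (a, b) in divides else 0 for b in A] for a in A]
```

Here `divides` has the twelve pairs listed in Example 6, and the checks report that the relation is reflexive, antisymmetric, and transitive but not symmetric, in line with the "positive integers / is a divisor of" row of the table.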


1.4.2 EQUIVALENCE RELATIONS

Equivalence relations are binary relations that describe various types of similarity or 
“equality” among elements in a set. The elements that look alike or behave in a similar 
way are grouped together in equivalence classes, resulting in a partition of the set. 
Any element chosen from an equivalence class essentially “mirrors” the behavior of all 
elements in that class. 

Definitions: 

An equivalence relation on A is a binary relation on A that is reflexive, symmetric, 
and transitive. 

If R is an equivalence relation on A, the equivalence class of $a \in A$ is the set
$R[a] = \{\, b \in A \mid aRb \,\}$. When it is clear from context which equivalence relation is
intended, the notation for the induced equivalence class can be abbreviated $[a]$.

The induced partition on a set A under an equivalence relation R is the set of equiv- 
alence classes. 

Facts: 

1. A nonempty relation R is an equivalence relation if and only if $R \circ R^{-1} = R$.

2. The induced partition on a set A actually is a partition of A; i.e., the equivalence
classes are all nonempty, every element of A lies in some equivalence class, and two
classes $[a]$ and $[b]$ are either disjoint or equal.

3. There is a one-to-one correspondence between the set of all possible equivalence
relations on a set A and the set of all possible partitions of A. (Fact 2 shows how to
obtain a partition from an equivalence relation. To obtain an equivalence relation from
a partition of A, define R by the rule aRb if and only if a and b lie in the same element
of the partition.)

4. For any set A, the coarsest partition (with only one set in the partition) of A is 
induced by the equivalence relation in which every pair of elements are related. The 
finest partition (with each set in the partition having cardinality 1) of A is induced by 
the equivalence relation in which no two different elements are related. 

5. The set of all partitions of a set A is partially ordered under refinement (§1.2.2 and 
§1.4.3). This partial ordering is a lattice (§5.7). 

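6. To find the smallest equivalence relation containing a given relation, first take the
reflexive closure of the relation, then take the symmetric closure of that relation, and
finally take the transitive closure. (The transitive closure must come last: taking the
symmetric closure of a transitive relation can destroy transitivity.)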
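
A minimal sketch of building the smallest equivalence relation containing a finite relation. It applies the transitive closure last, because pairs added by symmetry can create new chains that must themselves be closed: for {(1,2), (1,3)} on {1, 2, 3}, the pair (2,3) arises only after symmetry has supplied (2,1). The function name `equivalence_closure` is an illustrative choice.

```python
def equivalence_closure(rel, domain):
    """Smallest equivalence relation on `domain` that contains `rel`."""
    closure = set(rel)
    closure |= {(a, a) for a in domain}        # reflexive closure
    closure |= {(b, a) for (a, b) in closure}  # symmetric closure
    # Transitive closure: compose the relation with itself until nothing new appears.
    changed = True
    while changed:
        new_pairs = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        changed = not new_pairs <= closure
        closure |= new_pairs
    return closure


result = equivalence_closure({(1, 2), (1, 3)}, {1, 2, 3, 4})
```

In `result`, the elements 1, 2, 3 are all mutually related and 4 is related only to itself.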

Examples: 

1. For any function $f\colon A \to B$, define the relation $a_1 R a_2$ to mean that $f(a_1) = f(a_2)$.
Then R is an equivalence relation. Each induced equivalence class is the inverse image
$f^{-1}(b)$ of some $b \in B$.

2. Write $a \equiv b \pmod{n}$ (“a is congruent to b modulo n”) when a, b and n > 0 are
integers such that $n \mid b - a$ (n divides b - a). Congruence mod n is an equivalence
relation on the integers.

3. The equivalence relation of congruence modulo n on the integers Z yields a partition
with n equivalence classes: $[0] = \{\, kn \mid k \in Z \,\}$, $[1] = \{\, 1 + kn \mid k \in Z \,\}$,
$[2] = \{\, 2 + kn \mid k \in Z \,\}$, . . . , $[n-1] = \{\, (n-1) + kn \mid k \in Z \,\}$.

4. The isomorphism relation on any set of groups is an equivalence relation. (The same
result holds for rings, fields, etc.) (See Chapter 5.)

5. The congruence relation for geometric objects in the plane is an equivalence relation.

6. The similarity relation for geometric objects in the plane is an equivalence relation.
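
The induced partition of Example 3 can be computed for a finite slice of the integers. The sketch below groups elements by comparing each one with a single representative of every class found so far, which is enough because the relation is an equivalence relation. The helper name `equivalence_classes` and the range $-4, \ldots, 4$ are illustrative assumptions.

```python
def equivalence_classes(domain, related):
    """Partition `domain` into the classes of the equivalence relation `related(a, b)`."""
    classes = []
    for a in domain:
        for cls in classes:
            if related(a, cls[0]):  # one representative per class suffices
                cls.append(a)
                break
        else:
            classes.append([a])
    return classes


n = 3
classes = equivalence_classes(range(-4, 5), lambda a, b: (b - a) % n == 0)
```

For n = 3 this yields three classes, one for each possible remainder, and every element of the range lies in exactly one of them.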


1.4.3 PARTIALLY ORDERED SETS

Partial orderings extend the relationship of $\le$ on real numbers and allow a comparison
of the relative “size” of elements in various sets. They are developed in greater detail
in Chapter 11.

Definitions: 

A preorder on a set S is a binary relation $\le$ on S that has the following properties for
all $a, b, c \in S$:

• reflexive: $a \le a$

• transitive: if $a \le b$ and $b \le c$, then $a \le c$.

A partial ordering (or partial order) on a set S is a binary relation $\le$ on S that
has the following properties for all $a, b, c \in S$:

• reflexive: $a \le a$

• antisymmetric: if $a \le b$ and $b \le a$, then $a = b$

• transitive: if $a \le b$ and $b \le c$, then $a \le c$.

Notes: The expression $c \ge b$ means that $b \le c$. The symbols $\preceq$ and $\succeq$ are often used
in place of $\le$ and $\ge$. The expression $a < b$ (or $b > a$) means that $a \le b$ and $a \ne b$.

A partially ordered set (or poset) is a set with a partial ordering defined on it. 

A directed ordering on a set S is a partial ordering that also satisfies the following
property: if $a, b \in S$, then there is a $c \in S$ such that $a \le c$ and $b \le c$.

Note: Some authors do not require that antisymmetry hold in the definition of directed
ordering.

Two elements a and b in a poset are comparable if either $a \le b$ or $b \le a$. Otherwise,
they are incomparable.

A totally ordered (or linearly ordered) set is a poset in which every pair of elements
are comparable.

A chain is a subset of a poset in which every pair of elements are comparable.

An antichain is a subset of a poset in which no two distinct elements are comparable.

An interval in a poset $(S, \le)$ is a subset $[a, b] = \{\, x \mid x \in S,\ a \le x \le b \,\}$.

An element b in a poset is minimal if there exists no element c such that $c < b$.

An element b in a poset is maximal if there exists no element c such that $c > b$.

An element b in a poset S is a maximum element (or greatest element) if every
element c satisfies the relation $c \le b$.

An element b in a poset S is a minimum element (or least element) if every element
c satisfies the relation $c \ge b$.

A well-ordered set is a poset $(S, \le)$ in which every nonempty subset contains a
minimum element.

An element b in a poset S is an upper bound for a subset $U \subseteq S$ if every element c
of U satisfies the relation $c \le b$.

An element b in a poset S is a lower bound for a subset $U \subseteq S$ if every element c of U
satisfies the relation $c \ge b$.

A least upper bound for a subset U of a poset S is an upper bound b such that if c
is any other upper bound for U then $c \ge b$.

A greatest lower bound for a subset U of a poset S is a lower bound b such that if c
is any other lower bound for U then $c \le b$.

A lattice is a poset in which every pair of elements, x and y, have both a least upper
bound lub(x, y) and a greatest lower bound glb(x, y) (§5.7).

The Cartesian product of two posets $(S_1, \le_1)$ and $(S_2, \le_2)$ is the poset with domain
$S_1 \times S_2$ and relation $\le_1 \times \le_2$ given by the rule $(a_1, a_2) \le_1 \times \le_2 (b_1, b_2)$ if and only if
$a_1 \le_1 b_1$ and $a_2 \le_2 b_2$.

The element c covers another element b in a poset if $b < c$ and there is no element d
such that $b < d < c$.

A Hasse diagram (cover diagram) for a poset $(S, \le)$ is a directed graph (§11.8)
whose vertices are the elements of S such that there is an arc from b to c if c covers b,
all arcs are directed upward on the page when drawing the diagram, and arrows on the
arcs are omitted.

Facts: 

1. R is a partial order on a set S if and only if $R^{-1}$ is a partial order on S.

2. The only partial order that is also an equivalence relation is the relation of equality.

3. The Cartesian product of two posets, each with at least two elements, is not totally
ordered.

4. In the Hasse diagram for a poset, there is a path from vertex b to vertex c if and
only if $b \le c$. (When b = c, it is the path of length 0.)

5. Least upper bounds and greatest lower bounds are unique, if they exist. 
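
The arcs of a Hasse diagram are exactly the pairs (b, c) for which c covers b, so they can be listed directly from the definition of "covers". The following sketch does this for a finite poset given by a comparison function; the names `covers` and `divides`, and the choice of the divisors of 12 as input, are illustrative.

```python
def covers(elements, leq):
    """Return all pairs (b, c) such that c covers b in the finite poset (elements, leq)."""
    def lt(x, y):
        return x != y and leq(x, y)

    return [(b, c) for b in elements for c in elements
            if lt(b, c) and not any(lt(b, d) and lt(d, c) for d in elements)]


def divides(a, b):
    return b % a == 0


hasse_12 = covers([1, 2, 3, 4, 6, 12], divides)
```

For the divisors of 12 this gives seven arcs. The pair (1, 4) is not among them because 2 lies strictly between 1 and 4, yet 4 is still reachable from 1 by a path, as Fact 4 states.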

Examples: 

1. The positive integers are partially ordered under the relation of divisibility, in which
$b \le c$ means that b divides c. In fact, they form a lattice (§5.7.1), called the divisibility
lattice. The least upper bound of two numbers is their least common multiple, and the
greatest lower bound is their greatest common divisor.

2. The set of all powers of two (or of any other positive integer) forms a chain in the 
divisibility lattice. 

3. The set of all primes forms an antichain in the divisibility lattice. 

4. The set $\mathcal{R}$ of real numbers with the usual definition of $\le$ is a totally ordered set.

5. The set of all logical propositions on a fixed set of logical variables p, q, r, . . . is
partially ordered under inverse implication, so that $B \le A$ means that $A \to B$ is a
tautology.

6. The complex numbers, ordered under magnitude, do not form a poset, because they 
do not satisfy the axiom of antisymmetry. 

7. The set of all subsets of any set forms a lattice under the relation of subset inclusion.
The least upper bound of two subsets is their union, and the greatest lower bound is
their intersection. Part (a) in the following figure gives the Hasse diagram for the lattice
of all subsets of {a, b, c}.

8. Part (b) of the following figure shows the Hasse diagram for the lattice of all positive
integer divisors of 12.

9. Part (c) of the following figure shows the Hasse diagram for the set {1, 2, 3, 4, 5, 6}
under divisibility.

10. Part (d) of the following figure shows the Hasse diagram for the set {1, 2, 3, 4} with
the usual definition of $\le$.

[Figure: Hasse diagrams, parts (a), (b), (c), and (d).]

11. Multilevel security policy: The flow of information is often restricted by using
security clearances. Documents are put into security classes, $(L, C)$, where L is an element
of a totally ordered set of authority levels (such as “unclassified”, “confidential”,
“secret”, “top secret”) and C is a subset (called a “compartment”) of a set of subject areas.
The subject areas might consist of topics such as agriculture, Eastern Europe, economy,
crime, and trade. A document on how trade affects the economic structure of Eastern
Europe might be assigned to the compartment {trade, economy, Eastern Europe}. The
set of security classes is made into a lattice by the rule: $(L_1, C_1) \le (L_2, C_2)$ if and only
if $L_1 \le L_2$ and $C_1 \subseteq C_2$. Information is allowed to flow from class $(L_1, C_1)$ to class
$(L_2, C_2)$ if and only if $(L_1, C_1) \le (L_2, C_2)$. For example, a document with security class
(secret, {trade, economy}) flows to both (top secret, {trade, economy}) and (secret,
{trade, economy, Eastern Europe}), but not vice versa. This set of security classes
forms a lattice (§5.7.1).
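
The flow rule of Example 11 is the Cartesian-product order on (level, compartment) pairs, so it can be sketched in a few lines. In the code below, the list `LEVELS` and the function names `can_flow` and `least_upper_bound` are hypothetical names chosen for illustration; the least upper bound is computed as the higher of the two levels together with the union of the two compartments.

```python
LEVELS = ["unclassified", "confidential", "secret", "top secret"]  # lowest to highest


def can_flow(src, dst):
    """True if (L1, C1) <= (L2, C2): the level does not decrease and C1 is a subset of C2."""
    (l1, c1), (l2, c2) = src, dst
    return LEVELS.index(l1) <= LEVELS.index(l2) and set(c1) <= set(c2)


def least_upper_bound(x, y):
    """Least class to which both x and y may flow."""
    (l1, c1), (l2, c2) = x, y
    return (max(l1, l2, key=LEVELS.index), frozenset(c1) | frozenset(c2))


doc = ("secret", {"trade", "economy"})
higher_level = ("top secret", {"trade", "economy"})
wider_compartment = ("secret", {"trade", "economy", "Eastern Europe"})
```

Here `can_flow(doc, higher_level)` and `can_flow(doc, wider_compartment)` hold while the reverse flows do not, and `higher_level` and `wider_compartment` are incomparable, so the order is partial rather than total.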


1.4.4 n-ARY RELATIONS

Definitions:

An n-ary relation on sets $A_1, A_2, \ldots, A_n$ is any subset R of $A_1 \times A_2 \times \cdots \times A_n$.

The sets $A_i$ are called the domains of the relation and the number n is called the
degree of the relation.

A primary key of an n-ary relation R on $A_1 \times A_2 \times \cdots \times A_n$ is a domain $A_i$ such that
each $a_i \in A_i$ is the ith coordinate of at most one n-tuple in R.

A composite key of an n-ary relation R on $A_1 \times A_2 \times \cdots \times A_n$ is a product of domains
$A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m}$ such that for each m-tuple $(a_{i_1}, a_{i_2}, \ldots, a_{i_m}) \in A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m}$,
there is at most one n-tuple in R that matches $(a_{i_1}, a_{i_2}, \ldots, a_{i_m})$ in coordinates
$i_1, i_2, \ldots, i_m$.

The projection function $P_{i_1, i_2, \ldots, i_k}\colon A_1 \times A_2 \times \cdots \times A_n \to A_{i_1} \times A_{i_2} \times \cdots \times A_{i_k}$ is
given by the rule

$P_{i_1, i_2, \ldots, i_k}(a_1, a_2, \ldots, a_n) = (a_{i_1}, a_{i_2}, \ldots, a_{i_k})$.

That is, $P_{i_1, i_2, \ldots, i_k}$ selects the elements in coordinate positions $i_1, i_2, \ldots, i_k$ from the
n-tuple $(a_1, a_2, \ldots, a_n)$.

The join $J_k(R, S)$ of an m-ary relation R and an n-ary relation S, where $k \le m$ and
$k \le n$, is a relation of degree $m + n - k$ such that

$(a_1, \ldots, a_{m-k}, c_1, \ldots, c_k, b_1, \ldots, b_{n-k}) \in J_k(R, S)$

if and only if

$(a_1, \ldots, a_{m-k}, c_1, \ldots, c_k) \in R$ and $(c_1, \ldots, c_k, b_1, \ldots, b_{n-k}) \in S$.

Facts: 

1. An n-ary relation on sets $A_1, A_2, \ldots, A_n$ can be regarded as a function R from
$A_1 \times A_2 \times \cdots \times A_n$ to the Boolean domain {TRUE, FALSE}, where $(a_1, a_2, \ldots, a_n) \in R$
if and only if $R(a_1, a_2, \ldots, a_n) = \text{TRUE}$.

2. n-ary relations are essential models in the construction of database systems. 
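
To make the database connection concrete, the join $J_k(R, S)$ can be sketched for relations stored as Python sets of tuples: a tuple of R is glued to a tuple of S whenever the last k coordinates of the first equal the first k coordinates of the second. The function name `join_k` and the sample enrollment data are made up for illustration.

```python
def join_k(k, R, S):
    """J_k(R, S) for relations given as sets of equal-length tuples."""
    return {r + s[k:]                        # shared coordinates are kept once
            for r in R for s in S
            if r[len(r) - k:] == s[:k]}      # last k of r must equal first k of s


# Hypothetical data: (student, course) and (course, room).
enrolled = {("alice", "math"), ("bob", "physics")}
located = {("math", "room 101"), ("physics", "room 202")}

schedule = join_k(1, enrolled, located)
```

The result has degree 2 + 2 - 1 = 3 and contains one (student, course, room) triple per matching pair. With k = 0 nothing is shared and the join is the full Cartesian product of the two relations.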

Examples: 

1. Let $A_1$ be the set of all men and $A_2$ the set of all women, in a nonpolygamous
society. Let mRw mean that m and w are presently married. Then each of $A_1$ and $A_2$
is a primary key.

2. Let $A_1$ be the set of all telephone numbers and $A_2$ the set of all persons. Let nRp
mean that telephone number n belongs to person p. Then $A_1$ is a primary key if each
number is assigned to at most one person, and $A_2$ is a primary key if each person has
at most one phone number.

3. In a conventional telephone directory, the name and address domains can form a 
composite key, unless there are two persons with the same name (no distinguishing 
middle initial or suffix such as “Jr.”) at the same address. 

4. Let A = B = C = Z, and let R be the relation on $A \times B \times C$ such that $(a, b, c) \in R$
if and only if a + b = c. The set $A \times B$ is a composite key. There is no primary key.

5. Let A = all students at a certain college, B = all student ID numbers being used at
the college, C = all major programs at the college. Suppose a relation R is defined on
$A \times B \times C$ by the rule: $(a, b, c) \in R$ means student a with ID number b has major c. If
each student has exactly one major and if there is a one-to-one correspondence between
students and ID numbers, then A and B are each primary keys.

6. Let A = all employee names at a certain corporation, B = all Social Security
numbers, C = all departments, D = all job titles, E = all salary amounts, and F =
all calendar dates. On $A \times B \times C \times D \times E \times F \times F$ let R be the relation such that
$(a, b, c, d, e, f, g) \in R$ means employee named a with Social Security number b works in
department c, has job title d, earns an annual salary e, was hired on date f, and had the
most recent performance review on date g. The projection $P_{1,5}$ (projection onto $A \times E$)
gives a list of employees and their salaries.
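
Projections and keys are easy to experiment with when a relation is stored as a set of tuples. The sketch below uses 1-based coordinate positions, as in the definitions, and tests the key property on a finite slice of the relation a + b = c from Example 4. The names `project` and `is_key`, and the range of integers, are illustrative assumptions.

```python
def project(R, positions):
    """P_{i1,...,ik}: keep the listed 1-based coordinate positions of every tuple."""
    return {tuple(t[i - 1] for i in positions) for t in R}


def is_key(R, positions):
    """True if no two tuples of R agree on all of the listed coordinate positions."""
    return len(project(R, positions)) == len(R)


# Finite slice of Example 4: triples (a, b, c) with a + b = c and -2 <= a, b <= 2.
R = {(a, b, a + b) for a in range(-2, 3) for b in range(-2, 3)}
```

On this slice `is_key(R, [1, 2])` holds, while no single coordinate is a key, which matches the claim that $A \times B$ is a composite key and there is no primary key.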


© 2000 by CRC Press LLC 



1.5 PROOF TECHNIQUES


A proof is a derivation of new facts from old ones. A proof makes possible the derivation 
of properties of a mathematical model from its definition, or the drawing of scientific 
inferences based on data that have been gathered. Axioms and postulates capture 
all basic truths used to develop a theory. Constructing proofs is one of the principal 
activities of mathematicians. 

Furthermore, proofs play an important role in computer science — in such areas 
as verification of the correctness of computer programs, verification of communications 
protocols, automatic reasoning systems, and logic programming. 


1.5.1 RULES OF INFERENCE
Definitions: 

A proposition is a declarative sentence that is unambiguously either true or false. 
(See §1.1.1.) 

A theorem is a proposition derived as the conclusion of a valid proof from axioms and 
definitions. 

A lemma is a theorem that is an intermediate step in the proof of a more important 
theorem. 

A corollary is a theorem that is derived as an easy consequence of another theorem. 

A statement form is a declarative sentence containing some variables and logical 
symbols, such that the sentence becomes a proposition if concrete values are substituted 
for all the free variables. 

An argument form is a sequence of statement forms. 

The final statement form in an argument form is called the conclusion (of the
argument). The conclusion is often preceded by the word “therefore” (symbolized $\therefore$).

The statement forms preceding the conclusion in an argument form are called premises 
(of the argument). 

If concrete values are substituted for the free variables of an argument form, an argu- 
ment of that form is obtained. 

An instantiation of an argument is the substitution of concrete values into all free 
variables of the premises and conclusion. 

A valid argument form is an argument form such that in every instantiation in which 
all the premises are true, the conclusion is also true. 

A rule of inference is an alternative name for a valid argument form, which is used 
when the form is frequently applied. 


Facts: 

1. Substitution rule: Any variable occurring in an argument may be replaced by an 
expression of the same type without affecting the validity of the argument, as long as 
the replacement is made everywhere the variable occurs. 

2. The following table gives rules of inference for arguments with compound statements.

name                                   argument form
Modus ponens (method of affirming)     $p \to q$;  $p$;  $\therefore q$
Modus tollens (method of denying)      $p \to q$;  $\neg q$;  $\therefore \neg p$
Hypothetical syllogism                 $p \to q$;  $q \to r$;  $\therefore p \to r$
Disjunctive syllogism                  $p \lor q$;  $\neg p$;  $\therefore q$
Disjunctive addition                   $p$;  $\therefore p \lor q$
Dilemma by cases                       $p \lor q$;  $p \to r$;  $q \to r$;  $\therefore r$
Constructive dilemma                   $p \lor r$;  $p \to q$;  $r \to s$;  $\therefore q \lor s$
Destructive dilemma                    $\neg q \lor \neg s$;  $p \to q$;  $r \to s$;  $\therefore \neg p \lor \neg r$
Conjunctive addition                   $p$;  $q$;  $\therefore p \land q$
Conditional proof                      $p$;  $p \land q \to r$;  $\therefore q \to r$
Conjunctive simplification             $p \land q$;  $\therefore p$
Rule of contradiction                  given contradiction $c$;  $\neg p \to c$;  $\therefore p$


3. The following table gives rules of inference for arguments with quantifiers.

name                                      argument form
Universal instantiation                   $(\forall x \in D)\, Q(x)$
                                          $\therefore Q(a)$ (a any particular element of D)
Generalizing from the generic particular  $Q(a)$ (a an arbitrarily chosen element of D)
                                          $\therefore (\forall x \in D)\, Q(x)$
Existential specification                 $(\exists x \in D)\, Q(x)$
                                          $\therefore Q(a)$ (for at least one $a \in D$)
Existential generalization                $Q(a)$ (for at least one element $a \in D$)
                                          $\therefore (\exists x \in D)\, Q(x)$

4. Substituting $R(x) \to S(x)$ in place of $Q(x)$ and z in place of x in generalizing from
the generic particular gives the following inferential rule:

Universal modus ponens:   $R(a) \to S(a)$ for any particular but arbitrarily chosen $a \in D$
                          $\therefore (\forall z \in D)\, [R(z) \to S(z)]$.

5. The rule of generalizing from the generic particular determines the outline of most
mathematical proofs.

6. The rule of existential specification is used in deductive reasoning to give names to
quantities that are known to exist but whose exact values are unknown.

7. A useful strategy for determining whether a statement is true is to first try to prove
it using a variety of approaches and proof methods. If this is unsuccessful, the next
step may be to try to disprove the statement, such as by trying to construct or prove
the existence of a counterexample. If this does not work, the next step is to try to
prove the statement again, and so on. This is one of the many ways in which many
mathematicians attempt to develop new results.
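
For argument forms built from propositional variables, validity can be checked by brute force: a form is valid exactly when no truth assignment makes every premise true and the conclusion false. The sketch below applies this test to modus ponens, to constructive dilemma, and to the invalid form $p \to q$; $q$; $\therefore p$. The names `is_valid` and `implies` are illustrative.

```python
from itertools import product


def is_valid(premises, conclusion, num_vars):
    """True if every assignment satisfying all premises also satisfies the conclusion."""
    for values in product([True, False], repeat=num_vars):
        if all(p(*values) for p in premises) and not conclusion(*values):
            return False  # found true premises with a false conclusion
    return True


def implies(a, b):
    return (not a) or b


modus_ponens = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: p],
    lambda p, q: q, 2)

converse_error = is_valid(
    [lambda p, q: implies(p, q), lambda p, q: q],
    lambda p, q: p, 2)

constructive_dilemma = is_valid(
    [lambda p, q, r, s: p or r,
     lambda p, q, r, s: implies(p, q),
     lambda p, q, r, s: implies(r, s)],
    lambda p, q, r, s: q or s, 4)
```

The first and third checks succeed, while the second fails on the assignment p false, q true. This exhaustive method applies only to forms without quantifiers; the quantifier rules of Fact 3 cannot be verified by a finite truth table in general.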


Examples: 


1. Suppose that D is the set of all objects in the physical universe, P(x) is “x is a
human being”, Q(x) is “x is mortal”, and a is the Greek philosopher Socrates.

argument form                          an argument of that form
$(\forall x \in D)\, [P(x) \to Q(x)]$  $\forall$ objects x, (x is a human being) $\to$ (x is mortal).
                                       (informally: All human beings are mortal.)
$P(a)$ (for particular $a \in D$)      Socrates is a human being.
$\therefore Q(a)$                      $\therefore$ Socrates is mortal.


2. The argument form shown below is invalid: there is an argument of this form (shown
next to it) that has true premises and a false conclusion.

argument form                          an argument of that form
$(\forall x \in D)\, [P(x) \to Q(x)]$  $\forall$ objects x, (x is a human being) $\to$ (x is mortal).
                                       (informally: All human beings are mortal.)
$Q(a)$ (for particular $a \in D$)      My cat Bunbury is mortal.
$\therefore P(a)$                      $\therefore$ My cat Bunbury is a human being.

In this example, D is the set of all objects in the physical universe, P(x) is “x is a
human being”, Q(x) is “x is mortal”, and a is my cat Bunbury.


3. The distributive law for real numbers, $(\forall a, b, c \in \mathcal{R})\, [ac + bc = (a + b)c]$, implies that
$2\sqrt{2} + 3\sqrt{2} = (2 + 3)\sqrt{2}$ (because 2, 3, and $\sqrt{2}$ are particular real numbers).


4. Since 2 is a prime number that is not odd, the rule of existential generalization
implies the truth of the statement “$\exists$ a prime number n such that n is not odd”.


5. To prove that the square of every even integer is even, by the rule of generalizing
from the generic particular, begin by supposing that n is any particular but arbitrarily
chosen even integer. The job of the proof is to deduce that $n^2$ is even.

6. By definition, every even integer equals twice some integer. So if at some stage
of a reasoning process there is a particular even integer n, it follows from the rule of
existential specification that n = 2k for some integer k (even though the numerical
values of n and k may be unknown).


1.5.2 PROOFS 

Definitions: 

A (logical) proof of a statement is a finite sequence of statements (called the steps of
the proof) leading from a set of premises to the given statement. Each step of the proof
must either be a premise or follow from some previous steps by a valid rule of inference.

In a mathematical proof, the set of premises may contain any item of previously
proved or agreed upon mathematical knowledge (definitions, axioms, theorems, etc.) as
well as the specific hypotheses of the statement to be proved.

A direct proof of a statement of the form $p \to q$ is a proof that assumes p to be true
and then shows that q is true.

An indirect proof of a statement of the form $p \to q$ is a proof that assumes that $\neg q$
is true and then shows that $\neg p$ is true. That is, a proof of this form is a direct proof of
the contrapositive $\neg q \to \neg p$.

A proof by contradiction assumes the negation of the statement to be proved and
shows that this leads to a contradiction.

Facts: 

1. A useful strategy to determine if a statement of the form $(\forall x \in D)\, [P(x) \to Q(x)]$
is true or false is to imagine an element $x \in D$ that satisfies P(x) and, using this
assumption (and other facts), investigate whether x must also satisfy Q(x). If the answer
for all such x is “yes”, the given statement is true and the result of the investigation is
a direct proof. If it is possible to find such an $x \in D$ for which Q(x) is false, the statement
is false and this value of x is a counterexample. If the investigation shows that it is not
possible to find such an $x \in D$ for which Q(x) is false, the given statement is true and the
result of the investigation is a proof by contradiction.

2. There are many types of techniques that can be used to prove theorems. Table 1
describes how to approach proofs of various types of statements.

Examples: 

1. In the following direct proof (see Table 1, item 2), the domain D is the set of all
pairs of integers, x is (m, n), and the predicate P(m, n) is “if m and n are even, then
m + n is even”.

Theorem: For all integers m and n, if m and n are even, then m + n is even.
Proof: Suppose m and n are arbitrarily chosen even integers. [m + n must be
shown to be even.]

1. $\therefore$ m = 2r, n = 2s for some integers r and s (by definition of even)

2. $\therefore$ m + n = 2r + 2s (by substitution)

3. $\therefore$ m + n = 2(r + s) (by factoring out the 2)

4. r + s is an integer (it is a sum of two integers)

5. $\therefore$ m + n is even (by definition of even)

The following partial expansion of the proof shows how some of the steps are justified
by rules of inference combined with previous mathematical knowledge:

1. Every even integer equals twice some integer:
[$\forall$ even $x \in Z$ (x = 2y for some $y \in Z$)]
m is a particular even integer.
$\therefore$ m = 2r for some integer r.

3. Every integer is a real number: [$\forall n \in Z$ ($n \in \mathcal{R}$)]
($\forall$ integer n, n is a real number.)
r and s are particular integers.
$\therefore$ r and s are real numbers.

The distributive law holds for real numbers: [$\forall a, b, c \in \mathcal{R}$ (ab + ac = a(b + c))]
2, r, and s are particular real numbers.
$\therefore$ 2r + 2s = 2(r + s).

4. Any sum of two integers is an integer: [$\forall m, n \in Z$ ($m + n \in Z$)]
r and s are particular integers.
$\therefore$ r + s is an integer.

Table 1 Techniques of proof.

statement: $p \to q$
   Direct proof: Assume that p is true. Use rules of inference and previously accepted
   axioms, definitions, theorems, and facts to deduce that q is true.

statement: $(\forall x \in D)\, P(x)$
   Direct proof: Suppose that x is an arbitrary element of D. Use rules of inference and
   previously accepted axioms, definitions, and facts to deduce that P(x) is true.

statement: $(\exists x \in D)\, P(x)$
   Constructive direct proof: Use rules of inference and previously accepted axioms,
   definitions, and facts to actually find an $x \in D$ for which P(x) is true.
   Nonconstructive direct proof: Deduce the existence of x from other mathematical
   facts without a description of how to compute it.

statement: $(\forall x \in D)(\exists y \in E)\, P(x, y)$
   Constructive direct proof: Assume that x is an arbitrary element of D. Use rules of
   inference and previously accepted axioms, definitions, and facts to show the existence
   of a $y \in E$ for which P(x, y) is true, in such a way that y can be computed as a
   function of x.
   Nonconstructive direct proof: Assume x is an arbitrary element of D. Deduce the
   existence of y from other mathematical facts without a description of how to compute it.

statement: $p \to q$
   Proof by cases: Suppose $p \equiv p_1 \lor \cdots \lor p_k$. Prove that each conditional $p_i \to q$ is
   true. The basis for division into cases is the logical equivalence
   $[(p_1 \lor \cdots \lor p_k) \to q] \equiv [(p_1 \to q) \land \cdots \land (p_k \to q)]$.

statement: $p \to q$
   Indirect proof or proof by contraposition: Assume that $\neg q$ is true (that is, assume
   that q is false). Use rules of inference and previously accepted axioms, definitions,
   and facts to show that $\neg p$ is true (that is, p is false).

statement: $p \to q$
   Proof by contradiction: Assume that $p \to q$ is false (that is, assume that p is true
   and q is false). Use rules of inference and previously accepted axioms, definitions,
   and facts to show that a contradiction results. This means that $p \to q$ cannot be
   false, and hence must be true.

statement: $(\exists x \in D)\, P(x)$
   Proof by contradiction: Assume that there is no $x \in D$ for which P(x) is true. Show
   that a contradiction results.

statement: $(\forall x \in D)\, P(x)$
   Proof by contradiction: Assume that there is some $x \in D$ for which P(x) is false.
   Show that a contradiction results.

statement: $p \to (q \lor r)$
   Proof of a disjunction: Prove that one of its logical equivalences $(p \land \neg q) \to r$ or
   $(p \land \neg r) \to q$ is true.

statement: $p_1, \ldots, p_k$ are equivalent
   Proof by cycle of implications: Prove $p_1 \to p_2$, $p_2 \to p_3$, ..., $p_{k-1} \to p_k$, $p_k \to p_1$.
   This is equivalent to proving
   $(p_1 \to p_2) \land (p_2 \to p_3) \land \cdots \land (p_{k-1} \to p_k) \land (p_k \to p_1)$.

5. Any integer that equals twice some integer is even: [$\forall x \in Z$ (if x = 2y for
some $y \in Z$, then x is even.)]
2(r + s) equals twice the integer r + s.
$\therefore$ 2(r + s) is even.

2. A constructive existence proof:

Theorem: Given any integer n, there is an integer m with m > n.

Proof: Suppose that n is an integer. Let m = n + 1. Then m is an integer and
m > n.

The proof is constructive because it established the existence of the desired integer m
by showing that its value can be computed by adding 1 to the value of n.

3. A nonconstructive existence proof:

Theorem: Given a nonnegative integer n, there is always a prime number p
that is greater than n.

Proof: Suppose that n is a nonnegative integer. Consider n! + 1. Then n! + 1
is divisible by some prime number p because every integer greater than 1 is
divisible by a prime number, and n! + 1 > 1. Also, p > n because when n! + 1
is divided by any positive integer less than or equal to n, the remainder is 1
(since any such number is a factor of n!).

The proof is a nonconstructive existence proof because it demonstrated the existence of
the number p, but it offered no computational rule for finding it.

4. A proof by cases:

Theorem: For all odd integers n, the number $n^2 - 1$ is divisible by 8.

Proof: Suppose n is an odd integer. When n is divided by 4, the remainder is
0, 1, 2, or 3. Hence n has one of the four forms 4k, 4k + 1, 4k + 2, or 4k + 3
for some integer k. But n is odd. So $n \ne 4k$ and $n \ne 4k + 2$. Thus either
n = 4k + 1 or n = 4k + 3 for some integer k.

Case 1 [n = 4k + 1 for some integer k]: In this case
$n^2 - 1 = (4k + 1)^2 - 1 = 16k^2 + 8k + 1 - 1 = 16k^2 + 8k = 8(2k^2 + k)$,
which is divisible by 8 because $2k^2 + k$ is an integer.

Case 2 [n = 4k + 3 for some integer k]: In this case
$n^2 - 1 = (4k + 3)^2 - 1 = 16k^2 + 24k + 9 - 1 = 16k^2 + 24k + 8 = 8(2k^2 + 3k + 1)$,
which is divisible by 8 because $2k^2 + 3k + 1$ is an integer.

So in either case $n^2 - 1$ is divisible by 8, and thus the given statement is proved.

5. A proof by contraposition:

Theorem: For all integers n, if $n^2$ is even, then n is even.

Proof: Suppose that n is an integer that is not even. Then when n is divided
by 2 the remainder is 1, or, equivalently, n = 2k + 1 for some integer k. By
substitution, $n^2 = (2k + 1)^2 = 4k^2 + 4k + 1 = 2(2k^2 + 2k) + 1$. It follows that
when $n^2$ is divided by 2 the remainder is 1 (because $2k^2 + 2k$ is an integer).
Thus, $n^2$ is not even.

In this proof by contraposition, a direct proof of the contrapositive “if n is not even,
then $n^2$ is not even” was given.

6. A proof by contradiction:

Theorem: $\sqrt{2}$ is irrational.

Proof: Suppose not; that is, suppose that $\sqrt{2}$ were a rational number. By
definition of rational, there would exist integers a and b such that $\sqrt{2} = a/b$, or,
equivalently, $2b^2 = a^2$. Now the prime factorization of the left-hand side of this
equation contains an odd number of factors and that of the right-hand side
contains an even number of factors (because every prime factor in an integer
occurs twice in the prime factorization of the square of that integer). But this
is impossible because the prime factorization of every integer is unique. This
yields a contradiction, which shows that the original supposition was false.
Hence $\sqrt{2}$ is irrational.

7. A proof by cycle of implications:

Theorem: For all positive integers a and b, the following statements are
equivalent:

(1) a is a divisor of b;

(2) the greatest common divisor of a and b is a;

(3) $\lfloor b/a \rfloor = b/a$.

Proof: Let a and b be positive integers.

(1) $\to$ (2): Suppose that a is a divisor of b. Since a is also a divisor of a, a is a
common divisor of a and b. But no integer greater than a is a divisor of a. So
the greatest common divisor of a and b is a.

(2) $\to$ (3): Suppose that the greatest common divisor of a and b is a. Then
a is a divisor of both a and b, so b = ak for some integer k. Then $b/a = k$, an
integer, and so by definition of floor, $\lfloor b/a \rfloor = k = b/a$.

(3) $\to$ (1): Suppose that $\lfloor b/a \rfloor = b/a$. Let $k = \lfloor b/a \rfloor$. Then $k = \lfloor b/a \rfloor = b/a$ and k
is an integer by definition of floor. Multiplying the outer parts of the equality
by a gives b = ak, so by definition of divisibility, a is a divisor of b.

8. A proof of a disjunction:

Theorem: For all integers a and p, if p is prime, then either p is a divisor of a, 
or a and p have no common factor greater than 1. 

Proof: Suppose a and p are integers and p is prime, but p is not a divisor of a. 
Since p is prime, its only positive divisors are 1 and p. So, since p is not a 
divisor of a, the only possible positive common divisor of a and p is 1. Hence a 
and p have no common divisor greater than 1. 
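
Several of the theorems above can be sanity-checked on small inputs. The sketch below tests Example 4 (odd n), Example 3 (a prime factor of n! + 1), and Example 7 (the three equivalent statements) over small ranges. Such finite checks illustrate the statements but do not prove them; the ranges and the helper name `smallest_prime_factor` are arbitrary choices made for this illustration.

```python
from math import factorial, gcd


def smallest_prime_factor(m):
    """Smallest prime dividing m (for m >= 2), found by trial division."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # m itself is prime


# Example 4: n^2 - 1 is divisible by 8 for every odd n in the range.
odd_ok = all((n * n - 1) % 8 == 0 for n in range(-99, 100, 2))

# Example 3: every prime factor of n! + 1 exceeds n, so in particular the smallest does.
prime_ok = all(smallest_prime_factor(factorial(n) + 1) > n for n in range(0, 12))

# Example 7: (1) a divides b, (2) gcd(a, b) = a, (3) floor(b/a) = b/a always agree.
equiv_ok = all(
    (b % a == 0) == (gcd(a, b) == a) == ((b // a) * a == b)
    for a in range(1, 40) for b in range(1, 40))
```

All three flags evaluate to true on these ranges. A failed check would be a counterexample and hence a disproof, whereas a passed check is only evidence.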


1.5.3 DISPROOFS 

Definitions: 

A disproof of a statement is a proof that the statement is false. 

A counterexample to a statement of the form $(\forall x \in D)\, P(x)$ is an element $b \in D$ for
which P(b) is false.
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Facts: 

1. The method of disproof by counterexample is based on the following fact: 

->[(Vx £ D ) P(x)] <£> (3x £ D) [-iP(x)]. 

2. The following table describes how to give various types of disproofs: 


statement                                   technique of disproof 

$(\forall x \in D)\,P(x)$                   Constructive disproof by counterexample: exhibit a 
                                            specific $a \in D$ for which $P(a)$ is false. 

$(\forall x \in D)\,P(x)$                   Existence disproof: prove the existence of some 
                                            $a \in D$ for which $P(a)$ is false. 

$(\exists x \in D)\,P(x)$                   Prove that there is no $a \in D$ for which $P(a)$ is true. 

$(\forall x \in D)[P(x) \to Q(x)]$          Find an element $a \in D$ with $P(a)$ true and $Q(a)$ false. 

$(\forall x \in D)(\exists y \in E)\,P(x, y)$   Find an element $a \in D$ with $P(a, y)$ false for 
                                            every $y \in E$. 

$(\exists x \in D)(\forall y \in E)\,P(x, y)$   Prove that there is no $a \in D$ for which $P(a, y)$ is 
                                            true for every possible $y \in E$. 


Examples: 

1. The statement $(\forall a, b \in \mathcal{R})[a^2 < b^2 \to a < b]$ is disproved by the following 
counterexample: $a = 2$, $b = -3$. Then $a^2 < b^2$ (because 4 < 9) but $a \not< b$ (because $2 \not< -3$). 

2. The statement “every prime number is odd” is disproved by the following coun- 
terexample: n = 2, since n is prime and not odd. 
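
A constructive disproof of a statement of the form "for all a, b: hypothesis implies conclusion" amounts to a search for a witness. The following Python sketch illustrates this on a finite range of integers; the helper name `find_counterexample` and the search range are assumptions made for the illustration.

```python
from itertools import product

def find_counterexample(domain, hypothesis, conclusion):
    """Return the first pair (a, b) with hypothesis true and conclusion false, or None."""
    for a, b in product(domain, repeat=2):
        if hypothesis(a, b) and not conclusion(a, b):
            return (a, b)
    return None

# Search for a counterexample to: a^2 < b^2 implies a < b.
witness = find_counterexample(
    range(-3, 4),
    lambda a, b: a * a < b * b,
    lambda a, b: a < b,
)
```

Any pair the search returns, like the pair a = 2, b = -3 used in Example 1, makes the hypothesis true and the conclusion false. Finding no pair in a finite range would prove nothing about the full domain.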


1.5.4 MATHEMATICAL INDUCTION 
Definitions: 

The principle of mathematical induction (weak form) is the following rule of 
inference for proving that all the items in a list $x_0, x_1, x_2, \ldots$ have some property $P(x)$: 

    $P(x_0)$ is true                                                        basis premise 
    $(\forall k \ge 0)$ [if $P(x_k)$ is true, then $P(x_{k+1})$ is true]    induction premise 
    $\therefore\ (\forall n \ge 0)$ [$P(x_n)$ is true].                     conclusion 

The antecedent $P(x_k)$ in the induction premise “if $P(x_k)$ is true, then $P(x_{k+1})$ is true” 
is called the induction hypothesis. 

The basis step of a proof by mathematical induction is a proof of the basis premise. 

The induction step of a proof by mathematical induction is a proof of the induction 
premise. 

The principle of mathematical induction (strong form) is the following rule of 
inference for proving that all the items in a list $x_0, x_1, x_2, \ldots$ have some property $P(x)$: 

    $P(x_0)$ is true                                                        basis premise 
    $(\forall k \ge 0)$ [if $P(x_0), P(x_1), \ldots, P(x_k)$ are all 
        true, then $P(x_{k+1})$ is true]                                    (strong) induction premise 
    $\therefore\ (\forall n \ge 0)$ [$P(x_n)$ is true].                     conclusion 

The well-ordering principle for the integers is the following axiom: If S is a 
nonempty set of integers such that every element of S is greater than some fixed integer, 
then S contains a least element. 

Facts: 

1. Typically, the principle of mathematical induction is used to prove that one of the 
following sequences of statements is true: P(0), P(1), P(2), ... or P(1), P(2), P(3), .... 
In these cases the principle of mathematical induction has the form: if P(0) is true and 
$P(n) \to P(n+1)$ is true for all $n \ge 0$, then P(n) is true for all $n \ge 0$; or if P(1) is true 
and $P(n) \to P(n+1)$ is true for all $n \ge 1$, then P(n) is true for all $n \ge 1$. 

2. If the truth of P(n + 1) can be obtained from the previous statement P(n), the weak 
form of the principle of mathematical induction can be used. If the truth of P(n + 1) 
requires the use of one or more statements P(k) for $k \le n$, then the strong form should 
be used. 

3. Mathematical induction can also be used to prove statements that can be phrased 
in the form “For all integers $n \ge k$, P(n) is true”. 

4. Mathematical induction can often be used to prove summation formulas and in- 
equalities. 

5. There are alternative forms of mathematical induction, such as the following: 

• if P(0) and P(1) are true, and if $P(n) \to P(n+2)$ is true for all $n \ge 0$, then P(n) 
  is true for all $n \ge 0$; 

• if P(0) and P(1) are true, and if $[P(n) \wedge P(n+1)] \to P(n+2)$ is true for all 
  $n \ge 0$, then P(n) is true for all $n \ge 0$. 

6. The weak form of the principle of mathematical induction, the strong form of the 
principle of mathematical induction, and the well-ordering principle for the integers are 
all regarded as axioms for the integers. This is because they cannot be derived from the 
usual simpler axioms used in the definition of the integers. (See the Peano definition of 
the natural numbers in §1.2.3.) 

7. The weak form of the principle of mathematical induction, the strong form of the 
principle of mathematical induction, and the well-ordering principle for the integers are 
all equivalent. In other words, each of them can be proved from each of the others. 

8. The earliest recorded use of mathematical induction occurs in 1575 in the book 
Arithmeticorum Libri Duo by Francesco Maurolico, who used the principle to prove 
that the sum of the first n odd positive integers is $n^2$. 

Examples: 

1. A proof using the weak form of mathematical induction: (In this proof, $x_0, x_1, x_2, \ldots$ 
is the sequence 1, 2, 3, ..., and the property $P(x_n)$ is the equation 
$1 + 2 + \cdots + n = \frac{n(n+1)}{2}$.) 

Theorem: For all integers $n \ge 1$, $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$. 

Proof: 

Basis Step: For $n = 1$ the left-hand side of the formula is 1, and the 
right-hand side is $\frac{1(1+1)}{2}$, which is also equal to 1. Hence P(1) is true. 

Induction Step: Let k be an integer, $k \ge 1$, and suppose that P(k) is true. 
That is, suppose that $1 + 2 + \cdots + k = \frac{k(k+1)}{2}$ (the induction hypothesis) is true. 
It must be shown that P(k + 1) is true: $1 + 2 + \cdots + (k+1) = \frac{(k+1)((k+1)+1)}{2}$, 
or, equivalently, that $1 + 2 + \cdots + (k+1) = \frac{(k+1)(k+2)}{2}$. But, by substitution 
from the induction hypothesis, 

    $1 + 2 + \cdots + (k+1) = (1 + 2 + \cdots + k) + (k+1)$ 
    $\qquad = \frac{k(k+1)}{2} + (k+1)$ 
    $\qquad = \frac{(k+1)(k+2)}{2}$. 

Thus, $1 + 2 + \cdots + (k+1) = \frac{(k+1)(k+2)}{2}$ is true. 

2. A proof using the weak form of mathematical induction: 

Theorem: For all integers $n \ge 4$, $2^n < n!$. 

Proof: 

Basis Step: For $n = 4$, $2^4 < 4!$ is true since 16 < 24. 

Induction Step: Let k be an integer, $k \ge 4$, and suppose that $2^k < k!$ is 
true. The following shows that $2^{k+1} < (k+1)!$ must also be true: 

    $2^{k+1} = 2 \cdot 2^k < 2 \cdot k! < (k+1)\,k! = (k+1)!$. 

3. A proof using the weak form of mathematical induction: 

Theorem: For all integers $n \ge 8$, n cents in postage can be made using only 
3-cent and 5-cent stamps. 

Proof: Let P(n) be the predicate “n cents postage can be made using only 
3-cent and 5-cent stamps”. 

Basis Step: P(8) is true since 8 cents in postage can be made using one 
3-cent stamp and one 5-cent stamp. 

Induction Step: Let k be an integer, $k \ge 8$, and suppose that P(k) is true. 
The following shows that P(k + 1) must also be true. If the pile of stamps 
for k cents postage has in it any 5-cent stamps, then remove one 5-cent stamp 
and replace it with two 3-cent stamps. If the pile for k cents postage has only 
3-cent stamps, there must be at least three 3-cent stamps in the pile (since 
$k \ne 3$ or 6). Remove three 3-cent stamps and replace them with two 5-cent 
stamps. In either case, a pile of stamps for k + 1 cents postage results. 
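
The induction step of the postage proof is itself an algorithm: it turns a pile for k cents into a pile for k + 1 cents. The Python sketch below replays the basis step and then applies that step repeatedly; the function name `stamps` and the returned pair `(threes, fives)` are illustrative choices.

```python
def stamps(n: int) -> tuple[int, int]:
    """Return (threes, fives) with 3*threes + 5*fives == n, for any integer n >= 8."""
    if n < 8:
        raise ValueError("the theorem covers only n >= 8")
    threes, fives = 1, 1            # basis step: 8 = 3 + 5
    for _ in range(8, n):           # each pass turns a pile for k into a pile for k + 1
        if fives > 0:
            fives -= 1              # remove one 5-cent stamp ...
            threes += 2             # ... and add two 3-cent stamps
        else:
            threes -= 3             # only 3-cent stamps: remove three of them ...
            fives += 2              # ... and add two 5-cent stamps
    return threes, fives
```

The `else` branch never drives `threes` below zero, for the reason given in the proof: a pile of only 3-cent stamps worth at least 8 cents contains at least three stamps.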

4. A proof using an alternative form of mathematical induction (Fact 5): 

Theorem: For all integers $n \ge 0$, $F_n < 2^n$. ($F_k$ are Fibonacci numbers. See 
§3.1.2.) 

Proof: Let P(n) be the predicate “$F_n < 2^n$”. 

Basis Step: P(0) and P(1) are both true since $F_0 = 0 < 1 = 2^0$ and 
$F_1 = 1 < 2 = 2^1$. 

Induction Step: Let k be an integer, $k \ge 0$, and suppose that P(k) and 
P(k + 1) are true. Then P(k + 2) is also true: 

    $F_{k+2} = F_k + F_{k+1} < 2^k + 2^{k+1} < 2^{k+1} + 2^{k+1} = 2 \cdot 2^{k+1} = 2^{k+2}$. 

5. A proof using the strong form of mathematical induction: 

Theorem: Every integer $n \ge 2$ is divisible by some prime number. 

Proof: Let P(n) be the sentence “n is divisible by some prime number”. 

Basis Step: Since 2 is divisible by 2 and 2 is a prime number, P(2) is true. 

Induction Step: Let k be an integer with $k > 2$, and suppose that P(i) (the 
induction hypothesis) is true for all integers i with $2 \le i < k$. That is, suppose 
for all integers i with $2 \le i < k$ that i is divisible by a prime number. (It must 
now be shown that k is divisible by a prime number.) 

Now either the number k is prime or k is not prime. If k is prime, then k 
is divisible by a prime number, namely itself. If k is not prime, then $k = a \cdot b$ 
where a and b are integers, with $2 \le a < k$ and $2 \le b < k$. By the induction 
hypothesis, the number a is divisible by a prime number p, and so k = ab is 
also divisible by that prime p. Hence, regardless of whether k is prime or not, 
k is divisible by a prime number. 
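
The case analysis in this strong-induction proof can be read as a recursive procedure: either k has no proper factor and is its own prime divisor, or k = a · b and the question is passed to the smaller number a. A minimal Python sketch follows; the name `prime_divisor` and the trial-division search for a factor are assumptions of the illustration, not part of the proof.

```python
from math import isqrt

def prime_divisor(k: int) -> int:
    """Return a prime number that divides k, for any integer k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    for a in range(2, isqrt(k) + 1):
        if k % a == 0:
            # k = a * b with 2 <= a < k, so the induction hypothesis applies to a
            return prime_divisor(a)
    return k  # no factor a with 2 <= a < k exists, so k itself is prime
```

The recursion terminates because every recursive call receives a strictly smaller argument that is still at least 2, which mirrors the range $2 \le i < k$ of the induction hypothesis.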

6. A proof using the well-ordering principle: 

Theorem: Every integer $n \ge 2$ is divisible by some prime number. 

Proof: Suppose, to the contrary, that there exists an integer $n \ge 2$ that is 
divisible by no prime number. Thus, the set S of all integers $\ge 2$ that are 
divisible by no prime number is nonempty. Of course, no number in S is 
prime, since every number is divisible by itself. 

By the well-ordering principle for the integers, the set S contains a least 
element k. Since k is not prime, there must exist integers a and b with $2 \le a < k$ 
and $2 \le b < k$, such that $k = a \cdot b$. Moreover, since k is the least element of 
the set S and since both a and b are smaller than k, it follows that neither a nor 
b is in S. Hence, the number a (in particular) must be divisible by some prime 
number p. But then, since a is a factor of k, the number k is also divisible by 
p, which contradicts the fact that k is in S. This contradiction shows that the 
original supposition is false, or, in other words, that the theorem is true. 

7. A proof using the well-ordering principle: 

Theorem: Every decreasing sequence of nonnegative integers is finite. 

Proof: Suppose $a_1, a_2, \ldots$ is a decreasing sequence of nonnegative integers: 
$a_1 > a_2 > \cdots$. By the well-ordering principle, the set $\{a_1, a_2, \ldots\}$ contains a 
least element, $a_n$. This number must be the last in the sequence (and hence 
the sequence is finite). If $a_n$ is not the last term, then $a_{n+1} < a_n$, which 
contradicts the fact that $a_n$ is the smallest element. 


1.5.5 DIAGONALIZATION ARGUMENTS 
Definition: 

The diagonal of an infinite list of sequences $S_1, S_2, S_3, \ldots$ is the infinite sequence whose 
jth element is the jth entry of sequence $S_j$. 

A diagonalization proof is any proof that involves the diagonal of a list of sequences, 
or something analogous to this. 

Facts: 

1. A diagonalization argument can be used to prove the existence of nonrecursive func- 
tions. 

2. A diagonalization argument can be used to prove that no computer algorithm can 
ever be developed to determine whether an arbitrary computer program given as input 
with a given set of data will terminate (the Turing Halting Problem). 

3. A diagonalization argument can be used to prove that every mathematical theory 
(under certain reasonable hypotheses) will contain statements whose truth or falsity is 
impossible to determine within the theory (Gödel’s Incompleteness Theorem). 

Example: 

1. A diagonalization proof: 

Theorem: The set of real numbers between 0 and 1 is uncountable. (Georg 
Cantor, 1845-1918) 

Proof: Suppose, to the contrary, that the set of real numbers between 0 and 1 
is countable. The decimal representations of these numbers can be written in 
a list as follows: 

    $0.a_{11}a_{12}a_{13} \ldots a_{1n} \ldots$ 
    $0.a_{21}a_{22}a_{23} \ldots a_{2n} \ldots$ 
    $0.a_{31}a_{32}a_{33} \ldots a_{3n} \ldots$ 
    $\vdots$ 
    $0.a_{n1}a_{n2}a_{n3} \ldots a_{nn} \ldots$ 
    $\vdots$ 

From this list, construct a new decimal number $0.b_1b_2b_3 \ldots b_n \ldots$ by specifying 
that 

    $b_i = \begin{cases} 5 & \text{if } a_{ii} \ne 5 \\ 6 & \text{if } a_{ii} = 5. \end{cases}$ 

For each integer $i \ge 1$, $0.b_1b_2b_3 \ldots b_n \ldots$ differs from the ith number in the 
list in the ith decimal place, and hence $0.b_1b_2b_3 \ldots b_n \ldots$ is not in the list. 
Consequently, no such listing of all real numbers between 0 and 1 is possible, 
and hence, the set of real numbers between 0 and 1 is uncountable. 
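
The construction of the new decimal can be tried out on a finite prefix of such a list. The Python sketch below takes n digit sequences, each of length at least n, and builds the first n digits of the new number by the 5-or-6 rule of the proof; the function name `diagonal_escape` and the sample rows are illustrative assumptions.

```python
def diagonal_escape(rows: list[list[int]]) -> list[int]:
    """Return digits b with b[i] != rows[i][i] for every row index i."""
    return [5 if row[i] != 5 else 6 for i, row in enumerate(rows)]

# Four sample expansions, truncated to four decimal digits each.
rows = [
    [1, 4, 1, 5],
    [5, 5, 5, 5],
    [0, 0, 5, 0],
    [9, 9, 9, 9],
]
b = diagonal_escape(rows)
```

Each digit `b[i]` differs from the diagonal entry `rows[i][i]`, so the constructed sequence cannot coincide with any listed row. The finite version only illustrates the mechanism; the uncountability argument needs the infinite list.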


1.6 AXIOMATIC PROGRAM VERIFICATION 


Axiomatic program verification is used to prove that a sequence of programming instruc- 
tions achieves its specified objective. Semantic axioms for the programming language 
constructs are used in a formal logic argument as rules of inference. Comments called 
assertions, within the sequence of instructions, provide the main details of the argument. 
The presently high expense of creating verified software can be justified for code that 
is frequently reused, where the financial benefit is otherwise adequately large, or where 
human life is concerned, for instance, in airline traffic control. This section presents a 
representative sample of axioms for typical programming language constructs. 


1.6.1 ASSERTIONS AND SEMANTIC AXIOMS 

The correctness of a program can be argued formally based on a set of semantic axioms 
that define the behavior of individual programming language constructs [Fl67], [Ho69], 
[Ap81]. (Some alternative proofs of correctness use denotational semantics [St77], [Sc86] 
or operational semantics [We72].) In addition, it is possible to synthesize code, using 
techniques that permit the axioms to guide the selection of appropriate instructions 
[Di76], [Gr81]. Code specifications and intermediate conditions are expressed in the 
form of program assertions. 

Definitions: 

An assertion is a program comment containing a logical statement that constrains the 
values of the computational variables. These constraints are expected to hold when 
execution flow reaches the location of the assertion. 

A semantic axiom for a type of programming instruction is a rule of inference that 
prescribes the change of value of the variables of computation caused by the execution 
of that type of instruction. 

The assertion false represents an inconsistent set of logical conditions. A computer 
program cannot meet such a specification. 

Given two constraints A and B on computational variables, a statement that B follows 
from A purely for reasons of logic and/or mathematics is called a logical implication. 

The postcondition for an instruction or program fragment is the assertion that imme- 
diately follows it in the program. 

The precondition for an instruction or program fragment is the assertion that imme- 
diately precedes it in the program. 

The assertion true represents the empty set of logical conditions. 

Notation: 

1. To say that whenever the precondition {Apre} holds, the execution of a program 
fragment called “Code” will cause the postcondition {Apost} to hold, the following 
notation styles can be used: 

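• Horizontal notation: {Apre} Code {Apost} 

• Vertical notation: 

      {Apre} 
      Code 
      {Apost} 

• Flowgraph notation: 

      [Flowgraph: the assertion Apre annotates the flow line entering the box Code, 
      and the assertion Apost annotates the flow line leaving it.] 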

2. Curly braces { ... } enclose assertions in generic program code. They do not denote 
a set. 

3. Semantic axioms have a finite list of premises and a conclusion. They are represented 
in the following format: 

    {Premise 1} 
        ⋮ 
    {Premise n} 
    ------------- 
    {Conclusion} 

4. The circumstance that A logically implies B is denoted A ⇒ B. 


1.6.2 NOP, ASSIGNMENT, AND SEQUENCING AXIOMS 

Formal axioms of pure mathematical consequence (no operation, from a computational 
perspective) and of straight-line sequential flow are used as auxiliaries to verify correct- 
ness, even of sequences of simple assignment statements. 

Definitions: 

A NOP (“no-op”) is a (possibly empty) program fragment whose execution does not 
alter the state of any computational variables or the sequence of flow. 

The Axiom of NOP states: 

    {Apre} ⇒ {Apost}              Premise 1 
    ------------------------- 
    {Apre} NOP {Apost}            Conclusion 

Note: The Axiom of NOP is frequently applied to empty program fragments in order 
to facilitate a clear logical argument. 

An assignment instruction X := E; means that the variable X is to be assigned the 
value of the expression E. 

In a logical assertion A(X) with possible instances of the program variable X, the 
result of replacing each instance of X in A by the program expression E is denoted 
A(X ← E). 

The Axiom of Assignment states: 

    {true}                            No premises 
    ------------------------------ 
    {A(X ← E)} X := E; {A(X)}         Conclusion 

The following Axiom of Sequence provides that two consecutive instructions in the 
program code are executed one immediately after the other: 

    {Apre} Code1 {Amid}               Premise 1 
    {Amid} Code2 {Apost}              Premise 2 
    ------------------------------ 
    {Apre} Code1, Code2 {Apost}       Conclusion 

(Commas are used as separators in program code.) 

Examples: 

1. Example of NOP: Suppose that X is a numeric program variable. 

    {X = 3} ⇒ {X > 0}                 mathematical fact 
    {X = 3} NOP {X > 0}               by Axiom of NOP 

2. Suppose that X and Y are integer-type program variables. The Axiom of Assignment 
alone implies correctness of all the following examples: 

(a) {X = 4} X := X * 2; {X = 8} 

A(X) is {X = 8}; E is X * 2; A(X ← E) is {X * 2 = 8}, which is equivalent to {X = 4}. 

(b) {true} X := 2; {X = 2} 

A(X) is {X = 2}; E is 2; A(X ← E) is {2 = 2}, which is equivalent to {true}. 

(c) {(-9 < X) ∧ (X < 0)} Y := X; {(-9 < Y) ∧ (Y < 0)} 

A(Y) is {(-9 < Y) ∧ (Y < 0)}; E is X; A(Y ← E) is {(-9 < X) ∧ (X < 0)}. 

(d) {Y = 1} X := 0; {Y = 1} 

A(X) is {Y = 1}; E is 0; A(X ← E) is {Y = 1}. 

(e) {false} X := 8; {X = 2} 

A(X) is {X = 2}; E is 8; A(X ← E) is {8 = 2}, which is equivalent to {false}. 


3. Examples of sequence: 

(a) {X = 1} X := X + 1; {X > 0} 

    i.   {X = 1} ⇒ {X > -1}                        mathematics 
    ii.  {X = 1} NOP {X > -1}                      Axiom of NOP 
    iii. {X > -1} X := X + 1; {X > 0}              Axiom of Assignment 
    iv.  {X = 1} NOP, X := X + 1; {X > 0}          Axiom of Sequence on ii, iii 
    v.   {X = 1} X := X + 1; {X > 0}               definition of NOP. 

(b) {Y = a ∧ X = b} Z := Y; Y := X; X := Z; {X = a ∧ Y = b} 

    i.   {Y = a ∧ X = b} Z := Y; {Z = a ∧ X = b}                    Axiom of Assignment 
    ii.  {Z = a ∧ X = b} Y := X; {Z = a ∧ Y = b}                    Axiom of Assignment 
    iii. {Y = a ∧ X = b} Z := Y, Y := X; {Z = a ∧ Y = b}            Axiom of Sequence on i, ii 
    iv.  {Z = a ∧ Y = b} X := Z; {X = a ∧ Y = b}                    Axiom of Assignment 
    v.   {Y = a ∧ X = b} Z := Y, Y := X, X := Z; {X = a ∧ Y = b}    Axiom of Sequence on iii, iv. 
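
The Axiom of Assignment and the Axiom of Sequence can be exercised mechanically. In the Python sketch below a computational state is a dictionary of variable values, an assertion is a Boolean function of the state, and A(X ← E) is obtained by evaluating the postcondition in the state that the assignment would produce. All names (`assign`, `substitute`, `sequence`) are hypothetical helpers introduced for this illustration; they are not part of any verification tool.

```python
from typing import Callable, Dict

State = Dict[str, int]
Assertion = Callable[[State], bool]
Expr = Callable[[State], int]
Instruction = Callable[[State], State]

def assign(var: str, expr: Expr) -> Instruction:
    """The instruction `var := expr` as a function from states to states."""
    def run(state: State) -> State:
        new_state = dict(state)
        new_state[var] = expr(state)
        return new_state
    return run

def substitute(post: Assertion, var: str, expr: Expr) -> Assertion:
    """A(X <- E): the assertion `post` with `var` replaced by the value of `expr`."""
    return lambda state: post({**state, var: expr(state)})

def sequence(*instructions: Instruction) -> Instruction:
    """Run the given instructions one immediately after the other."""
    def run(state: State) -> State:
        for instruction in instructions:
            state = instruction(state)
        return state
    return run

# Example 2(a): the precondition computed from {X = 8} and X := X * 2 is {X = 4}.
pre_a = substitute(lambda s: s["X"] == 8, "X", lambda s: s["X"] * 2)

# A three-assignment swap in the style of Example 3(b): Z := Y, Y := X, X := Z.
swap = sequence(
    assign("Z", lambda s: s["Y"]),
    assign("Y", lambda s: s["X"]),
    assign("X", lambda s: s["Z"]),
)
```

Testing `pre_a` on a range of states shows that it holds exactly when X = 4, and running `swap` from a state with Y = a and X = b ends in a state with X = a and Y = b. Such tests only sample states; the axioms establish the result for every state.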


1.6.3 AXIOMS FOR CONDITIONAL EXECUTION CONSTRUCTS 
Definitions: 

A conditional assignment construct is any type of program instruction containing 
a logical condition and an imperative clause such that the imperative clause is to be 
executed if and only if the logical condition is true. Some types of conditional assignment 
contain more than one logical condition and more than one imperative clause. 

An if-then instruction if IfCond then ThenCode has one logical condition (which 
follows the keyword if) and one imperative clause (which follows the keyword then). 

The Axiom of If-then states: 

    {Apre ∧ IfCond} ThenCode {Apost}                Premise 1 
    {Apre ∧ ¬IfCond} ⇒ {Apost}                      Premise 2 
    ------------------------------------------- 
    {Apre} if IfCond then ThenCode {Apost}          Conclusion 



An if-then-else instruction if IfCond then ThenCode else ElseCode has one 
logical condition, which follows the keyword if, and two imperative clauses, one after 
the keyword then, and the other after the keyword else. 


© 2000 by CRC Press LLC 




The Axiom of If-then-else states: 

    {Apre ∧ IfCond} ThenCode {Apost}                              Premise 1 
    {Apre ∧ ¬IfCond} ElseCode {Apost}                             Premise 2 
    ----------------------------------------------------------- 
    {Apre} if IfCond then ThenCode else ElseCode {Apost}          Conclusion 



Examples: 


1. If-then: 

    {true} if X = 3 then Y := X; {X = 3 → Y = 3} 

    i.   {X = 3} Y := X; {X = 3 ∧ Y = 3}                     Axiom of Assignment 
    ii.  {X = 3 ∧ Y = 3} NOP {(X = 3) → (Y = 3)}             Axiom of NOP 
         (Step ii uses a logic fact: p ∧ q ⇒ p → q) 
    iii. {X = 3} Y := X; {X = 3 → Y = 3}                     Axiom of Sequence on i, ii 
         (Step iii establishes Premise 1 for Ax. of If-then) 
    iv.  {¬(X = 3)} ⇒ {X = 3 → Y = 3}                        Logic fact 
         (Step iv establishes Premise 2 for Ax. of If-then) 
    v.   {true} if X = 3 then Y := X; {X = 3 → Y = 3}        Axiom of If-then on iii, iv. 


2. If-then-else: 

    {X > 0} 
    if (X > Y) then M := X; else M := Y; 
    {(X > 0) ∧ (X > Y → M = X) ∧ (X ≤ Y → M = Y)} 

i. {X > 0 ∧ X > Y} M := X; {X > 0 ∧ (X > Y → M = X) ∧ (X ≤ Y → M = Y)} 

by Axiom of Assignment and Axiom of NOP (establishes Premise 1) 

ii. {X > 0 ∧ ¬(X > Y)} M := Y; {X > 0 ∧ (X > Y → M = X) ∧ (X ≤ Y → M = Y)} 

by Axiom of Assignment and Axiom of NOP (establishes Premise 2) 

iii. Conclusion now follows from Axiom of If-then-else. 
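
The if-then-else example can be spot-checked by running the instruction from many states that satisfy the precondition and evaluating the postcondition afterwards. The Python sketch below does this; the function names and the tested ranges are illustrative, and the second implication is taken with X ≤ Y, the negation of the if-condition.

```python
def if_then_else_example(x: int, y: int) -> int:
    """if (X > Y) then M := X; else M := Y;  returns the final value of M."""
    if x > y:
        m = x
    else:
        m = y
    return m

def implies(p: bool, q: bool) -> bool:
    """Material implication p -> q."""
    return (not p) or q

def postcondition(x: int, y: int, m: int) -> bool:
    """(X > 0) and (X > Y -> M = X) and (X <= Y -> M = Y)."""
    return x > 0 and implies(x > y, m == x) and implies(x <= y, m == y)
```

For every tested pair with X > 0, `postcondition(x, y, if_then_else_example(x, y))` evaluates to true. The two premises of the Axiom of If-then-else correspond to the two branches of the Python `if`.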


1.6.4 AXIOMS FOR LOOP CONSTRUCTS 
Definitions: 

A while-loop instruction while WhileCond do LoopBody has one logical condition, 
called the while-condition, which follows the keyword while, and a sequence of 
instructions called the loop-body. At the outset of execution, the while condition is 
tested for its truth value. If it is true, then the loop body is executed. This two-step 
process of test and execute continues until the while condition becomes false, after which 
the flow of execution passes to whatever program instruction follows the while-loop. 

A loop is weakly correct if whenever the precondition is satisfied at the outset of 
execution and the loop is executed to termination, the resulting computational state 
satisfies the postcondition. 

A loop is strongly correct if it is weakly correct and if whenever the precondition is 
satisfied at the outset of execution, the computation terminates. 

The Axiom of While defines weak correctness of a while-loop (i.e., the axiom ignores 
the possibility of an infinite loop) in terms of a logical condition called the loop 
invariant, denoted “LoopInv”, satisfying the following conditions: 

    {Apre} ⇒ {LoopInv}                                        “Initialization” Premise 
    {LoopInv ∧ WhileCond} LoopBody {LoopInv}                  “Preservation” Premise 
    {LoopInv ∧ ¬WhileCond} ⇒ {Apost}                          “Finalization” Premise 
    ------------------------------------------------------- 
    {Apre} while {LoopInv} WhileCond do LoopBody {Apost}      Conclusion 



Example: 

1. Suppose that J, N, and P are integer-type program variables. 

    {Apre : J = 0 ∧ P = 1 ∧ N ≥ 0} 
    while {LoopInv : P = 2^J ∧ J ≤ N} (J < N) do 
        P := P * 2; 
        J := J + 1; 
    endwhile 
    {Apost : P = 2^N} 

i. {Apre : J = 0 ∧ P = 1 ∧ N ≥ 0} ⇒ {LoopInv : P = 2^J ∧ J ≤ N} 

Initialization Premise trivially true by mathematics 

ii. {LoopInv ∧ WhileCond : (P = 2^J ∧ J ≤ N) ∧ (J < N)} 
        P := P * 2; 
        J := J + 1; 
    {LoopInv : P = 2^J ∧ J ≤ N} 

Preservation Premise proved using Axiom of Assignment twice 
and Axiom of Sequence 

iii. {LoopInv ∧ ¬WhileCond : (P = 2^J ∧ J ≤ N) ∧ ¬(J < N)} ⇒ {Apost : P = 2^N} 

Finalization Premise provable by mathematics 

iv. Conclusion now follows from Axiom of While. 
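
The three premises of the Axiom of While can be turned into run-time checks. The Python sketch below executes the loop of this example and asserts the invariant P = 2^J ∧ J ≤ N at the points where the premises require it. The quantity N − J is an assumed termination measure added for illustration: it is a nonnegative integer that strictly decreases on every pass, which is the kind of fact an inductive termination argument relies on.

```python
def power_of_two(n: int) -> int:
    """Compute 2**n with the loop of the example, checking each assertion at run time."""
    assert n >= 0                              # Apre requires N >= 0
    j, p = 0, 1                                # Apre: J = 0 and P = 1
    assert p == 2 ** j and j <= n              # Initialization: Apre implies LoopInv
    while j < n:                               # WhileCond
        variant = n - j                        # assumed termination measure
        p = p * 2
        j = j + 1
        assert p == 2 ** j and j <= n          # Preservation: LoopInv holds again
        assert 0 <= n - j < variant            # measure decreased and is still >= 0
    assert p == 2 ** n                         # Finalization: LoopInv and not (J < N)
    return p
```

If any assertion failed for some input, the corresponding premise would be false for the chosen invariant; passing runs do not replace the proof but help to catch a badly chosen invariant early.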

Fact: 

1. Proof of termination of a loop is usually achieved by mathematical induction. 


1.6.5 AXIOMS FOR SUBPROGRAM CONSTRUCTS 


The parameterless procedure is the simplest subprogram construct. Procedures with 
parameters and functional subprograms have somewhat more complicated semantic ax- 
ioms. 

Definitions: 

A procedure is a sequence of instructions that lies outside the main sequence of 
instructions in a program. It consists of a procedure name, followed by a procedure 
body. 

A call instruction call ProcName is executed by transferring control to the first 
executable instruction of the procedure ProcName. 

A return instruction causes a procedure to transfer control to the executable instruction 
immediately following the most recently executed call to that procedure. An implicit 
return is executed after the last instruction in the procedure body is executed. It is 
good programming style to put a return there. 

In the following Axiom of Procedure (parameterless), Apre and Apost are the 
precondition and postcondition of the instruction call ProcName; ProcPre and ProcPost 
are the precondition and postcondition of the procedure whose name is ProcName. 

    {Apre} ⇒ {ProcPre}                     “Call” Premise 
    {ProcPre} ProcBody {ProcPost}          “Body” Premise 
    {ProcPost} ⇒ {Apost}                   “Return” Premise 
    ------------------------------------ 
    {Apre} call ProcName; {Apost}          Conclusion 



1.7 LOGIC-BASED COMPUTER PROGRAMMING PARADIGMS 

Mathematical logic is the basis for several different computer software paradigms. These 
include logic programming, fuzzy reasoning, production systems, artificial intelligence, 
and expert systems. 


1.7.1 LOGIC PROGRAMMING 

A computer program in the imperative paradigm (familiar in languages like C, BASIC, 
FORTRAN, and ALGOL) is a list of instructions that describes a precise sequence 
of actions that a computer should perform. To initiate a computation, one supplies 
the iterative program plus specific input data to the computer. Logic programming 
provides an alternative paradigm in which a program is a list of “clauses”, written in 
predicate logic, that describe an allowed range of behavior for the computer. To initiate 
a computation, the computer is supplied with the logic program plus another clause 
called a “goal”. The aim of the computation is to establish that the goal is a logical 
consequence of the clauses constituting the logic program. The computer simplifies the 
goal by executing the program repeatedly until the goal becomes empty, or until it 
cannot be further simplified. 

Definitions: 

A term in a domain S is either a fixed element of S or an S-valued variable. 

An n-ary predicate on a set S is a function $P: S^n \to \{T, F\}$. 

An atomic formula (or atom) is an expression of the form $P(t_1, \ldots, t_n)$, where $n \ge 0$, 
P is an n-ary predicate, and $t_1, \ldots, t_n$ are terms. 

A formula is a logical expression constructed from atoms with conjunctions, disjunc- 
tions, and negations, possibly with some logical quantifiers. 

A substitution for a formula is a finite set of the form $\{v_1/t_1, \ldots, v_n/t_n\}$, where each 
$v_i$ is a distinct variable, and each $t_i$ is a term distinct from $v_i$. 

The instance of a formula $\psi$ using the substitution $\theta = \{v_1/t_1, \ldots, v_n/t_n\}$ is the formula 
obtained from $\psi$ by simultaneously replacing each occurrence of the variable $v_i$ in $\psi$ by 
the term $t_i$. The resulting formula is denoted by $\psi\theta$. 

A closed formula in logic programming is a program without any free variables. 

A ground formula is a formula without any variables at all. 

A clause is a formula of the form $\forall x_1 \ldots \forall x_s (A_1 \vee \cdots \vee A_n \leftarrow B_1 \wedge \cdots \wedge B_m)$ with no 
free variables, where $s, n, m \ge 0$, and A’s and B’s are atoms. In logic programming, 
such a clause may be denoted by $A_1, \ldots, A_n \leftarrow B_1, \ldots, B_m$. 

The head of a clause $A_1, \ldots, A_n \leftarrow B_1, \ldots, B_m$ is the sequence $A_1, \ldots, A_n$. 

The body of a clause $A_1, \ldots, A_n \leftarrow B_1, \ldots, B_m$ is the sequence $B_1, \ldots, B_m$. 

A definite clause is a clause of the form $A \leftarrow B_1, \ldots, B_m$ or $\leftarrow B_1, \ldots, B_m$, which 
contains at most one atom in its head. 

An indefinite clause is a clause that is not definite. 

A logic program is a finite sequence of definite clauses. 

A goal is a definite clause $\leftarrow B_1, \ldots, B_m$ whose head is empty. (Prescribing a goal for 
a logic program P tells the computer to derive an instance of that goal by manipulating 
the logical clauses in P.) 

An answer to a goal G for a logic program P is a substitution $\theta$ such that $G\theta$ is a 
logical consequence of P. 

A definite answer to a goal G for a logic program P is an answer in which every 
variable is substituted by a constant. 

Facts: 

1. A definite clause $A \leftarrow B_1, \ldots, B_m$ represents the following logical constructs: 

If every $B_i$ is true, then A is also true; 

Statement A can be proved by proving every $B_i$. 





2. Definite answer property: If a goal G for a logic program P has an answer, then it 
has a definite answer. 

3. The definite answer property does not hold for indefinite clauses. For example, 
although $G = \exists x\, Q(x)$ is a logical consequence of $P = \{Q(a), Q(b) \leftarrow\}$, no ground 
instance of G is a logical consequence of P. 

4. Logic programming is Turing-complete (§16.3); i.e., any computable function can 
be represented using a logic program. 

5. Building on the work of logician J. Alan Robinson in 1965, computer scientists 
Robert Kowalski and Alain Colmerauer of Imperial College and the University of Mar- 
seille-Aix, respectively, in 1972 independently developed the programming language 
PROLOG (PROgramming in LOGic) based on a special subset of predicate logic. 

6. The first PROLOG interpreter was implemented in ALGOL-W in 1972 at the Uni- 
versity of Marseille-Aix. Since then, several variants of PROLOG have been introduced, 
implemented, and used in practical applications. The basic paradigm behind all these 
languages is called Logic Programming. 

7. In PROLOG, the relation “is” means equality. 

Examples: 

1. The following three clauses are definite: 

    P ← Q, R        P ←        ← Q, R. 

2. The clause P, S ← Q, R is indefinite. 

3. The substitution {X/a, Y/b} for the atom P(X, Y, Z) yields the instance P(a, b, Z). 

4. The goal ← P to the program {P ←} has a single answer, given by the empty 
substitution. This means the goal can be achieved. 

5. The goal ← P to the program {Q ←} has no answer. This means it cannot be 
derived from that program. 

6. The logic program consisting of the following two definite clauses P1 and P2 
putes a complete list of the pairs of vertices in an arbitrary graph that have a path 
joining them: 

P1. path(V, V) ← 

P2. path(U, V) ← path(U, W), edge(W, V) 

Definite clauses P3 and P4 comprise a representation of a graph with nodes 1, 2, and 
3, and edges (1,2) and (2,3): 

P3. edge(1, 2) ← 
P4. edge(2, 3) ← 

The goal G represents a query asking for a complete list of the pairs of vertices in an 
arbitrary graph that have a path joining them: 

G. ← path(Y, Z) 

There are three distinct answers of the goal G to the logic program consisting of 
definite clauses P1 to P4, corresponding to the paths (1,2), (1,2,3), and (2,3), 
respectively: 

A1. {Y/1, Z/2} 
A2. {Y/1, Z/3} 
A3. {Y/2, Z/3} 
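
For comparison with the logic program, the same set of answers can be computed imperatively by repeating the recursive path rule until nothing new is derived. The Python sketch below is an illustration only: the function name `path_pairs` is hypothetical, and it assumes that only paths with at least one edge are wanted, which is what the three listed answers correspond to.

```python
def path_pairs(edges: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Pairs (u, v) joined by a directed path of one or more edges."""
    paths = set(edges)                  # assumed base case: every edge is a path
    changed = True
    while changed:                      # repeat until no new pair can be derived
        changed = False
        # recursive rule: path(U, V) <- path(U, W), edge(W, V)
        for (u, w) in list(paths):
            for (w2, v) in edges:
                if w == w2 and (u, v) not in paths:
                    paths.add((u, v))
                    changed = True
    return paths
```

For the graph with edges (1,2) and (2,3), the derived pairs match the substitutions for Y and Z in the three answers. The contrast in style is the point: the logic program states only the two rules, while the imperative version must also spell out the iteration strategy.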

7. The following logic program computes the Fibonacci sequence 0, 1, 1, 2, 3, 5, 8, 13, ..., 
where the predicate fib(N, X) is true if X is the Nth number in the Fibonacci sequence: 

    fib(0, 0) ← 
    fib(1, 1) ← 
    fib(N, X + Y) ← N > 1, fib(N - 1, X), fib(N - 2, Y) 

The goal “← fib(6, X)” is answered {X/8}, the goal “← fib(X, 8)” is answered {X/6}, 
and the goal “← fib(N, X)” has the following infinite sequence of answers: 

    {N/0, X/0} 
    {N/1, X/1} 
    {N/2, X/1} 
    ⋮ 

8. Consider the problem of finding an assignment of digits (integers 0, 1, . . . , 9) to letters 
such that adding two given words produces the third given word, as in this example: 

      S E N D 
    + M O R E 
    --------- 
    M O N E Y 

One solution to this particular puzzle is given by the following assignment: 

    D = 7, E = 5, M = 1, N = 6, O = 0, R = 8, S = 9, Y = 2. 

The following PROLOG program solves all such puzzles: 

    between(X, X, Z) ← X ≤ Z. 
    between(X, Y, Z) ← between(K, Y, Z), X is K - 1. 
    val([], 0). 
    val([X|Y], A) ← val(Y, B), between(0, X, 9), A is 10 * B + X. 
    solve(X, Y, Z) ← val(X, A), val(Y, B), val(Z, C), C is A + B. 

The specific example given above is captured by the following goal: 

    ← solve([D, N, E, S], [E, R, O, M], [Y, E, N, O, M]). 

The predicate between(X, Y, Z) means X ≤ Y ≤ Z. The predicate val(L, N) 
means that the number N is the value of L, where L is the kind of list of letters that 
occurs on a line of these puzzles. The notation [X|L] means the list obtained by writing 
list L after item X. The predicate solve(X, Y, Z) means that the value of list Z equals 
the sum of the values of list X and list Y. 

This example illustrates the ease of writing logic programs for some problems where 
conventional imperative programs are more difficult to write. 
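
For contrast, an imperative search for the SEND + MORE = MONEY puzzle might look like the Python sketch below. It is not a translation of the PROLOG program: it handles only this one puzzle, and it adds two conventions that are assumptions of the illustration, namely that distinct letters receive distinct digits and that the leading letters S and M are nonzero. The function name `solve_send_more_money` is hypothetical.

```python
from itertools import permutations

def solve_send_more_money() -> list[dict[str, int]]:
    """All digit assignments with SEND + MORE = MONEY under the stated conventions."""
    solutions = []
    for s, e, n, d, m, o, r in permutations(range(10), 7):
        if s == 0 or m == 0:
            continue                              # assumed: no leading zeros
        send = 1000 * s + 100 * e + 10 * n + d
        more = 1000 * m + 100 * o + 10 * r + e
        total = send + more
        y = total % 10                            # Y is forced by the last column
        if y in (s, e, n, d, m, o, r):
            continue                              # assumed: distinct letters, distinct digits
        if total // 10 == 1000 * m + 100 * o + 10 * n + e:
            solutions.append(dict(S=s, E=e, N=n, D=d, M=m, O=o, R=r, Y=y))
    return solutions
```

The search loop, the digit arithmetic, and the bookkeeping for distinctness all have to be written out by hand here, whereas the logic program only states what a solution is.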


1.7.2 FUZZY SETS AND LOGIC 

Fuzzy set theory and fuzzy logic are used to model imprecise meanings, such as “tall”, 
that are not easily represented by predicate logic. In particular, instead of assigning 
either “true” or “false” to the statement “John is tall”, fuzzy logic assigns a real number 
between 0 and 1 that indicates the degree of “tallness” of John. Fuzzy set theory assigns 
a real number between 0 and 1 to John that indicates the extent to which he is a member 
of the set of tall people. See [Ka86], [Ka92], [KaLa94], [YaFi94], [YaZa94], [Za65], [Zi91], 
Zi9.3 . 
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Definitions: 


A fuzzy set $F = (X, \mu)$ consists of a set $X$ (the domain) and a membership function
$\mu: X \to [0, 1]$. Sometimes the set is written $\{ (x, \mu(x)) \mid x \in X \}$ or
$\{ \mu(x)\,x \mid x \in X \}$.

The fuzzy intersection of fuzzy sets $(A, \mu_A)$ and $(B, \mu_B)$ is the fuzzy set $A \cap B$ with
domain $A \cap B$ and membership function $\mu_{A \cap B}(x) = \min(\mu_A(x), \mu_B(x))$.

The fuzzy union of fuzzy sets $(A, \mu_A)$ and $(B, \mu_B)$ is the fuzzy set $A \cup B$ with domain
$A \cup B$ and membership function $\mu_{A \cup B}(x) = \max(\mu_A(x), \mu_B(x))$.

The fuzzy complement of the fuzzy set $(A, \mu)$ is the fuzzy set $\neg A$ or $\overline{A}$ with domain $A$
and membership function $\mu_{\overline{A}}(x) = 1 - \mu(x)$.

The nth constructor $\mathrm{con}(\mu, n)$ of a membership function $\mu$ is the function $\mu^n$. That
is, $\mathrm{con}(\mu, n)(x) = (\mu(x))^n$.

The nth dilutor $\mathrm{dil}(\mu, n)$ of a membership function $\mu$ is the function $\mu^{1/n}$. That is,
$\mathrm{dil}(\mu, n)(x) = (\mu(x))^{1/n}$.

A T-norm operator is a function $f: [0, 1] \times [0, 1] \to [0, 1]$ with the following properties:

• $f(x, y) = f(y, x)$ (commutativity)

• $f(f(x, y), z) = f(x, f(y, z))$ (associativity)

• if $x \le v$ and $y \le w$, then $f(x, y) \le f(v, w)$ (monotonicity)

• $f(a, 1) = a$ (1 is a unit element).

The fuzzy intersection $A \cap_f B$ of fuzzy sets $(A, \mu_A)$ and $(B, \mu_B)$ relative to the
T-norm operator $f$ is the fuzzy set with domain $A \cap B$ and membership function
$\mu_{A \cap_f B}(x) = f(\mu_A(x), \mu_B(x))$.

An S-norm operator is a function $f: [0, 1] \times [0, 1] \to [0, 1]$ with the following properties:

• $f(x, y) = f(y, x)$ (commutativity)

• $f(f(x, y), z) = f(x, f(y, z))$ (associativity)

• if $x \le v$ and $y \le w$, then $f(x, y) \le f(v, w)$ (monotonicity)

• $f(a, 1) = 1$.

The fuzzy union $A \cup_f B$ of fuzzy sets $(A, \mu_A)$ and $(B, \mu_B)$ relative to the S-norm
operator $f$ is the fuzzy set with domain $A \cup B$ and membership function
$\mu_{A \cup_f B}(x) = f(\mu_A(x), \mu_B(x))$.

A complement operator is a function $f: [0, 1] \to [0, 1]$ with the following properties:

• $f(0) = 1$

• if $x < y$ then $f(x) > f(y)$

• $f(f(x)) = x$.

The fuzzy complement $\neg_f A$ of the fuzzy set $(A, \mu)$ relative to the complement
operator $f$ is the fuzzy set with domain $A$ and membership function $\mu_{\neg_f A}(x) = f(\mu(x))$.

A fuzzy system consists of a base collection of fuzzy sets, intersections, unions, com- 
plements, and implications. 

A hedge is a monadic operator corresponding to linguistic adjectives such as “very”, 
“about”, “somewhat”, or “quite” that modify membership functions. 

A two-valued logic is a logic where each statement has exactly one of the two values:
true or false.

A multi-valued logic (n-valued logic) is a logic with a set of $n$ ($\ge 2$) truth values;
i.e., there is a set of $n$ numbers $v_1, v_2, \ldots, v_n \in [0, 1]$ such that every statement has
exactly one truth value $v_i$.

Fuzzy logic is the study of statements where each statement has assigned to it a truth
value in the interval [0, 1] that indicates the extent to which the statement is true.

If statements $p$ and $q$ have truth values $v_1$ and $v_2$ respectively, the truth value of $p \lor q$
is $\max(v_1, v_2)$, the truth value of $p \land q$ is $\min(v_1, v_2)$, and the truth value of $\neg p$ is $1 - v_1$.


Facts: 

1. Fuzzy set theory and fuzzy logic were developed by Lotfi Zadeh in 1965.

2. Fuzzy set theory and fuzzy logic are parallel concepts: given a predicate $P(x)$, the
fuzzy truth value of the statement $P(a)$ is the fuzzy set value assigned to $a$ as an element
of $\{ x \mid P(x) \}$.

3. The usual minimum function $\min(x, y)$ is a T-norm. The usual real maximum
function $\max(x, y)$ is an S-norm. The function $c(x) = 1 - x$ is a complement operator.

4. Several other kinds of T-norms, S-norms, and complement operators have been
defined.

5. The words “T-norm” and “S-norm” come from multi-valued logics.

6. The only difference between T-norms and S-norms is that the T-norm specifies
$f(a, 1) = a$, whereas the S-norm specifies $f(a, 1) = 1$.

7. Several standard classes of membership functions have been defined, including step,
sigmoid, and bell functions.

8. Constructors and dilutors of membership functions are also membership functions.

9. The large number of practical applications of fuzzy set theory can generally be
divided into three types: machine systems, human-based systems, and human-machine
systems. Some of these applications are based on fuzzy set theory alone and some on
a variety of hybrid configurations involving neurofuzzy approaches, or in combination
with neural networks, genetic algorithms, or case-based reasoning.

10. The first fuzzy expert system that set a trend in practical fuzzy thinking was the
design of a cement kiln called Linkman, produced by Blue Circle Cement and SIRA
in Denmark in the early 1980s. The system incorporates the experience of a human
operator in a cement production facility.

11. The Sendai Subway Automatic Train Operations Controller was designed by Hitachi
in Japan. In that system, speed control during cruising, braking control near station
zones, and switching of control are determined by fuzzy IF-THEN rules that process
sensor measurements and consider factors related to travelers’ comfort and safety. In
operation since 1986, this most celebrated application encouraged many applications
based on fuzzy set controllers in the areas of home appliances (refrigerators, vacuum
cleaners, washers, dryers, rice cookers, air conditioners, shavers, blood-pressure measuring
devices), video cameras (including fuzzy automatic focusing, automatic exposure,
automatic white balancing, image stabilization), automotive (fuzzy cruise control, fuel
injection, transmission and brake systems), robotics, and aerospace.

12. Applications to finance started with the Yamaichi Fuzzy Fund, which is a fuzzy
trading system. This was soon followed by a variety of financial applications world-wide.

13. Research activities will soon result in commercial products related to the use of
fuzzy set theory in the areas of audio and video data compression (such as HDTV),
robotic arm movement control, computer vision, coordination of visual sensors with
mechanical motion, aviation (such as unmanned platforms), and telecommunication.

14. Current status: Most applications of fuzzy sets and logic are directly related to
structured numerical model-free estimators. Presently, most applications are designed
with linguistic variables, where proper levels of granularity are being used in the
evaluations of those variables, expressing the ambiguity and subjectivity in human thinking.
Fuzzy systems capture expert knowledge and through the processing of fuzzy IF-THEN
rules are capable of processing knowledge combining the antecedents of each fuzzy rule,
calculating the conclusions, and aggregating them to the final decision.

15. One way to model fuzzy implication $A \to B$ is to define $A \to B$ as $\neg_c A \cup_f B$
relative to some complement operator $c$ and to some S-norm operator $f$. Several other
ways have also been considered.

16. A fuzzy system is used computationally to control the behavior of an external
system.

17. Large fuzzy systems have been used in specifying complex real-world control
systems. The success of such systems depends crucially on the specific engineering
parameters. The correct values of these parameters are usually obtained by
trial-and-readjustment.

18. A two-valued logic is a logic that assumes the law of the excluded middle: $p \lor \neg p$
is a tautology.

19. Every n-valued logic is a fuzzy logic.
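
The operator properties in Facts 3, 6, and 15 can be spot-checked numerically. The following is a minimal sketch in Python, assuming a finite grid of exact rational sample points; such a check can refute a property but cannot prove it for all of [0, 1]. All function names (`is_t_norm`, `is_s_norm`, `is_complement`, `implication`) are hypothetical helpers, and the S-norm test uses the boundary condition f(a, 1) = 1 exactly as stated in the definitions of this section.

```python
from fractions import Fraction
from itertools import product

# Exact sample points 0, 1/10, ..., 1. A finite grid can refute a property
# but cannot prove it for every real number in [0, 1].
GRID = [Fraction(i, 10) for i in range(11)]
ONE = Fraction(1)


def _commutative_associative_monotone(f):
    """Check the three properties shared by T-norms and S-norms on GRID."""
    for x, y, z in product(GRID, repeat=3):
        if f(x, y) != f(y, x) or f(f(x, y), z) != f(x, f(y, z)):
            return False
    for x, y, v, w in product(GRID, repeat=4):
        if x <= v and y <= w and f(x, y) > f(v, w):
            return False
    return True


def is_t_norm(f):
    """T-norm boundary condition: f(a, 1) = a."""
    return _commutative_associative_monotone(f) and all(f(a, ONE) == a for a in GRID)


def is_s_norm(f):
    """S-norm boundary condition as stated in this section: f(a, 1) = 1."""
    return _commutative_associative_monotone(f) and all(f(a, ONE) == ONE for a in GRID)


def is_complement(c):
    """c(0) = 1, strictly decreasing, and c(c(x)) = x on GRID."""
    decreasing = all(c(x) > c(y) for x in GRID for y in GRID if x < y)
    involutive = all(c(c(x)) == x for x in GRID)
    return c(Fraction(0)) == 1 and decreasing and involutive


def implication(a, b, c, s):
    """Truth degree of A -> B modeled as s(c(a), b), as in Fact 15."""
    return s(c(a), b)


def standard_complement(x):
    return 1 - x
```

With these helpers, `min` and the product `x * y` pass the T-norm check on the grid, `max` passes the S-norm check, and `implication(Fraction(4, 5), Fraction(3, 10), standard_complement, max)` evaluates to 3/10.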

Examples: 

1. A committee consisting of five people met ten times during the past year. Person A
attended 7 meetings, B attended all 10 meetings, C attended 6 meetings, D attended no
meetings, and E attended 9 meetings. The set of committee members can be described
by the following fuzzy set that reflects the degree to which each of the members attended
meetings, using the function $\mu: \{A, B, C, D, E\} \to [0, 1]$ with the rule
$\mu(x) = \frac{1}{10}$(number of meetings attended):

{(A, 0.7), (B, 1.0), (C, 0.6), (D, 0.0), (E, 0.9)},

which can also be written as

{0.7A, 1.0B, 0.6C, 0.0D, 0.9E}.

Person B would be considered a “full” member and person D a “nonmember”.

2. Four people are rated on amount of activity in a political party, yielding the fuzzy
set

$P_1$ = {0.8A, 0.45B, 0.1C, 0.75D},

and based on their degree of conservatism in their political beliefs, as

$P_2$ = {0.6A, 0.85B, 0.7C, 0.35D}.

The fuzzy union of the sets is

$P_1 \cup P_2$ = {0.8A, 0.85B, 0.7C, 0.75D},

the fuzzy intersection is

$P_1 \cap P_2$ = {0.6A, 0.45B, 0.1C, 0.35D},

and the fuzzy complement of $P_1$ (measurement of political inactivity) is

$\overline{P_1}$ = {0.2A, 0.55B, 0.9C, 0.25D}.


3. In the fuzzy set with domain $T$ and membership function

$$\mu_T(h) = \begin{cases} 0 & \text{if } h < 170 \\ \dfrac{h - 170}{20} & \text{if } 170 \le h < 190 \\ 1 & \text{otherwise} \end{cases}$$

the number 160 is not a member, the number 195 is a member, and the membership
of 182 is 0.6. The graph of $\mu_T$ is given in the following figure.


4. The fuzzy set $(T, \mu_T)$ of Example 3 can be used to define the fuzzy set “Tall”
$= (H, \mu_H)$ of tall people, by the rule $\mu_H(x) = \mu_T(\mathrm{height}(x))$ where $\mathrm{height}(x)$ is the
height of person $x$ calibrated in centimeters.

5. The second constructor $\mathrm{con}(\mu_H, 2)$ of the fuzzy set “Tall” can be used to define a
fuzzy set “Quite tall”, whose graph is given in the following figure.


6. The second dilutor $\mathrm{dil}(\mu_H, 2)$ of the fuzzy set “Tall” defines the fuzzy set “Somewhat
tall”, whose graph is given in the following figure.

7. The concept of “being healthy” can be modeled using fuzzy logic. The truth
value 0.95 could be assigned to “Fran is healthy” if Fran is almost always healthy.
The truth value 0.4 could be assigned to “Leslie is healthy” if Leslie is healthy somewhat
less than half the time. The truth value of the statement “Fran and Leslie are healthy”
would then be min(0.95, 0.4) = 0.4, and that of “Fran is not healthy” would be
1 − 0.95 = 0.05.

8. Behavior of closed-loop control systems: The behavior of some closed-loop control
systems can be specified using fuzzy logic. For example, consider an automated heater
whose output setting is to be based on the readings of a temperature sensor. A fuzzy set
“cold” and the implication “very cold → high” could be used to relate the temperature
to the heater settings. The exact behavior of this system is determined by the degree of
the constructor used for “very” and by the specific choices of S-norm and complement
operators used to define the fuzzy implication (the “engineering parameters” of the
system).
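
The set operations and hedges used in the examples of this section can be reproduced with a few lines of Python. This is a minimal sketch, assuming fuzzy sets over a shared finite domain are stored as dictionaries from element to membership degree; the function names are hypothetical. The ramp in `mu_tall` rises linearly from 170 to 190, which is the form consistent with a membership of 0.6 at 182.

```python
import math


def fuzzy_union(p, q):
    """Pointwise maximum; assumes p and q share the same domain (keys)."""
    return {x: max(p[x], q[x]) for x in p}


def fuzzy_intersection(p, q):
    """Pointwise minimum; assumes p and q share the same domain (keys)."""
    return {x: min(p[x], q[x]) for x in p}


def fuzzy_complement(p):
    # Rounded only to hide binary floating-point noise in 1 - m.
    return {x: round(1 - m, 10) for x, m in p.items()}


def mu_tall(h):
    """Membership of a height h (in cm): a linear ramp from 170 to 190."""
    if h < 170:
        return 0.0
    if h < 190:
        return (h - 170) / 20
    return 1.0


def con(mu, n):
    """nth constructor: raise the membership degree to the power n."""
    return lambda x: mu(x) ** n


def dil(mu, n):
    """nth dilutor: raise the membership degree to the power 1/n."""
    return lambda x: mu(x) ** (1 / n)


P1 = {"A": 0.8, "B": 0.45, "C": 0.1, "D": 0.75}   # party activity
P2 = {"A": 0.6, "B": 0.85, "C": 0.7, "D": 0.35}   # conservatism
quite_tall = con(mu_tall, 2)
somewhat_tall = dil(mu_tall, 2)
```

For a height of 182 cm, `mu_tall` gives 0.6, `quite_tall` gives about 0.36, and `somewhat_tall` gives about 0.775, so the constructor lowers intermediate degrees while the dilutor raises them. The same `min` and `1 - v` operations give the truth values 0.4 and 0.05 of Example 7.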


1.7.3 PRODUCTION SYSTEMS 

Production systems are a logic-based computer programming paradigm introduced by 
Allen Newell and Herbert Simon in 1975. They are commonly used in intelligent systems 
for representing an expert’s knowledge used in solving some real-world task, such as a 
physician’s knowledge of making medical diagnoses. 

Definitions: 

A fact set is a set of ground atomic formulas. These formulas represent the information 
relevant to the system. 

A condition is a disjunction $A_1 \lor \cdots \lor A_n$, where $n \ge 0$ and each $A_i$ is a literal.

A condition $C$ is true in a fact set $S$ if:

• $C$ is empty, or

• $C$ is a positive literal and $C \in S$, or

• $C$ is a negative literal $\neg A$, and $B \notin S$ for each ground instance $B$ of $A$, or

• $C = A_1 \lor \cdots \lor A_n$, and some condition $A_i$ is true in $S$.

A print command “print(x)” means that the value of the term x is to be printed.
An action is either a literal or a print command.

A production rule is of the form $C_1, \ldots, C_n \to A_1, \ldots, A_m$, where $n, m \ge 1$, each $C_i$
is a condition, each $A_i$ is an action, and each variable in each action appears in some
positive literal in some condition.

The antecedent of the rule $C_1, \ldots, C_n \to A_1, \ldots, A_m$ is $C_1, \ldots, C_n$.

The consequent of the rule $C_1, \ldots, C_n \to A_1, \ldots, A_m$ is $A_1, \ldots, A_m$.

An instantiation of a production rule is the rule obtained by replacing each variable 
in each positive literal in each condition of the rule by a constant. 

A production system consists of a fact set and a set of production rules.

Facts:

1. Given a fact set $S$, an instantiation $C_1, \ldots, C_n \to A_1, \ldots, A_m$ of a production rule
denotes the following operation:

if each condition $C_i$ is true in $S$ then
    for each $A_i$:
        if $A_i$ is an atom, add it to $S$
        if $A_i$ is a negative literal $\neg B$, then remove $B$ from $S$
        if $A_i$ is “print(c)”, then print c.

2. In addition to “print”, production systems allow several other system-level
commands.

3. OPS5 and CLIPS are currently the most popular languages for writing production
systems. They are available for most operating systems, including UNIX and DOS.

4. To initialize a computation prescribed by a production system, the initial fact set
and all the production rules are supplied as input. The command “run1”
non-deterministically selects an instantiation of a production rule such that all conditions in the
antecedent hold in the fact set, and it “fires” the rule by carrying out the actions in the
consequent. The command “run” keeps on selecting and firing rules until no more rule
instantiations can be selected.

5. Production systems are Turing complete. 

Examples: 

1. The fact set $S = \{N(3),\ 3 > 2,\ 2 > 1\}$ may represent that “3 is a natural number”,
that “3 is greater than 2”, and that “2 is greater than 1”.

2. If the fact set $S$ of Example 1 and the production $N(x) \to \mathrm{print}(x)$ are supplied as
input, the command “run” will yield the instantiation $N(3) \to \mathrm{print}(3)$ and fire it to
print 3.

3. The production rule $N(x),\ x > y \to \neg N(x),\ N(y)$ has $N(3),\ 3 > 2 \to \neg N(3),\ N(2)$
as an instantiation. If operated on fact set $S$ of Example 1, this rule will change $S$ to
$\{3 > 2,\ 2 > 1,\ N(2)\}$.

4. The production system consisting of the following two production rules can be used
to add a set of numbers in a fact set:

$\neg S(x) \to S(0)$

$S(x),\ N(y) \to \neg S(x),\ \neg N(y),\ S(x + y)$.

For example, starting with the fact set $\{N(1), N(2), N(3), N(4)\}$, this production system
will produce the fact set $\{S(10)\}$.
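
The behavior of the two summation rules, ¬S(x) → S(0) and S(x), N(y) → ¬S(x), ¬N(y), S(x + y), under the “run” command can be simulated directly. The following Python sketch hard-codes those two rules rather than implementing a general production-system interpreter; the representation of facts as tuples such as `("N", 3)` and the function name `run_sum_system` are assumptions made for illustration.

```python
def run_sum_system(numbers):
    """Fire the two summation rules until no rule instantiation applies.

    Facts are tuples: ("N", y) stands for N(y) and ("S", x) for S(x).
    """
    facts = {("N", n) for n in numbers}
    while True:
        s_facts = [f for f in facts if f[0] == "S"]
        n_facts = [f for f in facts if f[0] == "N"]
        if not s_facts:
            # Rule 1: not S(x) -> S(0). The negative literal holds only
            # when no ground instance of S(x) is in the fact set.
            facts.add(("S", 0))
        elif n_facts:
            # Rule 2: S(x), N(y) -> not S(x), not N(y), S(x + y).
            s, n = s_facts[0], n_facts[0]
            facts -= {s, n}
            facts.add(("S", s[1] + n[1]))
        else:
            return facts  # no instantiation has a true antecedent


result = run_sum_system([1, 2, 3, 4])
```

Here `result` is the single-fact set `{("S", 10)}`. Because a fact set is a set, repeated numbers in the input collapse to one fact, which is a limitation of the rule set itself and not only of this sketch.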


1.7.4 AUTOMATED REASONING 

Computers have been used to help prove theorems by verifying special cases. But 
even more, they have been used to carry out reasoning without external intervention. 
Developing computer programs that can draw conclusions from a given set of facts 
is the goal of automated reasoning. There are now automated reasoning programs
that can prove results that people have not been able to prove. Automated reasoning
can help in verifying the correctness of computer programs, verifying protocol design, 
verifying hardware design, creating software using logic programming, solving puzzles, 
and proving new theorems. 

Definitions: 

Automated reasoning is the process of proving theorems using a computer program 
that can draw conclusions which follow logically from a set of given facts. 

A computer-assisted proof is a proof that relies on checking the validity of a large 
number of cases using a special purpose computer program. 

A proof done by hand is a proof done by a human without the use of a computer. 

Facts: 

1. Computer-assisted proofs have been used to settle several well-known conjectures, 
including the Four Color Theorem (§8.6.4) and the nonexistence of a finite projective 
plane of order 10 (§12.2.3). 

2. The computer-assisted proofs of both the Four Color Theorem and the nonexistence 
of a finite projective plane of order 10 rely on having a computer verify certain facts 
about a large number of cases using special purpose software. 

3. Hardware, system software, and special purpose program errors can invalidate a 
computer-assisted proof. This makes the verification of computer-assisted proofs im- 
portant. However, such verification may be impractical. 

4. Automated reasoning software has been developed for both first-order and
higher-order logics. A database of automated reasoning systems can be found at

http://www-formal.stanford.edu:80/clt/ARS/systems.html

5. Automated reasoning software has been used to prove new results in many areas,
including settling long-standing, well-known, open conjectures (such as the Robbins
problem described in Example 2).

6. Proofs generated by automated reasoning software can usually be checked without 
using computers or by using software programs that check the validity of proofs. 

7. Proofs done by humans often use techniques ill-suited for implementation in auto- 
mated proof software. 

8. Automatic proof systems rely on proof procedures suitable for computer implemen- 
tation, such as resolution and the semantic tableaux procedure. (See [Fi96] or [Wo96] 
for details.) 

9. The effectiveness of automatic proof systems depends on following strategies that 
help programs prove results efficiently. 

10. Restriction strategies are used to block paths of reasoning that are considered to 
be unpromising. 

11. Direction strategies are used to help programs select the approaches to take next. 

12. Look-ahead strategies let programs draw conclusions before they would ordinarily
be drawn following the basic rules of the program.

13. Redundancy-control strategies are used to eliminate some of the redundancy in
retained information.

14. There are efforts underway to capture all mathematical knowledge into a database
that can be used in automated reasoning systems (see the information about the QED
system in Example 3).


Examples: 

1. The OTTER system is an automated reasoning system for first-order logic developed
at Argonne National Laboratory [Wo96]. OTTER has been used to establish many
previously unknown results in a wide variety of areas, including algebraic curves, lattices,
Boolean algebra, groups, semigroups, and logic. A summary of these results can be
found at

http://www.mcs.anl.gov/home/mccune/ar/new_results

2. The automated reasoning system EQP, developed at Argonne National Laboratory,
settled the Robbins problem in 1996. This problem was first proposed in the 1930s by
Herbert Robbins, and was actively worked on by many mathematicians. The Robbins
problem can be stated as follows. Can the equivalence

$$\neg\neg p \Leftrightarrow p$$

be derived from the commutative and associative laws for the “or” operator $\lor$ and the
identity

$$\neg(\neg(p \lor q) \lor \neg(p \lor \neg q)) \Leftrightarrow p\,?$$

The EQP system, using some earlier work that established a sufficient condition for the
truth of Robbins’ problem, found a 15-step proof of the theorem after approximately 8
days of searching on a UNIX workstation when provided with one of several different
search strategies.

3. The goal of the QED Project is to build a repository that represents all important,
established mathematical knowledge. It is designed to help mathematicians cope with
the explosion of mathematical knowledge and help in developing and verifying computer
systems.
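
As a small sanity check related to Example 2, the Robbins identity ¬(¬(p ∨ q) ∨ ¬(p ∨ ¬q)) ⇔ p can be confirmed for ordinary two-valued logic by enumerating the four truth assignments. The Python sketch below does only that; it says nothing about the hard direction of the Robbins problem, which asks what follows from the identity in an arbitrary algebra. The function name `robbins_lhs` is hypothetical.

```python
from itertools import product


def robbins_lhs(p, q):
    """Left-hand side of the Robbins identity for Boolean p and q."""
    return not ((not (p or q)) or (not (p or (not q))))


# The identity holds in two-valued logic if the left side always equals p.
holds_in_boolean_logic = all(
    robbins_lhs(p, q) == p for p, q in product([False, True], repeat=2)
)
```

The flag `holds_in_boolean_logic` is true, which confirms that Boolean algebra is one model of the Robbins identity.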
INTRODUCTION


Many problems in mathematics, computer science, and engineering involve counting 
objects with particular properties. Although there are no absolute rules that can be 
used to solve all counting problems, many counting problems that occur frequently can 
be solved using a few basic rules together with a few important counting techniques. 
This chapter provides information on how many standard counting problems are solved. 


GLOSSARY 

binomial coefficient: the coefficient $\binom{n}{k}$ of $x^k y^{n-k}$ in the expansion of $(x + y)^n$.

coloring pattern (with respect to a set of symmetries of a figure): a set of mutually
equivalent colorings.

combination (from a set S): a subset of S; any unordered selection from S. A
k-combination from a set is a subset of k elements of the set.

combination coefficient: the number $C(n, k)$ (equal to $\binom{n}{k}$) of ways to make an
unordered choice of k items from a set of n items.

combination-with-replacement (from a set S): any unordered selection with
replacement; a multiset of objects from S.

combination-with-replacement coefficient: the number of ways to choose a
multiset of k items from a set of n items, written $C^R(n, k)$.

cycle index: for a permutation group G, the multivariate polynomial $P_G$ obtained by
dividing the sum of the cycle structure representations of all the permutations in G
by the number of elements of G.

cycle structure (of a permutation): a multivariate monomial whose exponents record 
the number of cycles of each size. 

derangement: a permutation on a set that leaves no element fixed. 

exponential generating function (for $\{a_k\}_0^\infty$): the formal sum $\sum_{k=0}^{\infty} a_k \frac{x^k}{k!}$, or any
equivalent closed-form expression.

falling power: the product $x^{\underline{k}} = x(x-1)\cdots(x-k+1)$ of k consecutive factors starting
with x, each factor decreasing by 1.

Ferrers diagram: a geometric, left-justified, and top-justified array of cells, boxes,
dots or nodes representing a partition of an integer, in which each row of dots
corresponds to a part of the partition.

Gaussian binomial coefficient: the algebraic expression $\genfrac{[}{]}{0pt}{}{n}{k}$ in the variable q defined
for nonnegative integers n and k by
$\genfrac{[}{]}{0pt}{}{n}{k} = \frac{q^n - 1}{q - 1} \cdot \frac{q^{n-1} - 1}{q^2 - 1} \cdots \frac{q^{n-k+1} - 1}{q^k - 1}$ for $0 < k \le n$
and $\genfrac{[}{]}{0pt}{}{n}{0} = 1$.

generating function (or ordinary generating function) for $\{a_k\}_0^\infty$: the formal
sum $\sum_{k=0}^{\infty} a_k x^k$, or any equivalent closed-form expression.

hook (of a cell in a Ferrers diagram): the set of cells directly to the right or directly 
below a given cell, together with the cell itself. 

hooklength (of a cell in a Ferrers diagram): the number of cells in the hook of that 
cell. 

Kronecker delta function: the function $\delta(x, y)$ defined by the rule $\delta(x, y) = 1$ if
$x = y$ and 0 otherwise.

lexicographic order: the order in which a list of strings would appear in a dictionary.

Möbius function: the function $\mu(m)$ where

$$\mu(m) = \begin{cases} 1 & \text{if } m = 1 \\ (-1)^k & \text{if } m \text{ is a product of } k \text{ distinct primes} \\ 0 & \text{if } m \text{ is divisible by the square of a prime,} \end{cases}$$

or a generalization of this function to partially ordered sets.

multinomial coefficient: the coefficient $\binom{n}{k_1\, k_2\, \ldots\, k_m}$ of $x_1^{k_1} x_2^{k_2} \cdots x_m^{k_m}$ in the
expansion of $(x_1 + x_2 + \cdots + x_m)^n$.

ordered selection (of k items from a set S): a nonrepeating list of k items from S. 

ordered selection with replacement (of k items from a set S ): a possibly-repeating 
list of k items from S. 

ordinary generating function (for the sequence $\{a_k\}_0^\infty$): See generating function.

partially ordered set (or poset): a set S together with a binary relation $\le$ that is
reflexive, antisymmetric, and transitive, written $(S, \le)$.

partition: an unordered decomposition of an integer into a sum of positive integers.

Pascal’s triangle: a triangular table with the binomial coefficient $\binom{n}{k}$ appearing in
row n, column k.

pattern inventory: a generating function that enumerates the number of coloring 
patterns. 

permutation: a one-to-one mapping of a set of elements onto itself, or an arrangement
of the set into a list. A k-permutation of a set is an ordered nonrepeating sequence
of k elements of the set.

permutation coefficient: the number of ways to choose a nonrepeating list of k items
from a set of n items, written $P(n, k)$.

permutation group: a nonempty set P of permutations on a set S, such that P is
closed under composition and under inversion.

permutation-with-replacement coefficient: the number of ways to choose a
possibly repeating list of k items from a set of n items, written $P^R(n, k)$.

poset: See partially ordered set.

problème des ménages: the problem of finding the number of ways that married
couples can be seated around a circular table so that no men are adjacent, no women
are adjacent, and no husband and wife are adjacent.

problème des rencontres: given balls 1 through n drawn out of an urn one at a
time, the problem of finding the probability that ball i is never the ith one drawn.

Stirling cycle number: the number $\genfrac{[}{]}{0pt}{}{n}{k}$ of ways to partition n objects into k
nonempty cycles.

Stirling number of the first kind: the coefficient $s(n, k)$ of $x^k$ in the polynomial
$x(x-1)(x-2)\cdots(x-n+1)$.

Stirling number of the second kind: the coefficient $S(n, k)$ of $x^{\underline{k}}$ in the
representation $x^n = \sum_k S(n, k)\, x^{\underline{k}}$ of $x^n$ as a linear combination of falling powers.

Stirling subset number: the number $\genfrac{\{}{\}}{0pt}{}{n}{k}$ of ways to partition n objects into k
nonempty subsets.

symmetry (of a figure): a spatial motion that maps the figure onto itself.

tree diagram: a tree that displays the different alternatives in some counting process.

unordered selection (of k items from a set S): a subset of k items from S.

unordered selection (of k items from a set S with replacement): a selection of k 
objects in which each object in the selection set S can be chosen arbitrarily often 
and such that the order in which the objects are selected does not matter. 

Young tableau: an array obtained by replacing each cell of a Ferrers diagram by a 
positive integer. 
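
Two of the glossary entries can be tied together computationally: the falling power and the Stirling numbers that express an ordinary power as a combination of falling powers. The Python sketch below is an illustration under one stated assumption: it computes the Stirling subset numbers with the standard recurrence S(n, k) = k·S(n−1, k) + S(n−1, k−1), which is not stated in this glossary, and then checks the expansion of x^n in falling powers numerically. The function names are hypothetical.

```python
from functools import lru_cache


def falling_power(x, k):
    """x(x-1)...(x-k+1): k consecutive factors starting with x."""
    result = 1
    for i in range(k):
        result *= x - i
    return result


@lru_cache(maxsize=None)
def stirling_subset(n, k):
    """Number of ways to partition n objects into k nonempty subsets.

    Uses the standard recurrence (an assumption, not from the glossary):
    S(n, k) = k * S(n-1, k) + S(n-1, k-1).
    """
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling_subset(n - 1, k) + stirling_subset(n - 1, k - 1)


def power_from_falling_powers(x, n):
    """Evaluate the sum over k of S(n, k) times the kth falling power of x."""
    return sum(stirling_subset(n, k) * falling_power(x, k) for k in range(n + 1))
```

For instance, `falling_power(5, 3)` is 5·4·3 = 60 and `stirling_subset(4, 2)` is 7, and `power_from_falling_powers(x, n)` agrees with `x ** n` for the small integer values tried in a quick check.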


2.1 SUMMARY OF COUNTING PROBLEMS 

Table 1 lists many important counting problems, gives the number of objects being 
counted, together with a reference to the section of this Handbook where details can be 
found. Table 2 lists several important counting rules and methods, and gives the types 
of counting problems that can be solved using these rules and methods. 


Table 1 Counting problems.

The notation used in this table is given at the end of the table.

objects | number of objects | reference

Arranging objects in a row:
n distinct objects | $n! = P(n, n) = n(n-1)\cdots 2 \cdot 1$ | §2.3.1
k out of n distinct objects | $n^{\underline{k}} = P(n, k) = n(n-1)\cdots(n-k+1)$ | §2.3.1
some of the n objects are identical: $k_1$ of a first kind, $k_2$ of a second kind, ..., $k_j$ of a jth kind, and where $k_1 + k_2 + \cdots + k_j = n$ | $\binom{n}{k_1\, k_2\, \ldots\, k_j} = \frac{n!}{k_1!\, k_2! \cdots k_j!}$ | §2.3.2
none of the n objects remains in its original place (derangements) | $D_n = n!\left(1 - \frac{1}{1!} + \cdots + (-1)^n \frac{1}{n!}\right)$ | §2.4.2

Arranging objects in a circle (where rotations, but not reflections, are equivalent):
n distinct objects | $(n-1)!$ | §2.2.1
k out of n distinct objects | $\frac{P(n, k)}{k}$ | §2.2.1

Choosing k objects from n distinct objects:
order matters, no repetitions | $P(n, k) = \frac{n!}{(n-k)!} = n^{\underline{k}}$ | §2.3.1
order matters, repetitions allowed | $P^R(n, k) = n^k$ | §2.3.3
order does not matter, no repetitions | $C(n, k) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$ | §2.3.2

order does not matter, repetitions allowed | $C^R(n, k) = \binom{k+n-1}{k}$ | §2.3.3

Subsets:
of size k from a set of size n | $\binom{n}{k}$ | §2.3.2
of all sizes from a set of size n | $2^n$ | §2.3.4
of {1, ..., n}, without consecutive elements | $F_{n+2}$ | §3.1.2

Placing n objects into k cells:
distinct objects into distinct cells | $k^n$ | §2.2.1
distinct objects into distinct cells, no cell empty | $\genfrac{\{}{\}}{0pt}{}{n}{k}\, k!$ | §2.5.2
distinct objects into identical cells | $\genfrac{\{}{\}}{0pt}{}{n}{1} + \genfrac{\{}{\}}{0pt}{}{n}{2} + \cdots + \genfrac{\{}{\}}{0pt}{}{n}{k}$ | §2.5.2
distinct objects into identical cells, no cell empty | $\genfrac{\{}{\}}{0pt}{}{n}{k}$ | §2.5.2
distinct objects into distinct cells, with $k_i$ in cell i (i = 1, ..., n), and where $k_1 + k_2 + \cdots + k_j = n$ | $\binom{n}{k_1\, k_2\, \ldots\, k_j}$ | §2.3.2
identical objects into distinct cells | $\binom{n+k-1}{n}$ | §2.3.3
identical objects into distinct cells, no cell empty | $\binom{n-1}{k-1}$ | §2.3.3
identical objects into identical cells | $p_k(n)$ | §2.5.1
identical objects into identical cells, no cell empty | $p_k(n) - p_{k-1}(n)$ | §2.5.1

Placing n distinct objects into k nonempty cycles | $\genfrac{[}{]}{0pt}{}{n}{k}$ | §2.5.2

Solutions to $x_1 + \cdots + x_n = k$:
nonnegative integers | $\binom{k+n-1}{k} = \binom{k+n-1}{n-1}$ | §2.3.3
positive integers | $\binom{k-1}{n-1}$ | §2.3.3
integers where $0 \le a_i \le x_i$ for all i | $\binom{k - (a_1 + \cdots + a_n) + n - 1}{n-1}$ | §2.3.3
integers where $0 \le x_i \le a_i$ for one or more i | inclusion/exclusion principle | §2.4.2
integers where $x_1 \ge \cdots \ge x_n \ge 1$ | $p_n(k) - p_{n-1}(k)$ | §2.5.1
integers where $x_1 \ge \cdots \ge x_n \ge 0$ | $p_n(k)$ | §2.5.1

Solutions to $x_1 + x_2 + \cdots + x_n = n$ in nonnegative integers where $x_1 \ge x_2 \ge \cdots \ge x_n \ge 0$ | $p(n)$ | §2.5.1



Solutions to $x_1 + 2x_2 + 3x_3 + \cdots + n x_n = n$ in nonnegative integers | $p(n)$ | §2.5.1

Functions from a k-element set to an n-element set:
all functions | $n^k$ | §2.2.1
one-to-one functions ($n \ge k$) | $n^{\underline{k}} = \frac{n!}{(n-k)!} = P(n, k)$ | §2.2.1
onto functions ($n \le k$) | inclusion/exclusion | §2.4.2
partial functions | $\binom{k}{0} + \binom{k}{1} n + \binom{k}{2} n^2 + \cdots + \binom{k}{k} n^k = (n+1)^k$ | §2.3.2

Bit strings of length n:
all strings | $2^n$ | §2.2.1
with given entries in k positions | $2^{n-k}$ | §2.2.1
with exactly k 0s | $\binom{n}{k}$ | §2.3.2
with at least k 0s | $\binom{n}{k} + \binom{n}{k+1} + \cdots + \binom{n}{n}$ | §2.3.2
with equal numbers of 0s and 1s | $\binom{n}{n/2}$ | §2.3.2
palindromes | $2^{\lceil n/2 \rceil}$ | §2.2.1
with an even number of 0s | $2^{n-1}$ | §2.3.4
without consecutive 0s | $F_{n+2}$ | §3.1.2

Partitions of a positive integer n into positive summands: | | §2.5.1
total number | $p(n)$ |
into at most k parts | $p_k(n)$ |
into exactly k parts | $p_k(n) - p_{k-1}(n)$ |
into parts each of size $\le k$ | $p_k(n)$ |

Partitions of a set of size n:
all partitions | $B(n)$ | §2.5.2
into k parts | $\genfrac{\{}{\}}{0pt}{}{n}{k}$ | §2.5.2
into k parts, each part having at least 2 elements | $b(n, k)$ | §3.1.8

Paths:
from (0, 0) to (2n, 0) made up of line segments from $(i, y_i)$ to $(i+1, y_{i+1})$, where integer $y_i \ge 0$, $y_{i+1} = y_i \pm 1$ | $C_n$ | §3.1.3
from (0, 0) to (2n, 0) made up of line segments from $(i, y_i)$ to $(i+1, y_{i+1})$, where integer $y_i > 0$ (for $0 < i < 2n$), $y_{i+1} = y_i \pm 1$ | $C_{n-1}$ | §3.1.3

from (0, 0) to (m, n) that move 1 unit up or right at each step | $\binom{m+n}{n}$ | §2.3.2

Permutations of {1, ..., n}:
all permutations | $n!$ | §2.3.1
with k cycles, all cycles of length $\ge 2$ | $d(n, k)$ | §3.1.8
with k descents | $E(n, k)$ | §3.1.5
with k excedances | $E(n, k)$ | §3.1.5
alternating, n even | $(-1)^{n/2} E_n$ | §3.1.7
alternating, n odd | $T_n$ | §3.1.7

Symmetries of regular figures: | | §2.6
n-gon | $2n$ |
tetrahedron | 12 |
cube | 24 |
octahedron | 24 |
dodecahedron | 60 |
icosahedron | 60 |

Coloring regular 2-dimensional & 3-dimensional figures with $\le k$ colors: | | §2.6
corners of an n-gon, allowing rotations and reflections | $\frac{1}{2n} \sum_{d \mid n} \varphi(d)\, k^{n/d} + \frac{1}{2} k^{(n+1)/2}$, n odd; $\frac{1}{2n} \sum_{d \mid n} \varphi(d)\, k^{n/d} + \frac{1}{4}\left(k^{n/2} + k^{(n+2)/2}\right)$, n even |
corners of an n-gon, allowing only rotations | $\frac{1}{n} \sum_{d \mid n} \varphi(d)\, k^{n/d}$ |
corners of a triangle, allowing rotations and reflections | $\frac{1}{6}[k^3 + 3k^2 + 2k]$ |
corners of a triangle, allowing only rotations | $\frac{1}{3}[k^3 + 2k]$ |
corners of a square, allowing rotations and reflections | $\frac{1}{8}[k^4 + 2k^3 + 3k^2 + 2k]$ |
corners of a square, allowing only rotations | $\frac{1}{4}[k^4 + k^2 + 2k]$ |
corners of a pentagon, allowing rotations and reflections | $\frac{1}{10}[k^5 + 5k^3 + 4k]$ |


corners of a pentagon, allowing only rotations | $\frac{1}{5}[k^5 + 4k]$ |
corners of a hexagon, allowing rotations and reflections | $\frac{1}{12}[k^6 + 3k^4 + 4k^3 + 2k^2 + 2k]$ |
corners of a hexagon, allowing only rotations | $\frac{1}{6}[k^6 + k^3 + 2k^2 + 2k]$ |
corners of a tetrahedron | $\frac{1}{12}[k^4 + 11k^2]$ |
edges of a tetrahedron | $\frac{1}{12}[k^6 + 3k^4 + 8k^2]$ |
faces of a tetrahedron | $\frac{1}{12}[k^4 + 11k^2]$ |
corners of a cube | $\frac{1}{24}[k^8 + 17k^4 + 6k^2]$ |
edges of a cube | $\frac{1}{24}[k^{12} + 6k^7 + 3k^6 + 8k^4 + 6k^3]$ |
faces of a cube | $\frac{1}{24}[k^6 + 3k^4 + 12k^3 + 8k^2]$ |

Number of sequences of wins/losses in a $\frac{n+1}{2}$-out-of-n playoff series (n odd) | $2\,C\!\left(n, \frac{n+1}{2}\right)$ | §2.3.2

Sequences $a_1, \ldots, a_{2n}$ with n 1s and n −1s, and each partial sum $a_1 + \cdots + a_k \ge 0$ | $C_n$ | §3.1.3

Well-formed sequences of parentheses of length 2n | $C_n$ | §3.1.3

Well-parenthesized products of n + 1 variables | $C_n$ | §3.1.3

Triangulations of a convex (n + 2)-gon | $C_n$ | §3.1.3

Notation:

$B(n)$ or $B_n$: Bell number
$b(n,k)$: associated Stirling number of the second kind
$C_n = \frac{1}{n+1}\binom{2n}{n}$: Catalan number
$C(n,k) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!}$: binomial coefficient
$d(n,k)$: associated Stirling number of the first kind
$E_n$: Euler number
$\varphi$: Euler phi-function
$E(n,k)$: Eulerian number
$F_n$: Fibonacci number
$n^{\underline{k}} = n(n-1)\cdots(n-k+1) = P(n,k)$: falling power
$P(n,k) = \frac{n!}{(n-k)!}$: k-permutation
$p(n)$: number of partitions of n
$p_k(n)$: number of partitions of n into at most k summands
$p_k^*(n)$: number of partitions of n into exactly k summands
$\left[{n \atop k}\right]$: Stirling cycle number
$\left\{{n \atop k}\right\}$: Stirling subset number
$T_n$: tangent number
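
The rotation-only coloring count for the corners of an n-gon listed in the table, $\frac{1}{n}\sum_{d \mid n}\varphi(d)\,k^{n/d}$, can be checked against a brute-force enumeration for small cases. The following is a minimal sketch; the function names (`phi`, `colorings_formula`, `colorings_brute`) are illustrative assumptions and do not come from the Handbook.

```python
from itertools import product
from math import gcd


def phi(d):
    """Euler phi-function: number of integers in 1..d relatively prime to d."""
    return sum(1 for i in range(1, d + 1) if gcd(i, d) == 1)


def colorings_formula(n, k):
    """(1/n) * sum over divisors d of n of phi(d) * k^(n/d)."""
    total = sum(phi(d) * k ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


def colorings_brute(n, k):
    """Count colorings of n corners with at most k colors, up to rotation."""
    seen = set()
    for c in product(range(k), repeat=n):
        # The lexicographically smallest rotation represents the whole class.
        seen.add(min(c[i:] + c[:i] for i in range(n)))
    return len(seen)


for n, k in [(3, 2), (4, 3), (5, 2), (6, 2)]:
    print(n, k, colorings_formula(n, k), colorings_brute(n, k))
```

For the hexagon with k = 2 both counts agree with the table entry $\frac{1}{6}[k^6 + k^3 + 2k^2 + 2k] = 14$.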
Table 2 Methods of counting and the problems they solve.

method (reference) | problems it solves
rule of sum (§2.2.1) | problems that can be broken into disjoint cases, each of which can be handled separately
rule of product (§2.2.1) | problems that can be broken into sequences of independent counting problems, each of which can be solved separately
rule of quotient (§2.2.1) | problems of counting arrangements, where the arrangements can be divided into collections that are all of the same size
pigeonhole principle (§2.2.3) | problems with two sets of objects, where one set of objects needs to be matched with the other
inclusion/exclusion principle (§2.4) | problems that involve finding the size of a union of sets, where some or all the sets in the union may have common elements
permutations (§2.2.1, 2.3.1, 2.3.3) | problems that require counting the number of selections or arrangements, where order within the selection or arrangement matters
combinations (§2.3.2, 2.3.3) | problems that require counting the number of selections or sets of choices, where order within the selection does not matter
recurrence relations (§2.3.6) | problems that require an answer depending on the integer n, where the solution to the problem for a given size n can be related to one or more cases of the problem for smaller sizes
generating functions (§2.3.7) | problems that can be solved by finding a closed form for a function that represents the problem and then manipulating the closed form to find a formula for the coefficients
Pólya counting (§2.6.5) | problems that require a listing or number of patterns, where the patterns are not to be regarded as different under certain types of motions (such as rotations and reflections)
Möbius inversion (§2.7.1) | problems that involve counting certain types of circular permutations
2.2 BASIC COUNTING TECHNIQUES

Most counting methods are based directly or indirectly on the fundamental principles 
and techniques presented in this section. The rules of sum, product, and quotient are the 
most basic and are applied more often than any other. The section also includes some 
applications of the pigeonhole principle, a brief introduction to generating functions, 
and several examples illustrating the use of tree diagrams and Venn diagrams. 


2.2.1 RULES OF SUM, PRODUCT, AND QUOTIENT 
Definitions: 

The rule of sum states that when there are m cases such that the ith case has $n_i$
options, for i = 1, ..., m, and no two of the cases have any options in common, the total
number of options is $n_1 + n_2 + \cdots + n_m$.

The rule of product states that when a procedure can be broken down into m steps,
such that there are $n_1$ options for step 1, and such that after the completion of step i−1
(i = 2, ..., m) there are $n_i$ options for step i, the number of ways of performing the
procedure is $n_1 n_2 \cdots n_m$.

The rule of quotient states that when a set S is partitioned into equal-sized subsets
of m elements each, there are $|S|/m$ subsets.

An m-permutation of a set S with n elements is a nonrepeating ordered selection 
of m elements of S, that is, a sequence of m distinct elements of S. An n-permutation 
is simply called a permutation of S. 

Facts: 

1. The rule of sum can be stated in set-theoretic terms: if sets $S_1, \ldots, S_m$ are finite
and pairwise disjoint, then $|S_1 \cup S_2 \cup \cdots \cup S_m| = |S_1| + |S_2| + \cdots + |S_m|$.

2. The rule of product can be stated in set-theoretic terms: if sets $S_1, \ldots, S_m$ are finite,
then $|S_1 \times S_2 \times \cdots \times S_m| = |S_1| \cdot |S_2| \cdots |S_m|$.

3. The rule of quotient can be stated in terms of the equivalence classes of an equiv- 
alence relation on a finite set S: if every class has m elements, then there are \S\/m 
equivalence classes. 

4. Venn diagrams (§1.2.2) are often used as an aid in counting the elements of a subset,
as an auxiliary to the rule of sum. This generalizes to the principle of
inclusion/exclusion (§2.4).

5. Counting problems can often be solved by using a combination of counting methods, 
such as the rule of sum and the rule of product. 

Examples: 

1. Counting bit strings: There are $2^n$ bit strings of length n, since such a bit string
consists of n bits, each of which is either 0 or 1.

2. Counting bit strings with restrictions: There are $2^{n-2}$ bit strings of length n ($n \ge 2$)
that begin with two 1s, since forming such a bit string consists of filling in n−2 positions
with 0s or 1s.

3. Counting palindromes: A palindrome is a string of symbols that is unchanged if
the symbols are written in reverse order, such as rpnbnpr or 10011001. There are $k^{\lceil n/2 \rceil}$
palindromes of length n where the symbols are chosen from a set of k symbols.

4. Counting the number of variable names: Determine the number of variable names,
subject to the following rules: a variable name has four or fewer characters, the first
character is a letter, the second and third are letters or digits, and the fourth must
be X or Y or Z. Partition the names into four sets, $S_1, S_2, S_3, S_4$, containing names of
length 1, 2, 3, and 4 respectively. Then $|S_1| = 26$, $|S_2| = 26 \times 36$, $|S_3| = 26 \times 36^2$, and
$|S_4| = 26 \times 36^2 \times 3$. Therefore the total number of names equals
$|S_1| + |S_2| + |S_3| + |S_4| = 135{,}746$.

5. Counting functions: There are $n^m$ functions from a set $A = \{a_1, \ldots, a_m\}$ to a set
$B = \{b_1, \ldots, b_n\}$. (Construct each function $f: A \to B$ by an m-step process, where
step i is to select the value $f(a_i)$.)

6. Counting one-to-one functions: There are $n(n-1)\cdots(n-m+1)$ one-to-one functions
from $A = \{a_1, \ldots, a_m\}$ to $B = \{b_1, \ldots, b_n\}$. If values $f(a_1), \ldots, f(a_{i-1})$ have already
been selected in set B during the first i − 1 steps, then there are n − i + 1 possible values
remaining for $f(a_i)$.

7. Counting permutations: There are $n(n-1)\cdots(n-m+1) = \frac{n!}{(n-m)!}$ m-permutations
of an n-element set. (Each one-to-one function in Example 6 may be viewed as an
m-permutation of B.) (Permutations are discussed in §2.3.)

8. Counting circular permutations: There are (n − 1)! ways to seat n people around
a round table (where rotations are regarded as equivalent, but the clockwise/counterclockwise
distinction is maintained). The total number of arrangements is n! and each
equivalence class contains n configurations. By the rule of quotient, the number of
arrangements is $\frac{n!}{n} = (n-1)!$.

9. Counting restricted circular permutations: If n women and n men are to be seated
around a circular table, with no two of the same sex seated next to each other, the
number of possible arrangements is $n\,[(n-1)!]^2 = n!\,(n-1)!$.
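
Several of the counts in these examples can be confirmed by direct enumeration for small parameters. The sketch below is illustrative only; the helper names (`count_palindromes`, `count_circular`) are assumptions introduced here and are not part of the Handbook.

```python
from itertools import permutations, product
from math import factorial


def count_palindromes(n, k):
    """Brute-force count of palindromes of length n over a k-symbol alphabet."""
    return sum(1 for s in product(range(k), repeat=n) if s == s[::-1])


def count_circular(n):
    """Count seatings of n people around a table, rotations being equivalent."""
    seen = set()
    for p in permutations(range(n)):
        i = p.index(0)              # rotate so that person 0 is listed first
        seen.add(p[i:] + p[:i])
    return len(seen)


# Example 4: rule of sum over the name lengths, rule of product within each length.
variable_names = 26 + 26 * 36 + 26 * 36**2 + 26 * 36**2 * 3

print(variable_names)                      # total number of variable names
print(count_palindromes(5, 3), 3 ** 3)     # k^ceil(n/2) with n = 5, k = 3
print(count_circular(5), factorial(4))     # (n - 1)! with n = 5
```

Fixing one person's position, as `count_circular` does, is the rule of quotient in disguise: each equivalence class of n rotations contributes exactly one canonical representative.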


2.2.2 TREE DIAGRAMS 

When a counting problem breaks into cases, a tree can be used to make sure that every 
case is counted, and that no case is counted twice. 

Definitions: 

A tree diagram is a line-drawing of a tree, often with its branches and/or nodes 
labeled. The root represents the start of a procedure and the branches at each node 
represent the options for the next step. 

Facts: 

1. Tree diagrams are commonly used as an important auxiliary to the rules of sum and
product.

2. The objective in a tree-counting approach is often one of the following:

• the number of leaves (endnodes)

• the number of nodes

• the sum of the path products.

Examples:

1. There are 6 possible sequences of wins and losses when the home team (H) plays the
visiting team (V) in a best 2-out-of-3 playoff. In the following tree diagram each edge
label indicates whether the home team won or lost the corresponding game, and the
label at each final node is the outcome of the playoff. The number of different possible
sequences equals the number of endnodes, which is 6.



2. Suppose that an experimental process begins by tossing two identical dice. If the
dice match, the process continues for a second round; if not, the process stops at one
round. Thus, an experimental outcome sequence consists of one or two unordered pairs of
numbers from 1 to 6. The three paths in the following tree represent the three different
kinds of outcome sequences. The total number of possible outcomes is the sum of the
path products $6^2 + 6 \cdot 15 + 15 = 141$.



2.2.3 PIGEONHOLE PRINCIPLE 
Definitions: 

The pigeonhole principle (Dirichlet drawer principle) states that if n + 1 objects
(pigeons) are placed into n boxes (pigeonholes), then some box contains more than one
object. (Peter Gustav Lejeune Dirichlet, 1805-1859)

The generalized pigeonhole principle states that if m objects are placed into k
boxes, then some box contains at least $\lceil m/k \rceil$ objects.

The set-theoretic form of the pigeonhole principle states that if $f: S \to T$ where S
and T are finite and any two of the following conditions hold, then so does the third:

• f is one-to-one

• f is onto

• $|S| = |T|$.

Examples:

1. Among any group of eight people, at least two were born on the same day of the
week. This follows since there are seven pigeonholes (the seven days of the week) and
more than seven pigeons (the eight people).

2. Among any group of 25 people, at least four were born on the same day of the week.
This follows from the generalized pigeonhole principle with m = 25 and k = 7, yielding
$\lceil m/k \rceil = \lceil 25/7 \rceil = 4$.

3. Suppose that a dresser drawer contains many black socks and blue socks. If choosing 
in total darkness, a person must grab at least three socks to be absolutely certain of 
having a pair of the same color. The two colors are pigeonholes; the pigeonhole principle 
says that three socks (the pigeons) are enough. 

4. What is the minimum number of points whose placement in the interior of a 2 × 2
square guarantees that at least two of them are less than $\sqrt{2}$ units apart? Four points
are not enough, since they could be placed near the respective corners of the 2 × 2
square. To see that five is enough, partition the 2 × 2 square into four 1 × 1 squares.
By the pigeonhole principle, one of these 1 × 1 squares must contain at least two of the
points, and these two must be less than $\sqrt{2}$ units apart.

5. In any set of n + 1 positive integers, each less than or equal to 2n, there are at least
two such that one is a multiple of the other. To see this, express each of the n + 1
numbers in the form $2^k \cdot q$, where q is odd. Since there are only n possible odd values
for q between 1 and 2n, at least two of the n + 1 numbers must have the same q, and
the result follows.

6. Let $B_1$ and $B_2$ be any two bit strings, each consisting of five ones and five zeros.
Then there is a cyclic shift of bit string $B_2$ so that the resulting string, $B_2'$, matches $B_1$
in at least five of its positions. For example, if $B_1 = 1010101010$ and $B_2 = 0001110101$,
then $B_2' = 1000111010$ satisfies the condition. Observe that there are 10 possible cyclic
shifts of bit string $B_2$. For i = 1, ..., 10, the ith bit of exactly five of these strings will
match the ith bit of $B_1$. Thus, there is a total of 50 bit matches over the set of 10 cyclic
shifts. The generalized pigeonhole principle implies that there is at least one cyclic shift
having $\lceil 50/10 \rceil = 5$ matching bits.

7. Every sequence of $n^2 + 1$ distinct real numbers must have an increasing or decreasing
subsequence of length n + 1. Given a sequence $a_1, \ldots, a_{n^2+1}$, for each $a_j$ let $d_j$ and $i_j$
be the lengths of the longest decreasing and increasing subsequences beginning with $a_j$.
This gives a sequence of $n^2 + 1$ ordered pairs $(d_j, i_j)$. If there were no increasing or
decreasing subsequence of length n + 1, then there would be only $n^2$ possible ordered pairs
$(d_j, i_j)$, since $1 \le d_j \le n$ and $1 \le i_j \le n$. By the pigeonhole principle, at least two
ordered pairs must be identical. Hence there are p < q such that $d_p = d_q$ and $i_p = i_q$.
If $a_p < a_q$, then $a_p$ followed by the increasing subsequence starting at $a_q$
gives an increasing subsequence beginning with $a_p$ of length $i_q + 1 > i_p$, a contradiction. A similar
contradiction on the choice of $d_p$ follows if $a_q < a_p$. Hence a decreasing or increasing
subsequence of length n + 1 must exist.
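
The counting argument of Example 6 can be traced numerically: summing the matches over all ten cyclic shifts gives 50, so some shift must reach at least 5. The sketch below uses the two strings from that example; the function names are illustrative assumptions.

```python
def matches(a, b):
    """Number of positions at which the equal-length strings a and b agree."""
    return sum(x == y for x, y in zip(a, b))


def cyclic_shifts(s):
    """All cyclic shifts of the string s."""
    return [s[i:] + s[:i] for i in range(len(s))]


b1 = "1010101010"
b2 = "0001110101"

total = sum(matches(b1, s) for s in cyclic_shifts(b2))
best = max(cyclic_shifts(b2), key=lambda s: matches(b1, s))

print(total)                      # total matches over all 10 shifts
print(best, matches(b1, best))    # a best shift and its number of matches
```

The pigeonhole bound only promises 5 matching positions; a particular pair of strings may do better, as the shift 1000111010 quoted in the example does.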


2.2.4 SOLVING COUNTING PROBLEMS USING RECURRENCE RELATIONS 

Certain types of counting problems can be solved by modeling the problem using a
recurrence relation (§3.3) and then working with the recurrence relation.

Facts:

1. The following general procedure is used for solving a counting problem using a
recurrence relation:

• let $a_n$ be the solution of the counting problem for the parameter n;

• determine a recurrence relation for $a_n$, together with the appropriate number of
initial conditions;

• find the particular value of the sequence that solves the original counting problem
by repeated use of the recurrence relation or by finding an explicit formula for
$a_n$ and evaluating it at n.

2. There are many techniques for solving recurrence relations which may be useful in
the solution of counting problems. Section 3.3 provides general material on recurrence
relations and contains many examples illustrating how counting problems are solved
using recurrence relations.

Examples: 

1. Tower of Hanoi: The Tower of Hanoi puzzle consists of three pegs mounted on a
board and n disks of different sizes. Initially the disks are on the first peg in order of
decreasing size. See the following figure, using four disks. The rules allow disks to be
moved one at a time from one peg to another, with no disk ever placed atop a smaller
one. The goal of the puzzle is to move the tower of disks to the second peg, with the
largest on the bottom. How many moves are needed to solve this puzzle for 64 disks?

Let $a_n$ be the minimum number of moves to solve the Tower of Hanoi puzzle with n
disks. Transferring the n−1 smallest disks from peg 1 to peg 3 requires $a_{n-1}$ moves. One
move is required to transfer the largest disk to peg 2, and transferring the n−1 disks now
on peg 3 to peg 2, placing them atop the largest disk, requires $a_{n-1}$ moves. Hence, the
puzzle with n disks can be solved using $2a_{n-1} + 1$ moves. The puzzle for n disks cannot
be solved in fewer steps, since then the puzzle with n−1 disks could be solved using fewer
than $a_{n-1}$ moves. Hence $a_n = 2a_{n-1} + 1$. The initial condition is $a_1 = 1$. Iterating shows
that $a_n = 2a_{n-1} + 1 = 2^2 a_{n-2} + 2 + 1 = \cdots = 2^{n-1}a_1 + 2^{n-2} + \cdots + 2^2 + 2 + 1 = 2^n - 1$.
Hence, $2^{64} - 1$ moves are required to solve this problem for 64 disks. (§3.3.3 Example 3
and §3.3.4 Example 1 provide alternative methods for solving this recurrence relation.)
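
The recursive strategy described in this example translates directly into a short program that lists the moves, and counting them confirms $a_n = 2^n - 1$ for small n. This is a minimal sketch; the function name `hanoi` and the peg numbering in its signature are illustrative assumptions.

```python
def hanoi(n, source=1, target=2, spare=3, moves=None):
    """Return the list of (from_peg, to_peg) moves that transfers n disks."""
    if moves is None:
        moves = []
    if n == 0:
        return moves
    hanoi(n - 1, source, spare, target, moves)   # n-1 smallest disks out of the way
    moves.append((source, target))               # largest disk to the target peg
    hanoi(n - 1, spare, target, source, moves)   # n-1 smallest disks back on top
    return moves


for n in range(1, 6):
    print(n, len(hanoi(n)), 2 ** n - 1)

print(2 ** 64 - 1)   # number of moves for 64 disks
```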



2. Reve’s puzzle: The Reve’s puzzle is the variation of the Tower of Hanoi puzzle that
follows the same rules as the Tower of Hanoi puzzle, but uses four pegs.

The minimum number of moves needed to solve the Reve’s puzzle for n disks is not
known, but it is conjectured that this number is
$R(n) = \sum_{i=1}^{k} i\,2^{i-1} - \left(\frac{k(k+1)}{2} - n\right)2^{k-1}$,
where k is the smallest integer such that $n \le \frac{k(k+1)}{2}$.

The following recursive algorithm, the Frame-Stewart algorithm, gives a method
for solving the Reve’s puzzle by moving n disks from peg 1 to peg 4 in R(n) moves. If
n = 1, move the single disk from peg 1 to peg 4. If n > 1, with k chosen as above: recursively
move the n − k smallest disks from peg 1 to peg 2 using the Frame-Stewart algorithm; then move the
k largest disks from peg 1 to peg 4 using the 3-peg algorithm from Example 1 on pegs
1, 3, and 4; and finally recursively move the n − k smallest disks from peg 2 to peg 4
using the Frame-Stewart algorithm.
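
The closed form $R(n) = \sum_{i=1}^{k} i\,2^{i-1} - \left(\frac{k(k+1)}{2} - n\right)2^{k-1}$ can be compared with the move count of a Frame-Stewart style recursion. The sketch below is an assumption-laden illustration: instead of fixing the split in advance, `frame_stewart` minimizes over every possible number j of largest disks moved with the 3-peg method, and the function names are hypothetical.

```python
from functools import lru_cache


def reve_formula(n):
    """Conjectured move count R(n) for the four-peg puzzle."""
    k = 1
    while n > k * (k + 1) // 2:       # smallest k with n <= k(k+1)/2
        k += 1
    total = sum(i * 2 ** (i - 1) for i in range(1, k + 1))
    return total - (k * (k + 1) // 2 - n) * 2 ** (k - 1)


@lru_cache(maxsize=None)
def frame_stewart(n):
    """Fewest moves over all splits: n-j disks on 4 pegs, j disks on 3 pegs."""
    if n <= 1:
        return n
    return min(2 * frame_stewart(n - j) + 2 ** j - 1 for j in range(1, n + 1))


for n in range(1, 13):
    print(n, reve_formula(n), frame_stewart(n))
```

For small n the two columns agree (1, 3, 5, 9, 13, 17, 25, ...), which is consistent with the choice of k described in the algorithm.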
3. How many strings of 4 decimal digits contain an even number of 0s? Let $a_n$ be the
number of strings of n decimal digits that contain an even number of 0s. To obtain such
a string: (1) append a nonzero digit to a string of n − 1 decimal digits that has an even
number of 0s, which can be done in $9a_{n-1}$ ways; or (2) append a 0 to a string of n − 1
decimal digits that has an odd number of 0s, which can be done in $10^{n-1} - a_{n-1}$ ways.
Hence $a_n = 9a_{n-1} + (10^{n-1} - a_{n-1}) = 8a_{n-1} + 10^{n-1}$. The initial condition is $a_1 = 9$.
It follows that $a_2 = 8a_1 + 10 = 82$, $a_3 = 8a_2 + 100 = 756$, and $a_4 = 8a_3 + 1{,}000 = 7{,}048$.
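
The recurrence $a_n = 8a_{n-1} + 10^{n-1}$ with $a_1 = 9$ is easy to iterate and to cross-check by enumerating all digit strings for small n. This is a sketch; both function names are illustrative assumptions.

```python
from itertools import product


def even_zero_strings(n):
    """Iterate a_n = 8*a_(n-1) + 10^(n-1), starting from a_1 = 9."""
    a, power = 9, 10                  # a_1 and 10^1
    for _ in range(2, n + 1):
        a, power = 8 * a + power, power * 10
    return a


def even_zero_strings_brute(n):
    """Count n-digit strings with an even number of 0s by enumeration."""
    return sum(1 for s in product("0123456789", repeat=n) if s.count("0") % 2 == 0)


for n in range(1, 5):
    print(n, even_zero_strings(n), even_zero_strings_brute(n))
```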


2.2.5 SOLVING COUNTING PROBLEMS USING GENERATING FUNCTIONS 

Some counting problems can be solved by finding a closed form for the function that 
represents the problem and then manipulating the closed form to find the relevant 
coefficient. 

Facts: 

1. Use the following procedure for solving a counting problem by using a generating
function:

• let $a_n$ be the solution of the counting problem for the parameter n;

• find a closed form for the generating function f(x) that has $a_n$ as the coefficient
of $x^n$ in its power series;

• solve the counting problem by computing $a_n$ by expanding the closed form and
examining the coefficient of $x^n$.

2. Generating functions can be used to solve counting problems that reduce to finding
the number of solutions to an equation of the form $x_1 + x_2 + \cdots + x_n = k$, where k is a
positive integer and the $x_i$'s are integers subject to constraints.

3. There are many techniques for manipulating generating functions (§3.2, §3.3.5) 
which may be useful in the solution of counting problems. Section 3.2 contains ex- 
amples of counting problems solved using generating functions. 

Examples: 

1. How many ways are there to distribute eight identical cookies to three children if
each child receives at least two and no more than four cookies? Let $c_n$ be the number
of ways to distribute n identical cookies in this way. Then $c_n$ is the coefficient of $x^n$
in $(x^2 + x^3 + x^4)^3$, since a distribution of n cookies to the three children is equivalent
to a solution of $x_1 + x_2 + x_3 = n$ with $2 \le x_i \le 4$ for i = 1, 2, 3. Expanding this
product shows that $c_8$, the coefficient of $x^8$, is 6. Hence there are 6 ways to distribute
the cookies.

2. An urn contains colored balls, where each ball is either red, blue, or black, there are
at least ten balls of each color, and balls of the same color are indistinguishable. Find
the number of ways to select ten balls from the urn, so that an odd number of red balls,
an even number of blue balls, and at least five black balls are selected. If $x_1$, $x_2$, and
$x_3$ denote the number of red balls, blue balls, and black balls selected, respectively, the
answer is provided by the number of nonnegative integer solutions of $x_1 + x_2 + x_3 = 10$
with $x_1$ odd, $x_2$ even, $x_3 \ge 5$. This is the coefficient of $x^{10}$ in the generating function
$f(x) = (x + x^3 + x^5 + x^7 + x^9 + \cdots)(1 + x^2 + x^4 + x^6 + x^8 + x^{10} + \cdots)(x^5 + x^6 + x^7 + x^8 + x^9 + x^{10} + \cdots)$.
Since the coefficient of $x^{10}$ in the expansion is 6, there are six ways
to select the balls as specified.
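
Both coefficients can be extracted mechanically by multiplying truncated polynomials, which is all that "expanding the product" requires here. The sketch below represents a polynomial as a list of coefficients indexed by exponent; `poly_mul` and `series` are illustrative helper names, not Handbook notation.

```python
def poly_mul(p, q, max_deg):
    """Product of coefficient lists p and q, truncated above degree max_deg."""
    r = [0] * (max_deg + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > max_deg:
                break
            r[i + j] += a * b
    return r


def series(exponents, max_deg):
    """Coefficient list with a 1 at each allowed exponent up to max_deg."""
    return [1 if e in exponents else 0 for e in range(max_deg + 1)]


# Example 1: (x^2 + x^3 + x^4)^3, coefficient of x^8.
child = series({2, 3, 4}, 8)
cookies = poly_mul(poly_mul(child, child, 8), child, 8)[8]

# Example 2: odd red, even blue, at least five black, coefficient of x^10.
red = series(range(1, 11, 2), 10)
blue = series(range(0, 11, 2), 10)
black = series(range(5, 11), 10)
urn = poly_mul(poly_mul(red, blue, 10), black, 10)[10]

print(cookies, urn)
```

Truncating every factor at the target degree is safe because higher powers cannot contribute to the coefficient being read off.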
2.3 PERMUTATIONS AND COMBINATIONS


Permutations count the number of arrangements of objects, and combinations count 
the number of ways to select objects from a set. A permutation coefficient counts the 
number of ways to arrange a set of objects, whereas a combination coefficient counts 
the number of ways to select a subset. 


2.3.1 ORDERED SELECTION: FALLING POWERS 

Falling powers mathematically model the process of selecting k items from a collection 
of n items in circumstances where the ordering of the selection matters and repetition 
is not allowed. 

Definitions: 

An ordered selection of k items from a set S is a nonrepeating list of k items from S.

The falling power $x^{\underline{k}}$ is the product $x(x-1)\cdots(x-k+1)$ of k decreasing factors
starting at the real number x.

The number n factorial, n! (n a nonnegative integer), is defined by the rule 0! = 1,
$n! = n(n-1)\cdots 3 \cdot 2 \cdot 1$ if $n \ge 1$.

A permutation of a list is any rearrangement of the list.

A permutation of a set of n items is an arrangement of those items into a list. (Often,
such a list and/or the permutation itself is represented by a string whose entries are in
the list order.)

A k-permutation of a set of n items is an ordered selection of k items from that set.
A k-permutation can be written as a sequence or a string.

The permutation coefficient P(n, k) is the number of ways to choose an ordered
selection of k items from a set of n items; that is, the number of k-permutations.

A derangement of a list is a permutation of the entries such that no entry remains 
in the original position. 

Facts: 

1. The falling power $x^{\underline{k}}$ is analogous to the ordinary power $x^k$, which is the product
of k constant factors x. The underline in the exponent of the falling power is a reminder
that consecutive factors drop.

2. $P(n,k) = n^{\underline{k}} = \frac{n!}{(n-k)!}$.

3. For any nonnegative integer n, $n^{\underline{n}} = n!$.

4. The numbers $P(n,k) = n^{\underline{k}}$ are given in Table 1.

5. A repetition-free list of length n has approximately $n!/e$ derangements.

Examples:

1. $(4.2)^{\underline{3}} = 4.2 \cdot 3.2 \cdot 2.2 = 29.568$.

2. Dealing a row of playing cards: Suppose that five cards are to be dealt from a deck
of 52 cards and placed face up in a row. There are $P(52, 5) = 52^{\underline{5}} = 52 \cdot 51 \cdot 50 \cdot 49 \cdot 48 =
311{,}875{,}200$ ways to do this.
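Table 1 Permutation coefficients $P(n,k) = n^{\underline{k}}$.

n \ k | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
0 | 1 | | | | | | | | | |
1 | 1 | 1 | | | | | | | | |
2 | 1 | 2 | 2 | | | | | | | |
3 | 1 | 3 | 6 | 6 | | | | | | |
4 | 1 | 4 | 12 | 24 | 24 | | | | | |
5 | 1 | 5 | 20 | 60 | 120 | 120 | | | | |
6 | 1 | 6 | 30 | 120 | 360 | 720 | 720 | | | |
7 | 1 | 7 | 42 | 210 | 840 | 2,520 | 5,040 | 5,040 | | |
8 | 1 | 8 | 56 | 336 | 1,680 | 6,720 | 20,160 | 40,320 | 40,320 | |
9 | 1 | 9 | 72 | 504 | 3,024 | 15,120 | 60,480 | 181,440 | 362,880 | 362,880 |
10 | 1 | 10 | 90 | 720 | 5,040 | 30,240 | 151,200 | 604,800 | 1,814,400 | 3,628,800 | 3,628,800
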


3. Placing distinct balls into distinct bins: k differently-colored balls are to be placed
into n bins ($n \ge k$), with at most one ball to a bin. The number of different ways to
arrange the balls is $P(n, k) = n^{\underline{k}}$. (Think of the balls as if they were numbered 1 to k,
so that placing ball j into a bin corresponds to placing that bin into the jth position of
the list.)

4. Counting ballots: Each voter is asked to identify 3 top choices from 11 candidates
running for office. A first choice vote is worth 3 points, second choice 2 points, and
third choice 1 point. Since a completed ballot is an ordered selection in this situation,
each voter has $P(11, 3) = 11^{\underline{3}} = 11 \cdot 10 \cdot 9 = 990$ distinct ways to cast a vote.

5. License plate combinations: The license plates in a state have three letters (from the
upper-case Roman alphabet of 26 letters) followed by four digits, with no letter and no
digit repeated on a plate. There are P(26, 3) =
15,600 ways to select the letters and P(10, 4) = 5,040 ways to select the digits. By
the rule of product there are P(26, 3)P(10, 4) = 15,600 · 5,040 = 78,624,000 acceptable
strings.

6. Circular permutations of distinct objects: See Example 8 of §2.2.1. Also see
Example 3 of §2.7.1 for problems that allow identical objects.

7. Increasing and decreasing subsequences of permutations: Young tableaux (§2.8)
can be used to find the number of permutations of {1, 2, ..., n} with specified lengths
of their longest increasing subsequences and longest decreasing subsequences.
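
The falling power, the permutation coefficient, and the derangement estimate $n!/e$ from this subsection can all be checked numerically. The following is a minimal sketch; `falling_power` and `derangements` are illustrative names, and `math.perm` (available in Python 3.8 and later) is used only as an independent check.

```python
from itertools import permutations
from math import e, factorial, perm


def falling_power(x, k):
    """x(x-1)...(x-k+1): product of k factors decreasing from x."""
    result = 1
    for i in range(k):
        result *= x - i
    return result


def derangements(n):
    """Count permutations of 0..n-1 that leave no entry in its original position."""
    return sum(1 for p in permutations(range(n))
               if all(p[i] != i for i in range(n)))


print(falling_power(4.2, 3))                   # approximately 29.568
print(falling_power(52, 5), perm(52, 5))       # P(52, 5) computed two ways
print(falling_power(11, 3))                    # ballots in Example 4
print([falling_power(10, k) for k in range(11)])   # row n = 10 of the P(n, k) table
print(derangements(6), factorial(6) / e)       # exact count versus n!/e
```

Because `falling_power` accepts any real x, it covers cases such as 4.2 where the factorial quotient n!/(n−k)! is not available.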


2.3.2 UNORDERED SELECTION: BINOMIAL COEFFICIENTS 

Binomial coefficients mathematically model the process of selecting k items from a 
collection of n items in circumstances where the ordering of the selection does not 
matter, and repetitions are not allowed. 

Definitions: 

An unordered selection of k items from a set S is a subset of k items from S.

A k-combination from a set S is an unordered selection of k items.

The combination coefficient C(n, k) is the number of k-combinations of n objects.
The binomial coefficient $\binom{n}{k}$ is the coefficient of $x^k y^{n-k}$ in the expansion of $(x + y)^n$.

The extended binomial coefficient (generalized binomial coefficient) $\binom{n}{k}$ is zero
whenever k is negative. When n is a negative integer and k a nonnegative integer, its
value is $(-1)^k \binom{k-n-1}{k}$.

The multicombination coefficient $C(n: k_1, k_2, \ldots, k_m)$, where $n = k_1 + k_2 + \cdots + k_m$,
denotes the number of ways to partition n items into subsets of sizes $k_1, k_2, \ldots, k_m$.

The multinomial coefficient $\binom{n}{k_1\, k_2\, \ldots\, k_m}$ is the coefficient of $x_1^{k_1} x_2^{k_2} \cdots x_m^{k_m}$ in the
expansion of $(x_1 + x_2 + \cdots + x_m)^n$.

The Gaussian binomial coefficient is defined for nonnegative integers n and k by

$\left[{n \atop k}\right] = \frac{q^n - 1}{q - 1} \cdot \frac{q^{n-1} - 1}{q^2 - 1} \cdots \frac{q^{n-k+1} - 1}{q^k - 1}$ for $0 < k \le n$

and $\left[{n \atop 0}\right] = 1$, where q is a variable. (See also §2.5.1.)


Facts: 

1. $C(n,k) = \frac{P(n,k)}{k!} = \frac{n^{\underline{k}}}{k!} = \frac{n!}{k!\,(n-k)!} = \binom{n}{k}$.

2. Pascal’s recursion: $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$, where n > 0 and k > 0.

3. Subsets: There are C(n, k) subsets of size k that can be chosen from a set of size n.

4. The numbers $C(n,k) = \binom{n}{k}$ are given in Table 2. Sometimes the entries in Table 2
are arranged into the form called Pascal’s triangle (Table 3), in which each entry is the
sum of the two numbers diagonally above the number (Pascal’s recursion, Fact 2).


Table 2 Combination coefficients (binomial coefficients) $C(n,k) = \binom{n}{k}$.



5. The extended binomial coefficients satisfy Pascal’s recursion. Their definition is
constructed precisely to achieve this purpose.

6. $C(n: k_1, k_2, \ldots, k_m) = \frac{n!}{k_1!\,k_2! \cdots k_m!} = \binom{n}{k_1\, k_2\, \ldots\, k_m}$. The number of strings of length n
with $k_i$ objects of type i (i = 1, 2, ..., m) is $\frac{n!}{k_1!\,k_2! \cdots k_m!}$.

7. $C(n, k) = C(n: k, n-k) = C(n, n-k)$. That is, the number of unordered selections
of k objects chosen from n objects is equal to the number of unordered selections of
n − k objects chosen from n objects.

8. Gaussian binomial coefficient identities:

• $\left[{n \atop k}\right] = \left[{n \atop n-k}\right]$;

• $\left[{n \atop k}\right] + \left[{n \atop k-1}\right] q^{n+1-k} = \left[{n+1 \atop k}\right]$.

Table 3 Pascal’s triangle.

                        1
                      1   1
                    1   2   1
                  1   3   3   1
                1   4   6   4   1
              1   5  10  10   5   1
            1   6  15  20  15   6   1
          1   7  21  35  35  21   7   1
        1   8  28  56  70  56  28   8   1
      1   9  36  84 126 126  84  36   9   1
    1  10  45 120 210 252 210 120  45  10   1


9. $(1 + x)(1 + qx)(1 + q^2 x) \cdots (1 + q^{n-1} x) = \sum_{k=0}^{n} \left[{n \atop k}\right] q^{k(k-1)/2} x^k$.

10. $\lim_{q \to 1} \left[{n \atop k}\right] = \binom{n}{k}$.

11. $\left[{n \atop k}\right] = a_0 + a_1 q + a_2 q^2 + \cdots + a_{k(n-k)} q^{k(n-k)}$, where each $a_i$ is an integer and
$\sum_{i=0}^{k(n-k)} a_i = \binom{n}{k}$.
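
The integer coefficients $a_0, a_1, \ldots, a_{k(n-k)}$ of a Gaussian binomial coefficient can be generated from the q-analogue of Pascal's recursion, $\left[{n \atop k}\right] = \left[{n-1 \atop k}\right] + q^{n-k}\left[{n-1 \atop k-1}\right]$. The sketch below stores a polynomial in q as a list of coefficients; the function names are illustrative assumptions.

```python
from math import comb


def poly_add(p, q):
    """Add two polynomials given as coefficient lists (index = exponent of q)."""
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(size)]


def gaussian_binomial(n, k):
    """Coefficient list [a_0, a_1, ...] of the Gaussian binomial coefficient."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    # q-Pascal recursion: [n, k] = [n-1, k] + q^(n-k) * [n-1, k-1]
    shifted = [0] * (n - k) + gaussian_binomial(n - 1, k - 1)
    return poly_add(gaussian_binomial(n - 1, k), shifted)


coeffs = gaussian_binomial(6, 2)
print(coeffs)                      # coefficients of 1, q, q^2, ..., q^8
print(sum(coeffs), comb(6, 2))     # setting q = 1 gives the binomial coefficient
```

Summing the list corresponds to evaluating the polynomial at q = 1, which is how the limit in Fact 10 and the coefficient sum in Fact 11 show up numerically.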

Examples: 

1. Subsets: A set with 20 elements has C(20, 4) subsets with four elements. The total
number of subsets of a set with 20 elements is equal to $C(20, 0) + C(20, 1) + \cdots + C(20, 20)$,
which is equal to $2^{20}$. (See §2.3.4.)

2. Nondistinct balls into distinct bins: k identically colored balls are to be placed
into n bins ($n \ge k$), at most one ball to a bin. The number of different ways to do this
is $C(n,k) = \frac{n^{\underline{k}}}{k!}$. (This amounts to selecting from the n bins the k bins into which the
balls are placed.)

3. Counting ballots: Each voter is asked to identify 3 choices for trustee from 11
candidates nominated for the position, without specifying any order of preference. Since
a completed ballot is an unordered selection in this situation, each voter has $C(11, 3) =
\frac{11 \cdot 10 \cdot 9}{3!} = 165$ distinct ways to cast a vote.

4. Counting bit strings with exactly k 0s: There are $\binom{n}{k}$ bit strings of length n with
exactly k 0s, since each such bit string is determined by choosing a subset of size k from
the n positions; 0s are placed in these k positions, and 1s in the remaining positions.

5. Counting bit strings with at least k 0s: There are $\binom{n}{k} + \binom{n}{k+1} + \cdots + \binom{n}{n}$ bit strings
of length n with at least k 0s, since each such bit string is determined by choosing a
subset of size k, k + 1, ..., or n from the n positions; 0s are placed in these positions,
and 1s in the remaining positions.

6. Counting bit strings with equal numbers of 0s and 1s: For n even, there are $\binom{n}{n/2}$
bit strings of length n with equal numbers of 0s and 1s, since each such bit string is
determined by choosing a subset of size $\frac{n}{2}$ from the n positions; 0s are placed in these
positions, and 1s in the remaining positions.

7. Counting strings with repeated letters: The word “MISSISSIPPI” has eleven letters,
with “I” and “S” appearing four times each, “P” appearing twice, and “M” once. There
are $C(11: 4, 4, 2, 1) = \frac{11!}{4!\,4!\,2!\,1!} = 34{,}650$ possible different strings obtainable by permuting
the letters. This counting problem is equivalent to partitioning 11 items into subsets of
sizes 4, 4, 2, 1.

8. Counting circular strings with repeated letters: See §2.7.1.

9. Counting paths: The number of paths in the plane from (0,0) to a point (m, n)
($m, n \ge 0$) that move one unit upward or one unit to the right at each step is $\binom{m+n}{n}$.
Using U for “up” and R for “right”, each path can be described by a string of m Rs
and n Us.


10. Playoff series: In a series of playoff games, such as the World Series or Stanley Cup
finals, the winner is the first team to win more than half the maximum number of games
possible, n (odd). The winner must win $\frac{n+1}{2}$ games. The number of possible win-loss
sequences of such a series is $2\,C\!\left(n, \frac{n+1}{2}\right)$. For example, in the World Series between
teams A and B, any string of length 7 with exactly 4 As represents a winning sequence
for A. (The string AABABBA means that A won a seven-game series by winning the
first, second, fourth, and seventh games; the string AAAABBB means that A won the
series by winning the first four games.) There are C(7, 4) ways for A to win the World
Series, and C(7, 4) ways for B to win the World Series.


11. Dealing a hand of playing cards: A hand of five cards (where order does not
matter) can be dealt from a deck of 52 cards in $C(52, 5) = \frac{52!}{5!\,47!} = 2{,}598{,}960$ ways.

12. Poker hands: Table 4 contains the number of combinations of five cards that form
various poker hands (where an ace can be high or low):


13. Counting partial functions: There are $\binom{k}{0} + \binom{k}{1}n + \binom{k}{2}n^2 + \cdots + \binom{k}{k}n^k$ partial
functions $f: A \to B$ where $|A| = k$ and $|B| = n$. Each partial function is determined by
choosing a domain of definition for the function, which can be done, for each size j = 0, ..., k,
in $\binom{k}{j}$ ways. Once a domain of definition is determined, there are $n^j$ ways to define a
function on that set. (The sum can be simplified to $(n + 1)^k$.)


14. 


= £=± = i 

9-1 


15. [®] = T=T-M = M-^ = (<7 4 + ‘7 2 + l)(9 4 + 9 3 + 9 2 + 9+l) = l + 9 + 2<Z 2 + 
2 q 3 + 3 q 4 + 2 q 5 + 2 q 6 + q 7 + q a . The sum of these coefficients is 15 = (®), as Fact 11 
predicts. 


16. A particle moves in the plane from (0,0) to (n − k, k) by moving one unit at a
time in either the positive x or positive y direction. The number of such paths where
the area bounded by the path, the x-axis, and the vertical line x = n − k is i units is
equal to $a_i$, where $a_i$ is the coefficient of $q^i$ in the expansion of the Gaussian binomial
coefficient $\begin{bmatrix} n \\ k \end{bmatrix}$ in Fact 11.
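
The Gaussian binomial coefficients of Examples 15 and 16 can be checked numerically. The following is a minimal Python sketch, not part of the Handbook: the function names are illustrative, and the recurrence used (the q-analogue of Pascal's recursion, $\begin{bmatrix} n \\ k \end{bmatrix} = \begin{bmatrix} n-1 \\ k-1 \end{bmatrix} + q^k \begin{bmatrix} n-1 \\ k \end{bmatrix}$) is a standard identity assumed here rather than quoted from the text above.

```python
from itertools import combinations
from math import comb


def gaussian_binomial(n, k):
    """Coefficients [a_0, a_1, ...] of the Gaussian binomial coefficient [n, k] in q."""
    if k < 0 or k > n:
        return [0]
    if k == 0 or k == n:
        return [1]
    left = gaussian_binomial(n - 1, k - 1)
    right = [0] * k + gaussian_binomial(n - 1, k)  # multiply by q^k
    size = max(len(left), len(right))
    left += [0] * (size - len(left))
    right += [0] * (size - len(right))
    return [a + b for a, b in zip(left, right)]


def paths_by_area(n, k):
    """Count lattice paths from (0,0) to (n-k, k) by the area lying under the path."""
    counts = [0] * (k * (n - k) + 1)
    for up_steps in combinations(range(n), k):
        ups = set(up_steps)
        height = area = 0
        for step in range(n):
            if step in ups:
                height += 1      # step upward
            else:
                area += height   # step right adds a column of the current height
        counts[area] += 1
    return counts


print(gaussian_binomial(6, 2))
```

Under these assumptions the coefficient list for n = 6, k = 2 agrees with the expansion in Example 15, and `paths_by_area(6, 2)` returns the same list, as Example 16 states.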


2.3.3 SELECTION WITH REPETITION 

Some problems concerning counting the number of ways to select k objects from a set 
of n objects permit choices of objects to be repeated. Some of these situations are also 
modeled by binomial coefficients. 

Definitions: 

An ordered selection with replacement is an ordered selection in which each object 
in the selection set can be chosen arbitrarily often. 

An ordered selection with specified replacement fixes the number of times each 
object is to be chosen. 

An unordered selection with replacement is a selection in which each object in 
the selection set can be chosen arbitrarily often. 
Table 4 Number of poker hands.

royal flush (ace, king, queen, jack, 10 in same suit)
    formula: 4
    explanation: 4 choices for a suit, and 1 royal flush in each suit

straight flush (5 cards of 5 consecutive ranks, all in 1 suit, but not a royal flush)
    formula: $\binom{4}{1} \cdot 9$
    explanation: 4 choices for a suit, and in each suit there are 9 ways to get 5 cards in a row

four of a kind (4 cards in 1 rank and a fifth card)
    formula: $\binom{13}{1}\binom{48}{1}$
    explanation: 13 choices for a rank, only 1 way to select the 4 cards in that rank,
    and 48 ways to select a fifth card

full house (3 cards of 1 rank, 2 of another rank)
    formula: $13\binom{4}{3} \cdot 12\binom{4}{2}$
    explanation: 13 ways to select a rank for the 3-of-a-kind and $\binom{4}{3}$ ways to choose
    3 of this rank; 12 ways to select a rank for the pair and $\binom{4}{2}$ ways to
    get a pair of this rank

flush (5 cards in 1 suit, but neither royal nor straight flush)
    formula: $4\binom{13}{5} - 4 \cdot 10$
    explanation: 4 ways to select suit, $\binom{13}{5}$ ways to choose 5 cards in that suit;
    subtract royal and straight flushes

straight (5 cards in 5 consecutive ranks, but not all of the same suit)
    formula: $10 \cdot 4^5 - 4 \cdot 10$
    explanation: 10 ways to choose 5 ranks in a row and 4 ways to choose a card from
    each rank; then subtract royal and straight flushes

three of a kind (3 cards of 1 rank, and 2 cards of 2 different ranks)
    formula: $13\binom{4}{3}\binom{12}{2} 4^2$
    explanation: 13 ways to select 1 rank, $\binom{4}{3}$ ways to choose 3 cards of that rank;
    $\binom{12}{2}$ ways to pick 2 other ranks and $4^2$ ways to pick a card of each of
    those 2 ranks

two pairs (2 cards in each of 2 different ranks, and a fifth card of a third rank)
    formula: $\binom{13}{2}\binom{4}{2}\binom{4}{2} \cdot 44$
    explanation: $\binom{13}{2}$ ways to select 2 ranks and $\binom{4}{2}$ ways to choose 2 cards in each of
    these ranks, and $\binom{44}{1}$ ways to pick a nonmatching fifth card

one pair (2 cards in 1 rank, plus 3 cards from 3 other ranks)
    formula: $13\binom{4}{2}\binom{12}{3} 4^3$
    explanation: 13 ways to select a rank, $\binom{4}{2}$ ways to choose 2 cards in that rank;
    $\binom{12}{3}$ ways to pick 3 other ranks, and $4^3$ ways to pick 1 card from
    each of those ranks

The permutation-with-replacement coefficient $P^R(n,k)$ is the number of ways to
choose a possibly repeating list of k items from a set of n items.

The combination-with-replacement coefficient $C^R(n,k)$ is the number of ways to
choose a multiset of k items from a set of n items.
Facts:

1. An ordered selection with replacement can be thought of as an ordered
list of names, obtained by selecting an object from a set, writing its name, placing it
back in the set, and repeating the process.

2. The number of ways to make an ordered selection with replacement of k items from n
distinct items (with arbitrary repetition) is $n^k$. Thus $P^R(n,k) = n^k$.

3. The number of ways to make an ordered selection of n items from a set of q distinct
items, with exactly $k_i$ selections of object i (so that $k_1 + k_2 + \cdots + k_q = n$), is $\frac{n!}{k_1!\,k_2! \cdots k_q!}$.

4. An unordered selection with replacement can be thought of as obtaining a collection 
of names, obtained by selecting an object from a set, writing its name, placing it back 
in the set, and repeating the process. The resulting collection is a multiset (§1.2.1). 

5. The number of ways to make an unordered selection with replacement of k items
from a set of n items is C(n + k − 1, k). Thus $C^R(n,k)$ = C(n + k − 1, k).

Combinatorial interpretation: It is sufficient to show that the k-multisets that can be
chosen from a set of n items are in one-to-one correspondence with the bit strings of
length n + k − 1 with k ones. To indicate that $k_j$ copies of item j are selected,
for j = 1, …, n, write a string of $k_1$ ones, then a “0”, then a string of $k_2$ ones, then
another “0”, then a string of $k_3$ ones, then another “0”, and so on, until after the
string of $k_{n-1}$ ones and the last “0”, there appears the final string of $k_n$ ones. The
resulting bit string has length n + k − 1 (since it has k ones and n − 1 zeros). Every
such bit string describes a possible selection. Thus the number of possible selections is
C(n + k − 1, k) = C(n + k − 1, n − 1).

6. Integer solutions to the equation $x_1 + x_2 + \cdots + x_n = k$:

• The number of nonnegative integer solutions is C(n + k − 1, k) = C(n + k − 1, n − 1).
[In the combinatorial argument of Fact 5, there are n strings of ones. The first
string of ones can be regarded as the value for $x_1$, the second string of ones as
the value for $x_2$, etc.]

• The number of positive integer solutions is C(k − 1, n − 1).

• The number of nonnegative integer solutions where $x_i \ge a_i$ for i = 1, …, n is
C(n + k − 1 − ($a_1 + \cdots + a_n$), n − 1) (if $a_1 + \cdots + a_n \le k$). [Let $x_i = y_i + a_i$
for each i, yielding the equation $y_1 + y_2 + \cdots + y_n = k - (a_1 + \cdots + a_n)$ to be
solved in nonnegative integers.]

• The number of nonnegative integer solutions where $x_i \le a_i$ for i = 1, …, n can
be obtained using the inclusion/exclusion principle. See §2.4.2.
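
The correspondence in Fact 5 and the counts in Fact 6 can be illustrated with a short Python sketch. The helper names below are hypothetical, and the brute-force counter is only intended for small n and k; it is a check on the formulas, not an efficient method.

```python
from itertools import product
from math import comb


def multiset_to_bits(counts):
    """Encode (k_1, ..., k_n) as k_1 ones, a '0', k_2 ones, a '0', ..., k_n ones."""
    return "0".join("1" * c for c in counts)


def bits_to_multiset(bits):
    """Invert multiset_to_bits: the lengths of the runs of ones between zeros."""
    return tuple(len(run) for run in bits.split("0"))


def count_solutions(n, k, lower=None):
    """Brute-force count of integer solutions of x_1 + ... + x_n = k with x_i >= lower[i]."""
    lower = lower or [0] * n
    return sum(
        1
        for xs in product(range(k + 1), repeat=n)
        if sum(xs) == k and all(x >= a for x, a in zip(xs, lower))
    )


print(multiset_to_bits((2, 0, 1)))   # bit string of length n + k - 1 = 5 with k = 3 ones
print(count_solutions(3, 5), comb(3 + 5 - 1, 5))
```

For n = 3 and k = 5 the brute-force count matches C(n + k − 1, k), and passing `lower=[1, 1, 1]` reproduces the positive-solution count C(k − 1, n − 1).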

Examples: 

1. Distinct balls into distinct bins: k differently colored balls are to be placed into
n bins, with arbitrarily many balls to a bin. The number of different ways to do this
is $n^k$. (Apply the rule of product to the number of possible bin choices for each ball.)

2. Binary strings: The number of sequences (bit strings) of length n that can be
constructed from the symbol set {0, 1} is $2^n$.

3. Colored balls into distinct bins with colors repeated: k balls are colored so that $k_1$
balls have color 1, $k_2$ have color 2, …, and $k_q$ have color q. The number of ways these k
balls can be placed into n distinct bins (n ≥ k), at most one per bin, is $\frac{P(n,k)}{k_1!\,k_2! \cdots k_q!}$.

Note: This is more general than Fact 3, since n can exceed the sum of all the $k_i$s. If n
equals this sum, then P(n,n) = n! and the two formulas agree.
4. When three dice are rolled, the “outcome” is the number of times each of the
numbers 1 to 6 appears. For instance, two 3s and a 5 is an outcome. The number of different
possible outcomes is C(6 + 3 − 1, 3) = $\binom{8}{3}$ = 56.

5. Nondistinct balls into distinct bins with multiple balls per bin allowed: The number
of ways that k identical balls can be placed into n distinct bins, with any number of
balls allowed in each bin, is C(n + k − 1, k).

6. Nondistinct balls into distinct bins with no bin allowed to be empty: The number
of ways that k identical balls can be placed into n distinct bins, with any number of
balls allowed in each bin and no bin allowed to remain empty, is C(k − 1, n − 1).

7. How many ways are there to choose one dozen donuts when there are 7 different
kinds of donuts, with at least 12 of each type available? Order is not important, so a
multiset of size 12 is being constructed from 7 distinct types. Accordingly, there are
C(7 + 12 − 1, 12) = 18,564 ways to choose the dozen donuts.

8. The number of nonnegative integer solutions to the equation $x_1 + x_2 + \cdots + x_7 = 12$
is C(7 + 12 − 1, 12), since this is a rephrasing of Example 7.

9. The number of nonnegative integer solutions to $x_1 + x_2 + \cdots + x_5 = 36$, where $x_1 \ge 4$,
$x_3 = 11$, and $x_4 \ge 7$, is C(17, 3). [It is easiest to think of purchasing 36 donuts, where at
least 4 of type 1, exactly 11 of type 3, and at least 7 of type 4 must be purchased. Begin
with an empty bag, and put in 4 of type 1, 11 of type 3, and 7 of type 4. This leaves 14
donuts to be chosen, and they must be of types 1, 2, 4, or 5, which is equivalent to
finding the number of nonnegative integer solutions to $x_1 + x_2 + x_4 + x_5 = 14$.]


2.3.4 BINOMIAL COEFFICIENT IDENTITIES 
Facts: 

1. Table 5 lists some identities involving binomial coefficients. 

2. Combinatorial identities, such as those in Table 5, can be proved either by algebraic
proofs, using techniques such as substitution, differentiation, or the principle of
mathematical induction (see Facts 4 and 5), or by combinatorial proofs (see Fact 3).

3. The following give combinatorial interpretations of some of the identities involving 
binomial coefficients in Table 5. 

• Symmetry : In choosing a subset of k items from a set of n items, the number 

of ways to select which k items to include must equal the number of ways to 
select which n — k items to exclude. 

• Pascal's recursion: In choosing k objects from a list of n distinct objects, the
number of ways that include the last object is $\binom{n-1}{k-1}$, and the number of ways
that exclude the last object is $\binom{n-1}{k}$. Their sum is then the total number of
ways to choose k objects from a set of n, namely $\binom{n}{k}$.

• Binomial theorem: The coefficient of $x^k y^{n-k}$ in the expansion
$(x + y)^n = (x + y)(x + y) \cdots (x + y)$ equals the number of ways to choose k factors from among
the n factors (x + y) in which x contributes to the resultant term.

• Counting all subsets: Summing the numbers of subsets of all possible sizes yields 

the total number of different possible subsets. 
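Table 5 Binomial coefficient identities.

Factorial expansion: $\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$, k = 0, 1, 2, …, n
Symmetry: $\binom{n}{k} = \binom{n}{n-k}$, k = 0, 1, 2, …, n
Monotonicity: $\binom{n}{0} < \binom{n}{1} < \cdots < \binom{n}{\lfloor n/2 \rfloor}$, n ≥ 0
Pascal’s identity: $\binom{n}{k} = \binom{n-1}{k-1} + \binom{n-1}{k}$, k = 1, 2, …, n
Binomial theorem: $(x + y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$, n ≥ 0
Counting all subsets: $\sum_{k=0}^{n} \binom{n}{k} = 2^n$, n ≥ 0
Even and odd subsets: $\sum_{k=0}^{n} (-1)^k \binom{n}{k} = 0$, n > 0
Sum of squares: $\sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}$, n ≥ 0
Square of row sums: $\left[ \sum_{k=0}^{n} \binom{n}{k} \right]^2 = \sum_{k=0}^{2n} \binom{2n}{k}$, n ≥ 0
Absorption/extraction: $\binom{n}{k} = \frac{n}{k}\binom{n-1}{k-1}$, k ≠ 0
Trinomial revision: $\binom{n}{m}\binom{m}{k} = \binom{n}{k}\binom{n-k}{m-k}$, 0 ≤ k ≤ m ≤ n
Parallel summation: $\sum_{k=0}^{n} \binom{m+k}{k} = \binom{m+n+1}{n}$, m, n ≥ 0
Diagonal summation: $\sum_{k=0}^{n-m} \binom{m+k}{m} = \binom{n+1}{m+1}$, n ≥ m ≥ 0
Vandermonde convolution: $\sum_{k=0}^{r} \binom{m}{k}\binom{n}{r-k} = \binom{m+n}{r}$, m, n, r ≥ 0
Diagonal sums in Pascal’s triangle (§2.3.2): $\sum_{k=0}^{\lfloor n/2 \rfloor} \binom{n-k}{k} = F_{n+1}$ (Fibonacci numbers), n ≥ 0

Other Common Identities

$\sum_{k=0}^{n} k \binom{n}{k} = n 2^{n-1}$, n ≥ 0
$\sum_{k=0}^{n} k^2 \binom{n}{k} = n(n+1) 2^{n-2}$, n ≥ 0
$\sum_{k=0}^{n} (-1)^k k \binom{n}{k} = 0$, n ≥ 2
$\sum_{k=0}^{n} \frac{1}{k+1}\binom{n}{k} = \frac{2^{n+1}-1}{n+1}$, n ≥ 0
$\sum_{k=0}^{n} \frac{(-1)^k}{k+1}\binom{n}{k} = \frac{1}{n+1}$, n ≥ 0
$\sum_{k=1}^{n} \frac{(-1)^{k-1}}{k}\binom{n}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}$, n > 0
$\sum_{k=0}^{n-1} \binom{n}{k}\binom{n}{k+1} = \binom{2n}{n-1}$, n > 0
$\sum_{k=0}^{m} \binom{m}{k}\binom{n}{p+k} = \binom{m+n}{m+p}$, m, n, p ≥ 0, n ≥ p + m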


• Sum of squares: Choose a committee of size n from a group of n men and n
women. The left side, with each term rewritten as $\binom{n}{k}\binom{n}{n-k}$, describes the process of selecting
committees according to the number of men, k, and the number of women,
n − k, on the committee. The right side gives the total number of committees
possible.

• Absorption/extraction: From a group of n people, choose a committee of size k
and a person on the committee to be its chairperson. Equivalently, first select a
chairperson from the entire group, and then select the remaining k − 1 committee
members from the remaining n − 1 people.

• Trinomial revision: The left side describes the process of choosing a committee
of size m from n people and then a subcommittee of size k. The right side
describes the process where the subcommittee of size k is first chosen from
the n people and then the remaining m − k members of the committee are
selected from the remaining n − k people.

• Vandermonde convolution: Given m men and n women, form committees of
size r. The summands give the numbers of committees broken down by number
of men, k, and number of women, r − k, on the committee; the right side gives
the total number of committees.

4. The formula for counting all subsets can be obtained from the binomial theorem by 
substituting 1 for x and 1 for y. 

5. The formula for even and odd subsets can be obtained from the binomial theorem 
by substituting 1 for x and —1 for y. 

6. A set A of size n ≥ 1 has $2^{n-1}$ subsets with an even number of elements and $2^{n-1}$ subsets
with an odd number of elements. (The even and odd subsets identity in Table 5 shows
that $\sum \binom{n}{k}$ for k even is equal to $\sum \binom{n}{k}$ for k odd. Since the total number of subsets
is $2^n$, each side must equal $2^{n-1}$.)
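
Several of the identities named in Table 5 can be spot-checked numerically. The sketch below is illustrative only: the function names are assumptions, it relies on Python's `math.comb`, and it assumes n ≥ 1 and m, r ≥ 0. A numerical check of this kind is evidence, not a proof.

```python
from math import comb


def fibonacci(n):
    """F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def check_identities(n, m, r):
    """Evaluate several binomial identities for integers n >= 1 and m, r >= 0."""
    c = comb
    return {
        "symmetry": all(c(n, k) == c(n, n - k) for k in range(n + 1)),
        "Pascal": all(c(n, k) == c(n - 1, k - 1) + c(n - 1, k) for k in range(1, n + 1)),
        "counting all subsets": sum(c(n, k) for k in range(n + 1)) == 2 ** n,
        "even and odd subsets": sum((-1) ** k * c(n, k) for k in range(n + 1)) == 0,
        "sum of squares": sum(c(n, k) ** 2 for k in range(n + 1)) == c(2 * n, n),
        "absorption": all(k * c(n, k) == n * c(n - 1, k - 1) for k in range(1, n + 1)),
        "parallel summation": sum(c(m + k, k) for k in range(n + 1)) == c(m + n + 1, n),
        "Vandermonde": sum(c(m, k) * c(n, r - k) for k in range(r + 1)) == c(m + n, r),
        "diagonal sums": sum(c(n - k, k) for k in range(n // 2 + 1)) == fibonacci(n + 1),
    }


failures = [name for name, ok in check_identities(6, 4, 3).items() if not ok]
print(failures)
```

An empty list of failures for a range of small parameters is consistent with the table; the absorption identity is tested in the division-free form k C(n,k) = n C(n−1,k−1).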


2.3.5 GENERATING PERMUTATIONS AND COMBINATIONS 

There are various systematic ways to generate permutations and combinations of the 
set {1, . . . , n}. 

Definitions: 

A list of strings from an ordered set is in lexicographic order if the strings are sorted 
as they would appear in a dictionary. 

If the elements in the strings are ordered by a relation <, string $a_1 a_2 \ldots a_m$ precedes
$b_1 b_2 \ldots b_n$ if any of the following happens: $a_1 < b_1$; there is a positive integer k such
that $a_1 = b_1, \ldots, a_k = b_k$ and $a_{k+1} < b_{k+1}$; or m < n and $a_1 = b_1, \ldots, a_m = b_m$.

Algorithms: 

Algorithms 1, 2, and 5 give ways to generate all permutations, k-permutations, and
k-combinations of {1, 2, …, n} in lexicographic order. Algorithms 3, 4, and 6 give ways to
randomly generate a permutation, k-permutation, and k-combination of {1, 2, …, n}.


Algorithm 1: Generate the permutations of {1, …, n} in lexicographic order.

a_1 a_2 … a_n := 1 2 … n
output a_1 a_2 … a_n {first permutation in lexicographic order}
while a_1 a_2 … a_n ≠ n n−1 … 1
    m := the rightmost location such that a_m is followed by a larger number
    a'_1 a'_2 … a'_{m−1} := a_1 a_2 … a_{m−1} {retain everything to the left of a_m}
    a'_m := the smallest number larger than a_m to the right of a_m
    a'_{m+1} a'_{m+2} … a'_n := everything else, in ascending order
    a_1 a_2 … a_n := a'_1 a'_2 … a'_n
    output a_1 a_2 … a_n
Algorithm 2: Generate the k-permutations of {1, …, n} in lexicographic order.

a_1 a_2 … a_k := 1 2 … k {k a given positive integer less than or equal to n}
output a_1 a_2 … a_k {first k-permutation in lexicographic order}
while a_1 a_2 … a_k ≠ n n−1 … n−(k−1)
    m := the rightmost location such that some number larger than a_m does not
        occur among a_1, …, a_{m−1}
    a'_1 a'_2 … a'_{m−1} := a_1 a_2 … a_{m−1} {retain everything to the left of a_m}
    a'_m := the smallest number larger than a_m that does not occur among a_1, …, a_{m−1}
    a'_{m+1} a'_{m+2} … a'_k := the k − m smallest numbers not among a'_1, …, a'_m,
        in ascending order
    a_1 a_2 … a_k := a'_1 a'_2 … a'_k
    output a_1 a_2 … a_k


Algorithm 3: Generate a random permutation of {1, …, n}.

a_1 a_2 … a_n := 1 2 … n
for i := 0 to n − 2
    interchange a_{n−i} and a_{r(n−i)} {r(k) a randomly chosen integer in {1, …, k}}
output a_1 … a_n {a randomly chosen permutation of {1, …, n}}


Algorithm 4: Generate a random k-permutation of {1, …, n}.

a_1 a_2 … a_n := a random permutation of {1, …, n} {obtained from Algorithm 3}
output a_1 … a_k {a randomly chosen k-permutation of {1, …, n}}


Algorithm 5: Generate the k-combinations of {1, …, n} in lexicographic order.

a_1 a_2 … a_k := 1 2 … k {first combination in lexicographic order}
output a_1 a_2 … a_k
while a_1 a_2 … a_k ≠ n−k+1 n−k+2 … n
    m := the rightmost location among 1, …, k such that a number larger than
        a_m but no larger than n is not in the combination
    a'_1 a'_2 … a'_{m−1} := a_1 a_2 … a_{m−1} {retain everything to the left of a_m}
    a'_m := a_m + 1 {increase a_m by 1}
    a'_{m+1} a'_{m+2} … a'_k := a_m + 2 a_m + 3 … a_m + k − m + 1 {continue consecutively}
    a_1 a_2 … a_k := a'_1 a'_2 … a'_k
    output a_1 a_2 … a_k {the members of each k-combination are given in ascending
        order}


Algorithm 6: Generate random k-combinations of {1, …, n}.

a_1 a_2 … a_k := any k-permutation of {1, …, n} generated by Algorithm 4
output a_1 a_2 … a_k {ignoring the order in which elements are written, this is
    a random k-combination}
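
The random-generation procedures (Algorithms 3, 4, and 6) translate almost line for line into Python. The following is a sketch under stated assumptions rather than a reference implementation: the function names and the optional `rng` parameter are illustrative, and `randint` from Python's standard `random` module plays the role of r(k).

```python
import random


def random_permutation(n, rng=random):
    """Algorithm 3: swap a_{n-i} with a randomly chosen a_j, j in {1, ..., n-i}."""
    a = list(range(1, n + 1))
    for i in range(n - 1):             # i = 0, ..., n-2
        j = rng.randint(1, n - i)      # r(n-i), uniform on {1, ..., n-i}
        a[n - i - 1], a[j - 1] = a[j - 1], a[n - i - 1]  # lists are 0-indexed
    return a


def random_k_permutation(n, k, rng=random):
    """Algorithm 4: the first k entries of a random permutation."""
    return random_permutation(n, rng)[:k]


def random_k_combination(n, k, rng=random):
    """Algorithm 6: a random k-permutation with its order ignored (returned sorted)."""
    return sorted(random_k_permutation(n, k, rng))


rng = random.Random(2024)  # seeded generator, used only to make the demonstration repeatable
print(random_permutation(6, rng), random_k_combination(6, 3, rng))
```

Passing a seeded `random.Random` instance is a design choice for reproducible experiments; omitting `rng` falls back to the module-level generator.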


Examples: 

1. The lexicographic order for the 3-permutations of {1, 2, 3} is 123, 132, 213, 231, 312, 
321. 

2. The lexicographic order of the C(5,3) = 10 3-combinations of {1,2, 3, 4, 5} is 123, 
124, 125, 134, 135, 145, 234, 235, 245, 345. 
3. Generating permutations: What permutation follows 3142765 in the lexicographic
ordering of the permutations of {1, …, 7}? Step 1 of the while-loop of Algorithm 1
leads to the fourth digit, namely the digit 2, as the first digit from the right that has
larger digits following it. Steps 2 and 3 show that the next permutation starts with 3145
since 5 is the smallest digit greater than 2 and following it. Finally, step 4 yields 2, 6,
and 7 (in numerical order) as the digits that follow. Thus, the permutation immediately
following 3142765 is 3145267.

4. Generating combinations: What 5-combination follows 12478 in the lexicographic
ordering of 5-combinations of {1, …, 8}? Step 1 of the while-loop of Algorithm 5 leads
to the third digit, namely the digit 4, as the first digit from the right that can be safely
increased by 1. Steps 2 and 3 show that the next combination starts with 125 since the 3rd
digit is increased by 1. Finally, step 4 yields 6 and 7 as the following digits (add 1 to
the newly-listed previous digit until the new selection of k digits is complete). Thus,
the combination after 12478 is 12567.
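
The successor steps traced by hand in Examples 3 and 4 can be sketched in Python as follows. The names `next_permutation`, `next_combination`, and `generate` are illustrative assumptions; each successor function returns `None` once the last string in lexicographic order has been reached.

```python
def next_permutation(a):
    """Lexicographic successor of the permutation a, or None if a is the last one."""
    a = list(a)
    m = len(a) - 2
    while m >= 0 and a[m] >= a[m + 1]:   # find the rightmost a_m followed by a larger number
        m -= 1
    if m < 0:
        return None
    # position of the smallest number larger than a_m to the right of a_m
    j = min((i for i in range(m + 1, len(a)) if a[i] > a[m]), key=lambda i: a[i])
    a[m], a[j] = a[j], a[m]
    a[m + 1:] = sorted(a[m + 1:])        # everything else, in ascending order
    return a


def next_combination(a, n):
    """Lexicographic successor of the ascending k-combination a of {1..n}, or None."""
    a = list(a)
    k = len(a)
    m = k - 1
    while m >= 0 and a[m] == n - k + m + 1:  # this position is already at its maximum
        m -= 1
    if m < 0:
        return None
    a[m] += 1
    for i in range(m + 1, k):                # continue consecutively
        a[i] = a[i - 1] + 1
    return a


def generate(first, successor):
    """Yield first, successor(first), ... until the successor is None."""
    current = first
    while current is not None:
        yield current
        current = successor(current)


print(next_permutation([3, 1, 4, 2, 7, 6, 5]))
print(next_combination([1, 2, 4, 7, 8], 8))
```

Under these assumptions the two printed results agree with the answers of Examples 3 and 4, and `generate([1, 2, 3], lambda c: next_combination(c, 5))` lists the ten 3-combinations in the order given in Example 2.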


2.4 INCLUSION/EXCLUSION 


The principle of inclusion/exclusion is used to count the elements in a non-disjoint union 
of finite sets. Many counting problems can be solved by applying this principle to a 
well-chosen collection of sets. The techniques involved in this process are best illustrated 
with examples. 


2.4.1 PRINCIPLE OF INCLUSION/EXCLUSION 

The number of elements in the union of two finite sets A and B is |A| + |B|, provided
that the sets have no element in common. In the general case, however, the elements
common to both sets have been included in the sum twice. The sum is adjusted to
exclude the double-counting of these common elements by subtracting their number:

$$|A \cup B| = |A| + |B| - |A \cap B|.$$

A Venn diagram (§1.2.2) for these sets is given in the following figure.


Similarly, the number of elements in the union of three finite sets is

$$|A \cup B \cup C| = |A| + |B| + |C| - |A \cap B| - |A \cap C| - |B \cap C| + |A \cap B \cap C|.$$


See the following figure. 


These simple equations generalize to the case of n sets. 
Facts:

1. Inclusion/exclusion principle: the number of elements in the union of n finite sets
$A_1, A_2, \ldots, A_n$ is:

$$|A_1 \cup A_2 \cup \cdots \cup A_n| = \sum_{1 \le i \le n} |A_i| - \sum_{1 \le i < j \le n} |A_i \cap A_j| + \sum_{1 \le i < j < k \le n} |A_i \cap A_j \cap A_k| - \cdots + (-1)^{n+1} |A_1 \cap A_2 \cap \cdots \cap A_n|$$

or, alternatively,

$$|A_1 \cup A_2 \cup \cdots \cup A_n| = \sum_{k=1}^{n} (-1)^{k+1} \sum_{1 \le i_1 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}|.$$

Sometimes the inner sum of the alternative formula is denoted $S_k$.

2. The inclusion/exclusion formula for n sets has $2^n - 1$ terms, one for each possible
nonempty intersection. The coefficient of a term is −1 if the term corresponds to
intersections of an even number of sets, and +1 otherwise.

3. The principle is often applied to the complement of a set. Let $A_i$ be the subset of
elements in a universal set U that have property $P_i$. The number of elements that have
properties $P_{i_1}, P_{i_2}, \ldots, P_{i_k}$ is often written $N(P_{i_1} P_{i_2} \ldots P_{i_k})$ and the number of elements
that have none of these properties is often written $N(P'_{i_1} P'_{i_2} \ldots P'_{i_k})$. The number of
elements in U that have none of the properties is:

$$N(P'_1 P'_2 \ldots P'_n) = |U| - \sum_{1 \le i \le n} N(P_i) + \sum_{1 \le i < j \le n} N(P_i P_j) - \cdots + (-1)^n N(P_1 P_2 \ldots P_n).$$
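
The alternating sums in Facts 1 and 3 can be evaluated directly for small families of sets. The following Python sketch is illustrative (the function names are assumptions): it computes each inner sum $S_k$ by enumerating all k-fold intersections, so it takes time exponential in the number of sets and is meant only as a check against a direct union.

```python
from itertools import combinations


def union_size(sets):
    """|A_1 ∪ ... ∪ A_n| as the alternating sum over all nonempty intersections."""
    total = 0
    for k in range(1, len(sets) + 1):
        # S_k: sum of the sizes of all k-fold intersections
        s_k = sum(len(set.intersection(*group)) for group in combinations(sets, k))
        total += (-1) ** (k + 1) * s_k
    return total


def count_with_no_property(universe, properties):
    """N(P'_1 ... P'_n): number of elements of universe satisfying none of the predicates."""
    sets = [{x for x in universe if has_property(x)} for has_property in properties]
    return len(universe) - union_size(sets)


A, B, C = {1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}
print(union_size([A, B, C]), len(A | B | C))

divisible_by = [lambda x: x % 2 == 0, lambda x: x % 3 == 0, lambda x: x % 5 == 0]
print(count_with_no_property(range(1, 31), divisible_by))
```

In the second call the universe is {1, …, 30} and the properties are divisibility by 2, 3, and 5, so the result counts the integers up to 30 that are relatively prime to 30.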


Examples: 

1. Of 70 people surveyed, 37 drink coffee, 23 drink tea, and 25 drink neither. Find
the number who drink both coffee and tea. Using C to represent the set of coffee
drinkers and T to represent the set of tea drinkers, the size of C ∩ T must be found.
Since $|\overline{T \cup C}| = 25$, the Venn diagram in part (a) of the following figure shows that
|C ∪ T| = 70 − 25 = 45. According to the inclusion/exclusion principle,

|C ∩ T| = |C| + |T| − |C ∪ T| = 37 + 23 − 45 = 15,

illustrated in part (b) of the figure.




2. Suppose that 16 high-school juniors enroll in Algebra, 17 in Biology, and 30 in 
Chemistry; that 5 students enroll in both Algebra and Biology, 4 in both Algebra and 
Chemistry, and 7 in both Biology and Chemistry; that 3 students enroll in all three; 
and that every junior takes at least one of these three subjects. Then the total number 
of students in the junior class is 16 + 17 + 30 — (5 + 4 + 7) + 3 = 50. 
3. Each of 11 linguists translates at least one of the languages Amharic and Burmese
into English. The numbers who translate only Amharic or Burmese are both odd primes. 
More linguists translate Burmese than Amharic. How many can translate Amharic? 

Based on experimentation or on an analytic approach, the only possible assignment 
of numbers to regions that fits all these facts leads to 6, as shown in the following figure. 



4. At a party for 28 people, three kinds of pizza were served: anchovy, broccoli, and 
cheese. Everyone ate at least one kind. No two of the seven different possible selections 
of one or more kinds of pizza were eaten by the same number of partygoers. Each of the 
three possible exclusive selections (one kind of pizza only) was eaten by an odd number 
of partygoers, and each of the three possible combinations of two kinds of pizza was 
eaten by an even number of partygoers. If a total of 18 partygoers ate cheese pizza, how 
many ate both anchovy and broccoli? 

The answer is 2. Experimentation or an analytic approach leads to the possible 
assignments of numbers to regions that fit all these facts, shown in the following figure. 


[Figure: Venn diagram of the anchovy, broccoli, and cheese pizza selections.]



5. To count the number of ways to select a 5-card hand from a standard 52-card deck
so that the hand contains at least one card from each of the four suits, let $A_1$, $A_2$, $A_3$,
and $A_4$ be the subsets of 5-card hands that do not contain a club, diamond, heart, or
spade, respectively. Then

$|A_i| = \binom{52-13}{5} = \binom{39}{5}$ with $\binom{4}{1}$ choices for i

$|A_i \cap A_j| = \binom{52-26}{5} = \binom{26}{5}$ with $\binom{4}{2}$ choices for i and j

$|A_i \cap A_j \cap A_k| = \binom{52-39}{5} = \binom{13}{5}$ with $\binom{4}{3}$ choices for i, j, and k.

There are $\binom{52}{5}$ possible 5-card hands, so by complementation and the principle of
inclusion/exclusion, the number of hands that contain at least one card from each suit is

$$\binom{52}{5} - \binom{4}{1}\binom{39}{5} + \binom{4}{2}\binom{26}{5} - \binom{4}{3}\binom{13}{5} = 685{,}464.$$
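
The count in Example 5 can be confirmed in two independent ways. This Python sketch is illustrative (the function names are assumptions): the first function evaluates the inclusion/exclusion sum, and the second uses a direct argument that is not in the text above, namely that a 5-card hand with all four suits must have exactly one suit represented twice.

```python
from math import comb


def hands_with_all_suits():
    """Complement of the union of 'suit missing' sets, by inclusion/exclusion."""
    missing_some_suit = sum(
        (-1) ** (j + 1) * comb(4, j) * comb(52 - 13 * j, 5) for j in range(1, 4)
    )
    return comb(52, 5) - missing_some_suit


def hands_with_all_suits_direct():
    """Choose the doubled suit, 2 of its 13 cards, and 1 card from each other suit."""
    return 4 * comb(13, 2) * 13 ** 3


print(hands_with_all_suits(), hands_with_all_suits_direct())
```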


2.4.2 APPLYING INCLUSION/EXCLUSION TO COUNTING PROBLEMS 
Definitions: 

A derangement on a set is a permutation that leaves no element fixed. The number
of derangements on a set of cardinality n is denoted $D_n$.

A rencontre number $D_{n,k}$ is the number of permutations on a set of n elements that
leave exactly k elements fixed.

Facts:

1. The number of onto functions from an n-element set to a k-element set (n ≥ k) is

$$\sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n.$$

(See Example 3.)

2. The following binomial coefficient identities can all be derived by combinatorial 
arguments using inclusion/exclusion: 

• Er.o(-i)‘G)C:t) = o 

• = cd) 

• ELo(-i)‘0( 2 =^P= 1 ) = (;:;>■ 

3. $D_n = n!\left(1 - \frac{1}{1!} + \frac{1}{2!} - \cdots + (-1)^n \frac{1}{n!}\right)$. (See Example 8.)

4. $\frac{D_n}{n!} \to e^{-1} \approx 0.368$ as $n \to \infty$.

5. $D_n = nD_{n-1} + (-1)^n$ for n ≥ 1.

6. $D_n = (n-1)(D_{n-1} + D_{n-2})$ for n ≥ 2.

7. The following table gives some values of $D_n$:

 n   D_n      n   D_n      n   D_n         n   D_n
 1   0        4   9        7   1,854      10   1,334,961
 2   1        5   44       8   14,833     11   14,684,570
 3   2        6   265      9   133,496    12   176,214,841


8. $D_{n,0} = D_n$.

9. $D_{n,k} = \binom{n}{k} D_{n-k}$.

10. The following table gives some values of $D_{n,k}$:

n\k          0          1        2        3       4       5      6    7   8  9  10
  0          1
  1          0          1
  2          1          0        1
  3          2          3        0        1
  4          9          8        6        0       1
  5         44         45       20       10       0       1
  6        265        264      135       40      15       0      1
  7      1,854      1,855      924      315      70      21      0    1
  8     14,833     14,832    7,420    2,464     630     112     28    0   1
  9    133,496    133,497   66,744   22,260   5,544   1,134    168   36   0  1
 10  1,334,961  1,334,960  667,485  222,480  55,650  11,088  1,890  240  45  0   1
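
The two tables above can be regenerated from Facts 6 and 9. The following Python sketch is illustrative; the function names are assumptions, and the starting values $D_0 = 1$ and $D_1 = 0$ are the usual conventions (consistent with the n = 0 row of the second table).

```python
from math import comb, factorial


def derangements(n_max):
    """Return [D_0, ..., D_n_max] using D_n = (n - 1)(D_{n-1} + D_{n-2})."""
    d = [1, 0]
    for n in range(2, n_max + 1):
        d.append((n - 1) * (d[n - 1] + d[n - 2]))
    return d[: n_max + 1]


def rencontres(n, k):
    """D_{n,k} = C(n, k) * D_{n-k}: choose the k fixed points, derange the rest."""
    return comb(n, k) * derangements(n - k)[n - k]


print(derangements(12)[1:])
print([rencontres(10, k) for k in range(11)])
```

Each row of rencontre numbers sums to n!, since every permutation fixes exactly k elements for some k; this gives a quick consistency check on the table.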


Examples: 

1. The inclusion/exclusion principle can be used to establish the binomial coefficient
identity

$$\binom{n}{m} = \sum_{k=1}^{m} (-1)^{k+1} \binom{n}{k}\binom{n-k}{m-k}.$$

Let $A_i$ denote the subset of m-combinations that contain object i. Thus, the k-fold
intersection $A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}$ consists of all the m-combinations that contain all
the objects $i_1, i_2, \ldots, i_k$. Since there are $\binom{n-k}{m-k}$ ways to complete an m-combination in
this intersection, it follows that $|A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = \binom{n-k}{m-k}$. Since the k objects
themselves can be specified in $\binom{n}{k}$ ways, it follows that

$$\sum_{1 \le i_1 < i_2 < \cdots < i_k \le n} |A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = \binom{n}{k}\binom{n-k}{m-k}, \quad k \le m.$$

Since $A_1 \cup A_2 \cup \cdots \cup A_n$ is the set of all m-combinations selected from 1, 2, …, n that
contain at least one of the objects 1, 2, …, n, it must be the set of all m-combinations.

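2. Sieve of Eratosthenes: The sieve of Eratosthenes (276-194 BCE) is a method for
finding all primes less than or equal to a given positive integer n. Begin with the
list of integers 2 through n, and delete all multiples of the first number in the list, 2,
but not including 2. The first integer remaining after 2 is 3; delete all multiples of 3,
not including 3. The first integer remaining after 3 is 5; delete all multiples of 5, not
including 5. Continue the process. The remaining integers are the primes less than or
equal to n. (See §4.4.2.)

The inclusion/exclusion principle can be used to obtain the number of primes less
than or equal to n. (An integer x with 1 < x ≤ n is prime if and only if it has no prime
factor p < x with $p \le \lfloor \sqrt{n} \rfloor$.) Let $P_i$ be the property: a number is greater than the ith prime
and divisible by the ith prime. Then the number of primes less than or equal to n is
$N(P'_1 P'_2 \ldots P'_k)$, where the universal set is {2, …, n} and there are k primes less than or
equal to $\lfloor \sqrt{n} \rfloor$. (§2.4.1, Fact 3.)

For example, the number of primes less than or equal to 100 is $N(P'_1 P'_2 P'_3 P'_4)$, where
the four properties correspond to the primes 2, 3, 5, and 7. Since each prime p itself does
not have its own property, $N(P_i) = \lfloor 100/p \rfloor - 1$, and the count is

$$99 - \left(\left\lfloor \tfrac{100}{2} \right\rfloor - 1\right) - \left(\left\lfloor \tfrac{100}{3} \right\rfloor - 1\right) - \left(\left\lfloor \tfrac{100}{5} \right\rfloor - 1\right) - \left(\left\lfloor \tfrac{100}{7} \right\rfloor - 1\right) + \left\lfloor \tfrac{100}{2 \cdot 3} \right\rfloor + \left\lfloor \tfrac{100}{2 \cdot 5} \right\rfloor + \left\lfloor \tfrac{100}{2 \cdot 7} \right\rfloor + \left\lfloor \tfrac{100}{3 \cdot 5} \right\rfloor + \left\lfloor \tfrac{100}{3 \cdot 7} \right\rfloor + \left\lfloor \tfrac{100}{5 \cdot 7} \right\rfloor - \left\lfloor \tfrac{100}{2 \cdot 3 \cdot 5} \right\rfloor - \left\lfloor \tfrac{100}{2 \cdot 3 \cdot 7} \right\rfloor - \left\lfloor \tfrac{100}{2 \cdot 5 \cdot 7} \right\rfloor - \left\lfloor \tfrac{100}{3 \cdot 5 \cdot 7} \right\rfloor + \left\lfloor \tfrac{100}{2 \cdot 3 \cdot 5 \cdot 7} \right\rfloor$$

= 99 − 49 − 32 − 19 − 13 + 16 + 10 + 7 + 6 + 4 + 2 − 3 − 2 − 1 − 0 + 0 = 25.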
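
Both halves of this example, the sieve and the inclusion/exclusion count, can be sketched in Python and compared. The function names are illustrative assumptions, and the code assumes n ≥ 2; the correction of −1 for single primes reflects the wording of property $P_i$ (a prime is not greater than itself).

```python
from itertools import combinations
from math import isqrt, prod


def primes_up_to(n):
    """Sieve of Eratosthenes: list of the primes less than or equal to n."""
    is_prime = [False, False] + [True] * (n - 1)
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def prime_count_by_inclusion_exclusion(n):
    """Count primes <= n (n >= 2) using the primes <= floor(sqrt(n))."""
    small_primes = primes_up_to(isqrt(n))
    count = n - 1                            # |U| for U = {2, ..., n}
    for size in range(1, len(small_primes) + 1):
        for group in combinations(small_primes, size):
            divisible = n // prod(group)     # multiples of the product in {2, ..., n}
            if size == 1:
                divisible -= 1               # the prime itself does not have property P_i
            count += (-1) ** size * divisible
    return count


print(prime_count_by_inclusion_exclusion(100), len(primes_up_to(100)))
```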


3. Number of onto functions: The number of onto functions from an n-element set to
a k-element set (n ≥ k) is $\sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n$. The number of onto functions from
an n-element set to a k-element set equals the number of ways that n different objects
can be distributed among k different boxes with none left empty. Let $A_i$ be the subset
of distributions with box i empty. Then

$|A_i| = (k-1)^n$ with $\binom{k}{1}$ choices for i

$|A_i \cap A_j| = (k-2)^n$ with $\binom{k}{2}$ choices for i and j

⋮

$|A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = (k-k)^n$ with $\binom{k}{k}$ choices for $i_1, i_2, \ldots, i_k$.

The number of distributions that leave no box empty is then $\sum_{j=0}^{k} (-1)^j \binom{k}{j} (k-j)^n$.

The numbers of onto functions from an n-element set to a k-element set for some values
of n and k (n ≥ k) are given in the following table.



n\k   1    2       3        4        5          6          7          8        9
  1   1
  2   1    2
  3   1    6       6
  4   1   14      36       24
  5   1   30     150      240      120
  6   1   62     540     1560     1800        720
  7   1  126    1806     8400   16,800     15,120       5040
  8   1  254    5796   40,824  126,000    191,520    141,120     40,320
  9   1  510  18,150  186,480  834,120  1,905,120  2,328,480  1,451,520  362,880
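
Entries of this table can be recomputed from the inclusion/exclusion formula and, for small cases, confirmed by listing every function. The sketch below is illustrative and the function names are assumptions; the brute-force version enumerates all $k^n$ functions and is only practical for small n and k.

```python
from itertools import product
from math import comb


def onto_count(n, k):
    """Number of onto functions from an n-set to a k-set, by inclusion/exclusion."""
    return sum((-1) ** j * comb(k, j) * (k - j) ** n for j in range(k + 1))


def onto_count_brute_force(n, k):
    """Enumerate all functions {1..n} -> {1..k} and keep those that hit every value."""
    return sum(1 for f in product(range(k), repeat=n) if len(set(f)) == k)


print([onto_count(6, k) for k in range(1, 7)])
```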
4. There are 584 nonnegative integer solutions to $x_1 + x_2 + x_3 + x_4 = 20$ where $x_1 \le 8$,
$x_2 \le 10$, and $x_3 \le 5$. [Let $A_1$ be the set of solutions where $x_1 \ge 9$, $A_2$ the set of solutions
where $x_2 \ge 11$, and $A_3$ the set of solutions where $x_3 \ge 6$. The final answer, obtained using
the inclusion/exclusion principle and techniques of the Examples of §2.3.3, is equal to
C(23, 3) − $|A_1 \cup A_2 \cup A_3|$ = C(23, 3) − (C(14, 3) + C(12, 3) + C(17, 3) − C(3, 3) − C(8, 3) −
C(6, 3) + 0) = 584.]
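
The calculation in Example 4 generalizes to any set of upper bounds. The following Python sketch is illustrative; the function name and the convention that `None` marks an unbounded variable are assumptions introduced here. For each subset of bounded variables it forces those variables above their bounds and counts the remaining solutions with the formula of §2.3.3, Fact 6.

```python
from itertools import combinations
from math import comb


def bounded_solutions(total, upper):
    """Nonnegative solutions of x_1 + ... + x_n = total with x_i <= upper[i].

    An entry of None in upper means that the variable has no upper bound.
    """
    n = len(upper)
    bounds = [u for u in upper if u is not None]
    count = 0
    for size in range(len(bounds) + 1):
        for group in combinations(bounds, size):
            # force x_i >= u_i + 1 for each variable in the chosen group
            remaining = total - sum(u + 1 for u in group)
            if remaining >= 0:
                count += (-1) ** size * comb(remaining + n - 1, n - 1)
    return count


print(bounded_solutions(20, [8, 10, 5, None]))
```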


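5. The permutations $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}$ and $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$ are derangements of 1, 2, 3,
but the permutations $\begin{pmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 & 3 \\ 1 & 3 & 2 \end{pmatrix}$, $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$, and $\begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 3 \end{pmatrix}$ are not.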


6. Problème des rencontres: In the problème des rencontres (matching problem) an
urn contains balls numbered 1 through n, and they are drawn out one at a time. A
match occurs if ball i is the ith ball drawn. The probability that no matches occur
when all the balls are drawn is $D_n/n!$. The problem was studied by Pierre-Rémond de
Montmort (1678-1719) who studied the card game treize, in which matchings of pairs
of cards were counted when two decks of cards were laid out face-up.

7. Problème des ménages: The problème des ménages, first raised by François Lucas
(1842-1891), requires that n married couples be seated around a circular table so that
no men are adjacent, no women are adjacent, and no husband and wife are adjacent.
There are $2\,n! \sum_{i=0}^{n} (-1)^i (n-i)! \binom{2n-i}{i} \frac{2n}{2n-i}$ ways to seat the people. (There are 2 · n!
ways to seat the n women. Regardless of how this is done, by the inclusion/exclusion
principle there are $\sum_{i=0}^{n} (-1)^i (n-i)! \binom{2n-i}{i} \frac{2n}{2n-i}$ ways to seat the n men.)

8. Determining the number $D_n$ of derangements of {1, …, n}: Let $A_i$ be the subset
of permutations that fix object i. The permutations in the subset $A_1 \cup A_2 \cup \cdots \cup A_n$
are those that fix at least one object. Then

$|A_i| = (n-1)!$ with $\binom{n}{1}$ choices for i

$|A_i \cap A_j| = (n-2)!$ with $\binom{n}{2}$ choices for i and j

⋮

$|A_{i_1} \cap A_{i_2} \cap \cdots \cap A_{i_k}| = (n-k)!$ with $\binom{n}{k}$ choices for $i_1, i_2, \ldots, i_k$.

Complementation and inclusion/exclusion now yield the formula in Fact 3:

$$D_n = n! - \sum_{k=1}^{n} (-1)^{k+1} \binom{n}{k} (n-k)! = n! \sum_{k=0}^{n} (-1)^k \frac{1}{k!}.$$

As n becomes large, $D_n/n!$ approaches $e^{-1} \approx 0.368$ very rapidly.

9. Hatcheck problem: The hatchecker at a restaurant neglects to place claim checks 
on n hats. Each of the n customers is given a randomly selected hat upon exiting. What 
is the probability that no one receives the correct hat? 

There are n! possible permutations of the n hats, and there are $D_n$ cases in which
no one gets the correct hat. Thus, by Example 8, the probability is approximately $e^{-1}$,
regardless of the number of diners.

10. Rook polynomials (arrangements of objects where there are restrictions on the
positions in which the objects can be placed): This describes a family of assignment or
matching problems, such as matching applicants to jobs where some applicants cannot
be assigned to certain jobs, the problème des ménages, and the problème des rencontres.
In terms of matching n applicants to n jobs, set up an n × n “board of possibilities”
where the rows are labeled by the people and the columns are labeled by the jobs.
Square (i, j) is a forbidden square if applicant i cannot perform job j; the remaining
squares are allowable squares. An allowable arrangement is an arrangement where only
allowable squares are chosen, with exactly one square chosen in each row and column.

These problems can be rephrased in terms of placing rooks on a chessboard: given
a chessboard with some squares forbidden, find the number of ways of placing rooks on
the allowable squares of the chessboard so that no rook can capture any other rook. (In
chess a rook can move any number of squares vertically or horizontally.) For a given
n × n board B, let $A_i$ be the set of ways to place n nontaking rooks on B so that the
rook in row i is on a forbidden square. The total number of ways to place n nontaking
rooks on allowable squares is

$$n! - |A_1 \cup \cdots \cup A_n| = n! - r_1(B)(n-1)! + r_2(B)(n-2)! - \cdots + (-1)^n r_n(B)\,0!$$

where the coefficients $r_i(B)$ are the number of ways to place i nontaking rooks on
forbidden squares of B.

A rook polynomial for an n × n board B is a polynomial of the form

$$R(x, B) = r_0(B) + r_1(B)x + r_2(B)x^2 + \cdots + r_n(B)x^n,$$

where $r_0(B)$ is defined to be 1.

The numbers $r_i(B)$ can sometimes be found more easily by using a combination of the
following two reduction techniques:

• $R(x, B) = R(x, B_1) \cdot R(x, B_2)$, if all forbidden squares of B appear in two disjoint
  sub-boards $B_1$ and $B_2$ (the sub-boards $B_1$ and $B_2$ are disjoint if the row labels
  of B are partitioned into two parts $S_1$ and $S_2$, the column labels of B are
  partitioned into two parts $T_1$ and $T_2$, and $B_1$ is obtained from $S_1 \times T_1$ and $B_2$
  is obtained from $S_2 \times T_2$).

• $R(x, B) = xR(x, B_1) + R(x, B_2)$, where (i, j) is a forbidden square of B, $B_1$ is obtained
  from B by removing all squares in row i and all squares in column j, and $B_2$ is
  obtained from B by making square (i, j) allowable.

It may be necessary to use these techniques repeatedly to obtain boards that are simple 
enough that the rook polynomial coefficients can be easily found. 

11. Rook polynomials can be used to find the number of derangements of n objects.
The forbidden squares of the board B are the squares (i, i). The first reduction technique
of Example 10, used repeatedly, breaks B into $B_1, \ldots, B_n$, where $B_i$ consists only of
square (i, i). Then

$$R(x, B) = R(x, B_1)R(x, B_2) \cdots R(x, B_n) = (1 + x) \cdots (1 + x) = (1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.$$

Therefore, the number of derangements is

$$n! - \left[\binom{n}{1}(n-1)! - \binom{n}{2}(n-2)! + \cdots + (-1)^{n+1}\binom{n}{n}0!\right] = \sum_{k=0}^{n} (-1)^k \binom{n}{k}(n-k)! = D_n.$$
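The rook-number formula of Examples 10 and 11 can be checked numerically. The following Python sketch is only an illustration: it finds the rook numbers $r_k(B)$ by brute-force enumeration (practical for small boards only) and then applies the inclusion/exclusion sum. The names `rook_numbers`, `allowable_arrangements`, and `derangements` are hypothetical helpers, not notation from the Handbook.

```python
from itertools import combinations
from math import factorial


def rook_numbers(forbidden, n):
    """Return [r_0, ..., r_n], where r_k counts placements of k nontaking
    rooks on the forbidden squares (given as a set of (row, col) pairs)."""
    squares = sorted(forbidden)
    counts = [0] * (n + 1)
    for k in range(n + 1):
        for choice in combinations(squares, k):
            rows = {i for i, _ in choice}
            cols = {j for _, j in choice}
            # Nontaking: all rows distinct and all columns distinct.
            if len(rows) == k and len(cols) == k:
                counts[k] += 1
    return counts


def allowable_arrangements(forbidden, n):
    """Inclusion/exclusion: sum over k of (-1)^k r_k(B) (n-k)!."""
    r = rook_numbers(forbidden, n)
    return sum((-1) ** k * r[k] * factorial(n - k) for k in range(n + 1))


def derangements(n):
    """Derangements correspond to the board with forbidden diagonal (i, i)."""
    return allowable_arrangements({(i, i) for i in range(n)}, n)
```

For the diagonal board with n = 4 the rook numbers are the binomial coefficients 1, 4, 6, 4, 1, matching $(1 + x)^4$, and the count of allowable arrangements is $D_4 = 9$.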


2.5 PARTITIONS 


Each way to write a positive integer n as a sum of positive integers is called a partition 
of n. Similarly, each way to decompose a set S into a family of mutually disjoint 
nonempty subsets is called a partition of S. In a cyclic partition of a set, the elements 
of each subset are arranged into cycles, and two cyclic partitions in the same family of 
subsets are distinct if any of the cycle arrangements are different. The main concerns
are counting the essentially different partitions of integers and sets, and counting
cyclic partitions of sets.


2.5.1 PARTITIONS OF INTEGERS

A positive integer can be decomposed into a sum of positive integers in various ways, 
taking into account restrictions on the number of parts or on the properties of the parts. 

Definitions: 

A partition of a positive integer n is a representation of n as the sum of positive 
integers. The parts are usually written in nonascending order, but order is ignored. 

A Ferrers diagram of a partition is an array of boxes, nodes, or dots into rows of 
nonincreasing size so that each row represents one part of the partition. 

The conjugate of a partition is the partition obtained by transposing the rows and 
columns of its Ferrers diagram. 

A composition is a partition in which the order of the parts is taken into account. 

A vector partition is a decomposition of an n-tuple of nonnegative integers into a sum
of nonzero n-tuples of nonnegative integers, where order is ignored.

A vector composition is the same as a vector partition, except that order is taken 
into account. 

Facts: 

1. The following table gives the notation for various functions that count partitions:

   function       type of partitions counted
   $p(n)$         number of partitions of n
   $Q(n)$         number of partitions of n into distinct parts
   $O(n)$         number of partitions of n into odd parts
   $p_m(n)$       number of partitions of n with at most m parts
   $q_m(n)$       number of partitions of n with no part larger than m
   $p(N, M, n)$   number of partitions of n into at most M parts, with each
                  part no larger than N


2. $p(m, n, n) = q_m(n)$.

3. $p(n, m, n) = p_m(n)$.

4. $p_m(n) = q_m(n)$.

5. $O(n) = Q(n)$.

6. The number of compositions of n into k parts is $\binom{n-1}{k-1}$.

7. The number of compositions of n is $2^{n-1}$.

8. The number of compositions of n using no 1s is $F_{n-1}$ (Fibonacci numbers $F_0 = 0$,
$F_1 = 1$, $F_2 = 1$, $F_3 = 2$, ...).

9. The partition function p(n) satisfies these congruences (see [Kn93] for details):

$$p(5n + 4) \equiv 0 \pmod{5}$$
$$p(7n + 5) \equiv 0 \pmod{7}$$
$$p(11n + 6) \equiv 0 \pmod{11}.$$

10. The partition functions $p(n)$ and $p_m(n)$ satisfy these recurrences:

$$p(n) - p(n-1) - p(n-2) + p(n-5) + p(n-7) - \cdots + (-1)^k p\!\left(n - \tfrac{k(3k-1)}{2}\right) + (-1)^k p\!\left(n - \tfrac{k(3k+1)}{2}\right) + \cdots = 0, \quad n > 0$$

$$p_m(n) = p_m(n - m) + p_{m-1}(n).$$

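11. The asymptotic behavior of $p(n)$, $Q(n)$, and $p_m(n)$ is as follows (see [An84] Chapters
5 and 6, [HaRa18], or [Kn93] for details):

$$p(n) \sim \frac{1}{4n\sqrt{3}}\, e^{\pi\sqrt{2n/3}} \quad \text{as } n \to \infty,$$

$$Q(n) \sim \frac{1}{4 \cdot 3^{1/4}\, n^{3/4}}\, e^{\pi\sqrt{n/3}} \quad \text{as } n \to \infty,$$

$$p_m(n) \sim \frac{n^{m-1}}{m!\,(m-1)!} \quad \text{as } n \to \infty, \text{ with } m \text{ fixed.}$$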

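12. The following are generating functions for partition functions:

$$\sum_{n \ge 0} p(n) q^n = \prod_{i=1}^{\infty} (1 + q^i + q^{2i} + \cdots) = \prod_{i=1}^{\infty} \left( \sum_{m=0}^{\infty} q^{mi} \right) = \prod_{i=1}^{\infty} \frac{1}{1 - q^i}$$

$$\sum_{n \ge 0} Q(n) q^n = \prod_{i=1}^{\infty} (1 + q^i)$$

$$\sum_{n \ge 0} p_m(n) q^n = \prod_{i=1}^{m} (1 + q^i + q^{2i} + \cdots) = \prod_{i=1}^{m} \left( \sum_{k=0}^{\infty} q^{ki} \right) = \prod_{i=1}^{m} \frac{1}{1 - q^i}$$

$$\sum_{n \ge 0} p(N, M, n) q^n = \prod_{i=1}^{N} \frac{1 - q^{M+i}}{1 - q^i} = \prod_{i=1}^{M} \frac{1 - q^{N+i}}{1 - q^i}$$

Note: Even though these expressions for p(N, M, n) look like quotients of polynomials,
they are actually just polynomials of degree NM. They are called Gaussian polynomials
or q-binomial coefficients. (See Chapters 1 and 2 of [An76], Chapter 19 of [HaWr60], or
[Ma16] for details. Also see §2.3.2.)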

13. The following are additional generating functions for partition functions (see Chapter
2 of [An76] or Section 8.10 of [GaRa90] for details):

$$\sum_{n=0}^{\infty} p(n) q^n = 1 + \sum_{n=1}^{\infty} \frac{q^n}{(1-q)(1-q^2) \cdots (1-q^n)} = 1 + \sum_{n=1}^{\infty} \frac{q^{n^2}}{(1-q)^2 (1-q^2)^2 \cdots (1-q^n)^2}$$

$$\sum_{n=0}^{\infty} Q(n) q^n = 1 + \sum_{n=1}^{\infty} \frac{q^{n(n+1)/2}}{(1-q)(1-q^2) \cdots (1-q^n)} = 1 + q + \sum_{n=2}^{\infty} q^n (1+q)(1+q^2) \cdots (1+q^{n-1})$$

$$\sum_{n=0}^{\infty} p_m(n) q^n = 1 + \sum_{n=1}^{\infty} \frac{(1-q^m)(1-q^{m+1}) \cdots (1-q^{m+n-1})}{(1-q)(1-q^2) \cdots (1-q^n)}\, q^n$$
14. See [GrKnPa94] for an algorithm for generating partitions.

15. The following table gives some values of $p_m(n)$. More extensive tables appear in
[GuGwMi58].

   n\m    0    1    2    3    4    5    6    7    8    9   10
    0     1    1    1    1    1    1    1    1    1    1    1
    1     0    1    1    1    1    1    1    1    1    1    1
    2     0    1    2    2    2    2    2    2    2    2    2
    3     0    1    2    3    3    3    3    3    3    3    3
    4     0    1    3    4    5    5    5    5    5    5    5
    5     0    1    3    5    6    7    7    7    7    7    7
    6     0    1    4    7    9   10   11   11   11   11   11
    7     0    1    4    8   11   13   14   15   15   15   15
    8     0    1    5   10   15   18   20   21   22   22   22
    9     0    1    5   12   18   23   26   28   29   30   30
   10     0    1    6   14   23   30   35   38   40   41   42


16. The following table gives values for $p(n)$ and $Q(n)$.

    n    p(n)   Q(n)  |   n     p(n)   Q(n)  |   n      p(n)    Q(n)
    0      1      1   |  17      297     38  |  34    12,310     512
    1      1      1   |  18      385     46  |  35    14,883     585
    2      2      1   |  19      490     54  |  36    17,977     668
    3      3      2   |  20      627     64  |  37    21,637     760
    4      5      2   |  21      792     76  |  38    26,015     864
    5      7      3   |  22    1,002     89  |  39    31,185     982
    6     11      4   |  23    1,255    104  |  40    37,338   1,113
    7     15      5   |  24    1,575    122  |  41    44,583   1,260
    8     22      6   |  25    1,958    142  |  42    53,174   1,426
    9     30      8   |  26    2,436    165  |  43    63,261   1,610
   10     42     10   |  27    3,010    192  |  44    75,175   1,816
   11     56     12   |  28    3,718    222  |  45    89,134   2,048
   12     77     15   |  29    4,565    256  |  46   105,558   2,304
   13    101     18   |  30    5,604    296  |  47   124,754   2,590
   14    135     22   |  31    6,842    340  |  48   147,273   2,910
   15    176     27   |  32    8,349    390  |  49   173,525   3,264
   16    231     32   |  33   10,143    448  |  50   204,226   3,658
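The recurrences of Fact 10 and the product for Q(n) in Fact 12 are enough to regenerate the tables in Facts 15 and 16. The Python sketch below is one possible way to do so; the function names `p_at_most`, `p_pentagonal`, and `distinct_parts` are illustrative choices, not Handbook notation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def p_at_most(m, n):
    """p_m(n): partitions of n with at most m parts,
    using p_m(n) = p_m(n - m) + p_{m-1}(n)."""
    if n == 0:
        return 1
    if n < 0 or m == 0:
        return 0
    return p_at_most(m, n - m) + p_at_most(m - 1, n)


def p_pentagonal(limit):
    """Return [p(0), ..., p(limit)] from the pentagonal-number recurrence
    p(n) = p(n-1) + p(n-2) - p(n-5) - p(n-7) + ..."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        k = 1
        while k * (3 * k - 1) // 2 <= n:
            sign = 1 if k % 2 else -1
            p[n] += sign * p[n - k * (3 * k - 1) // 2]
            if k * (3 * k + 1) // 2 <= n:
                p[n] += sign * p[n - k * (3 * k + 1) // 2]
            k += 1
    return p


def distinct_parts(limit):
    """Return [Q(0), ..., Q(limit)]: coefficients of prod_i (1 + q^i)."""
    q = [1] + [0] * limit
    for i in range(1, limit + 1):
        # Go downward so that each part i is used at most once.
        for n in range(limit, i - 1, -1):
            q[n] += q[n - i]
    return q
```

With these definitions, `p_pentagonal(50)[50]` and `distinct_parts(50)[50]` reproduce the last row of the table in Fact 16, and `p_at_most(m, n)` reproduces the table in Fact 15. The congruences of Fact 9 can be spot-checked on the same list.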


Examples: 

1. The number 4 has five partitions:

   4,   3 + 1,   2 + 2,   2 + 1 + 1,   1 + 1 + 1 + 1.

2. The number 4 has eight compositions:

   4,   1+3,   3+1,   2+2,   2+1+1,   1+2+1,   1+1+2,   1+1+1+1.

3. The vector partitions of (2, 1) are:

   (2,1),   (2,0) + (0,1),   (1,0) + (1,0) + (0,1),   (1,0) + (1,1).

4. The partition 18 = 5 + 4 + 4 + 2 + 1 + 1 + 1 has the Ferrers diagram in part (a)
of the following figure. Its conjugate is the partition 18 = 7 + 4 + 3 + 3 + 1, with the
Ferrers diagram in part (b) of the figure.

   • • • • •          • • • • • • •
   • • • •            • • • •
   • • • •            • • •
   • •                • • •
   •                  •
   •
   •

     (a)                  (b)

5. Identical balls into identical bins: The number of ways that n identical balls can
be placed into k identical bins, with any number of balls allowed in each bin, is given
by $p_k(n)$.

6. Identical balls into identical bins with no bin allowed to be empty: The number
of ways that n identical balls can be placed into k identical bins (n > k), with any
number of balls allowed in each bin and no bin allowed to remain empty, is given by
$p_k(n) - p_{k-1}(n)$.
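Examples 2 and 4 can be reproduced mechanically. The following Python sketch is an illustration under the conventions of this section; the helper names `conjugate` and `compositions` are hypothetical. It transposes a Ferrers diagram and enumerates compositions.

```python
def conjugate(parts):
    """Conjugate partition: the column lengths of the Ferrers diagram."""
    parts = sorted(parts, reverse=True)
    if not parts:
        return []
    # Column j (0-based) contains one dot for every part larger than j.
    return [sum(1 for p in parts if p > j) for j in range(parts[0])]


def compositions(n):
    """Yield every composition of n as a tuple (order of parts matters)."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest
```

For instance, `conjugate([5, 4, 4, 2, 1, 1, 1])` gives the partition 7 + 4 + 3 + 3 + 1 of Example 4, and `list(compositions(4))` has the eight entries of Example 2. Filtering the generator by length, or by excluding parts equal to 1, gives small checks of Facts 6 and 8.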




2.5.2 STIRLING COEFFICIENTS 


Definitions: 


A cyclic partition of a set is a partition of the set (into disjoint subsets whose union
is the entire set) where the elements of each subset are arranged into cycles. Two cyclic
partitions using the same family of subsets are distinct if any of the cycle arrangements
are different.

The Stirling cycle number ${n \brack k}$ is the number of ways to partition n objects into k
nonempty cycles.

The Stirling number of the first kind s(n, k) is the coefficient of $x^k$ in the polynomial
$x(x-1)(x-2) \cdots (x-n+1)$. Thus,

$$\sum_{k=0}^{n} s(n, k) x^k = x(x-1)(x-2) \cdots (x-n+1).$$

The Stirling subset number ${n \brace k}$ is the number of ways to partition a set of n objects
into k nonempty subsets.

The Stirling numbers of the second kind S(n, k) are defined implicitly by the
equation

$$x^n = \sum_{k=0}^{n} S(n, k)\, x(x-1)(x-2) \cdots (x-k+1).$$

The Bell number $B_n$ is the number of partitions of a set of n objects. (Eric Temple
Bell, 1883-1960)

Facts: 

1. $s(n, k)(-1)^{n-k} = {n \brack k}$.

2. $S(n, k) = {n \brace k}$.

3. The following table gives Stirling numbers of the first kind, s(n, k):



4. The following table gives Stirling subset numbers (Stirling numbers of the second kind), $S(n, k) = {n \brace k}$:



5. $B_n = \sum_{k=1}^{n} {n \brace k}$.

6. The first fifteen Bell numbers are:

   $B_1$ = 1              $B_2$ = 2               $B_3$ = 5              $B_4$ = 15
   $B_5$ = 52             $B_6$ = 203             $B_7$ = 877            $B_8$ = 4,140
   $B_9$ = 21,147         $B_{10}$ = 115,975      $B_{11}$ = 678,570     $B_{12}$ = 4,213,597
   $B_{13}$ = 27,644,437  $B_{14}$ = 190,899,322  $B_{15}$ = 1,382,958,545.

7. Table 1 lists some identities involving Stirling numbers.

Table 1  Stirling number identities.

8. The following give combinatorial interpretations of some of the identities involving
Stirling numbers:

• Stirling cycle number recursion: When partitioning n objects into k cycles, there
  are ${n-1 \brack k-1}$ ways in which the last object has a cycle to itself. Otherwise, there
  are ${n-1 \brack k}$ ways to partition the other n − 1 objects into k cycles, and then n − 1
  choices of a location into which the last object can be inserted.

• Stirling subset number recursion: When partitioning n objects into k nonempty
  subsets, there are ${n-1 \brace k-1}$ ways in which the last object has a subset to itself.
  Otherwise, there are ${n-1 \brace k}$ ways to partition the other n − 1 objects into k
  subsets, and then k choices of a subset into which the last object can be inserted.

• $\sum_{k=0}^{n} {n \brack k} = n!$: The partitions into cycles are in a one-to-one correspondence
  with the permutations of n objects, since each permutation can be represented
  as a composition of disjoint cycles.

Examples: 

1. $x(x-1)(x-2)(x-3) = x^4 - 6x^3 + 11x^2 - 6x$, and hence there are ${4 \brack 2} = 11$
permutations of {1, 2, 3, 4} with 2 cycles: (12)(34), (13)(24), (14)(23), (1)(234), (1)(324),
(2)(134), (2)(314), (3)(124), (3)(214), (4)(123), (4)(213). Also, $s(4, 2) = (-1)^{4-2} \cdot 11$.

2. $x^4 = x(x-1)(x-2)(x-3) + 6x(x-1)(x-2) + 7x(x-1) + x$, and hence there are exactly
${4 \brace 2} = 7$ set-partitions of {1, 2, 3, 4} into two blocks: {1} & {2,3,4}, {2} & {1,3,4},
{3} & {1,2,4}, {4} & {1,2,3}, {1,2} & {3,4}, {1,3} & {2,4}, {1,4} & {2,3}.
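The two recursions interpreted in Fact 8 translate directly into code. The Python sketch below is an illustration, not the Handbook's algorithm; the function names are hypothetical. It computes Stirling cycle numbers, Stirling subset numbers, signed Stirling numbers of the first kind via Fact 1, and Bell numbers via Fact 5.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling_cycle(n, k):
    """Number of ways to partition n objects into k nonempty cycles."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    # Last object alone in a cycle, or inserted after one of n-1 objects.
    return stirling_cycle(n - 1, k - 1) + (n - 1) * stirling_cycle(n - 1, k)


@lru_cache(maxsize=None)
def stirling_subset(n, k):
    """Number of ways to partition an n-set into k nonempty subsets."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0:
        return 0
    # Last object alone in a subset, or added to one of the k subsets.
    return stirling_subset(n - 1, k - 1) + k * stirling_subset(n - 1, k)


def stirling_first(n, k):
    """Signed Stirling number of the first kind, s(n, k) (Fact 1)."""
    return (-1) ** (n - k) * stirling_cycle(n, k)


def bell(n):
    """Bell number B_n as a sum of Stirling subset numbers (Fact 5)."""
    return sum(stirling_subset(n, k) for k in range(n + 1))
```

For n = 4 the values `[stirling_first(4, k) for k in range(5)]` are the coefficients 0, −6, 11, −6, 1 of the polynomial in Example 1, and `stirling_subset(4, 2)` is the 7 of Example 2.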


2.6 BURNSIDE/POLYA COUNTING FORMULA 


Burnside’s Lemma and Polya’s formula are used to count the number of “really different” 
configurations, such as tic-tac-toe patterns and placement of beads on a bracelet, in 
which various symmetries play a role. One of the scientific applications of Polya’s 
formula is the enumeration of isomers of a chemical compound. From a mathematical 
perspective, Burnside/Polya methods count orbits under a permutation group action. 
(See §5.3.1.) 


2.6.1 PERMUTATION GROUPS AND CYCLE INDEX POLYNOMIALS 
Definitions: 

A permutation on a set S is a one-to-one mapping of S onto itself. In this context,
the elements of S are called objects.

A permutation π of a finite set S is cyclic if there is a subcollection of objects that can
be arranged in a cycle $(a_1 a_2 \cdots a_n)$ so that each object $a_j$ is mapped by π onto the next
object in the cycle and every object of S not in this cycle is fixed by π, that is, mapped
to itself.

The tabular form of a permutation π on a finite set S is a matrix with two rows. In
the first row, each object from S is listed once. Below the object a is its image π(a), in
this form:

$$\begin{pmatrix} a_1 & a_2 & \cdots & a_n \\ \pi(a_1) & \pi(a_2) & \cdots & \pi(a_n) \end{pmatrix}$$

The cycle decomposition (form) of a permutation π is a concatenation of cyclic
permutations whose object subcollections are disjoint and whose product is π. (Sometimes
the 1-cycles are explicitly written and sometimes they are omitted.)

A set P of permutations of a set S is closed under composition if the composition
of each pair of permutations in P is also in P.

A set P of permutations of a set S is closed under inversion if for every permutation
$\pi \in P$, $\pi^{-1} \in P$.

A permutation group $\mathcal{G} = (P, S)$ is a nonempty set P of permutations on a set S
such that P is closed under composition and inversion.

The cycle structure of a permutation π is an expression (multivariate polynomial)
of the form $x_1^{m_1} x_2^{m_2} \cdots x_k^{m_k}$, where $m_j$ is the number of cycles of size j in the cycle
decomposition of π.

The cycle index of a permutation group $\mathcal{G}$ is the multivariate polynomial that is the
sum of the cycle structures of all the permutations in $\mathcal{G}$, divided by the number of
permutations in $\mathcal{G}$. The cycle index polynomial is written $P_{\mathcal{G}}(x_1, x_2, \ldots, x_n)$. (The
notation $P_{\mathcal{G}}$ honors George Polya (1887-1985) who greatly advanced the application of
the cycle index polynomial to counting.)

Facts: 

1. Every permutation has a tabular form. 

2 . The tabular form of a permutation is unique up to the order in which the objects 
of the permuted set are listed in the first row. 

3 . Every permutation has a cycle decomposition. 

4. The cycle decomposition of a permutation into a product of disjoint cyclic permu- 
tations is unique up to the order of the factors. 

5. The collection of all permutations on a set S forms a permutation group. 

Examples: 

1. The permutation $\begin{pmatrix} a & b & c & d \\ c & d & a & b \end{pmatrix}$ has the cycle decomposition (ac)(bd).

2. The symmetric group $\Sigma_3$ of all 6 possible permutations on {a, b, c} has the following
elements:

   (a)(b)(c), (ab)(c), (ac)(b), (a)(bc), (abc), (acb)

with respective cycle structures

   $x_1^3$, $x_1x_2$, $x_1x_2$, $x_1x_2$, $x_3$, $x_3$.

Thus, the cycle index polynomial is

$$P_{\Sigma_3} = \frac{1}{6}\left(x_1^3 + 3x_1x_2 + 2x_3\right).$$

3. The group $\Sigma_4$ of all 24 permutations on {a, b, c, d} has the following elements:

   (a)(b)(c)(d)   (ab)(c)(d)   (ac)(b)(d)   (ad)(b)(c)   (a)(bc)(d)   (a)(bd)(c)
   (a)(b)(cd)     (abc)(d)     (acb)(d)     (abd)(c)     (adb)(c)     (acd)(b)
   (adc)(b)       (a)(bcd)     (a)(bdc)     (ab)(cd)     (ac)(bd)     (ad)(bc)
   (abcd)         (abdc)       (acbd)       (acdb)       (adbc)       (adcb)

The cycle index polynomial is

$$P_{\Sigma_4} = \frac{1}{24}\left[x_1^4 + 6x_1^2x_2 + 8x_1x_3 + 3x_2^2 + 6x_4\right].$$
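The cycle indices in Examples 2 and 3 can be recomputed by listing every permutation and tallying cycle structures. In the Python sketch below, permutations are represented (as an assumption of this illustration) by tuples in which `perm[i]` is the image of `i`, and a cycle structure is recorded as the sorted tuple of cycle lengths, so that $x_1^2x_2$ appears as `(1, 1, 2)`. The names `cycle_type` and `cycle_index` are hypothetical.

```python
from collections import Counter
from itertools import permutations


def cycle_type(perm):
    """Sorted tuple of cycle lengths of perm, where perm[i] is the image of i."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return tuple(sorted(lengths))


def cycle_index(group):
    """Map each cycle type to the number of permutations having it.
    Dividing each count by len(group) gives the cycle index coefficients."""
    return Counter(cycle_type(p) for p in group)
```

Applied to `list(permutations(range(4)))`, the tally is 1, 6, 8, 3, 6 for the types `(1,1,1,1)`, `(1,1,2)`, `(1,3)`, `(2,2)`, `(4,)`, in agreement with the cycle index of Example 3.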



2.6.2 ORBITS AND SYMMETRIES 
Definitions: 

Given a permutation group $\mathcal{G} = (P, S)$, the orbit of $a \in S$ is the set $\{\pi(a) \mid \pi \in P\}$.

A symmetry of a figure (or symmetry motion) is a spatial motion of the figure
onto itself.

Facts:

1. Given a permutation group $\mathcal{G} = (P, S)$, the relation R defined by

   $a R b \iff$ there exists $\pi \in P$ such that $\pi(a) = b$

is an equivalence relation (§1.4.2), and the equivalence classes under it are precisely the
orbits.

2. The set of all symmetries on a figure forms a group.

3. The set of symmetries on a polygon induces a permutation group action on its corner
set and a permutation group action on its edge set.

Examples: 

1. Acting on the set {a, b, c, d, e} is the following permutation group:

   (a)(b)(c)(d)(e), (ab)(c)(d)(e), (a)(b)(cd)(e), and (ab)(cd)(e).

The orbits of this group are {a, b}, {c, d}, {e}. The cycle index is

$$\frac{1}{4}\left[x_1^5 + 2x_1^3x_2 + x_1x_2^2\right].$$

2. A square with corners a, b, c, d (in clockwise order) has eight possible symmetries:
four rotations in the plane around the center of the square and four reflections (which
could also be achieved by 180° spatial rotations out of the plane). See the following
figure.

   rotations                  reflections
   0°     (a)(b)(c)(d)        horizontal axis      (ad)(bc)
   90°    (abcd)              vertical axis        (ab)(cd)
   180°   (ac)(bd)            down-diagonal axis   (a)(c)(bd)
   270°   (adcb)              up-diagonal axis     (b)(d)(ac)

There is only one orbit, {a, b, c, d}, and the cycle index for the group of symmetries of
a square acting on its corner set (the dihedral group $D_4$) is

$$P_{D_4} = \frac{1}{8}\left[x_1^4 + 2x_4 + 3x_2^2 + 2x_1^2x_2\right].$$

3. A pentagon has 10 different symmetries: five rotations in the plane around the
center of the pentagon, 0° = (a)(b)(c)(d)(e), 72° = (abcde), 144° = (acebd), 216° =
(adbec), and 288° = (aedcb), and five reflections (or equivalently, spatial rotations
of 180° out of the plane) around axis lines through a corner and the middle of an
opposite side: (a)(be)(cd), (b)(ac)(de), (c)(ae)(bd), (d)(ab)(ce), and (e)(ad)(bc). See the
following figure. There is only one orbit, {a, b, c, d, e}, and the associated cycle index is
$\frac{1}{10}\left[x_1^5 + 4x_5 + 5x_1x_2^2\right]$.
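Orbits can be computed directly from the definition. The Python sketch below is illustrative only: corners are labeled 0, 1, ..., n − 1 in cyclic order, a permutation is a tuple whose entry `perm[i]` is the image of `i`, and the helper names `dihedral_group` and `orbits` are assumptions of this sketch. It builds the symmetries of a regular polygon and collects the orbit of each object as the set of its images.

```python
def dihedral_group(n):
    """The 2n symmetries of a regular n-gon as permutations of its corners."""
    rotations = [tuple((i + r) % n for i in range(n)) for r in range(n)]
    reflections = [tuple((r - i) % n for i in range(n)) for r in range(n)]
    return rotations + reflections


def orbits(group, points):
    """Partition the points into orbits under a permutation group.
    The orbit of a is {perm(a) : perm in group}; since a group contains
    the identity, a itself is always included."""
    remaining, result = set(points), []
    while remaining:
        a = min(remaining)
        orbit = {perm[a] for perm in group}
        result.append(orbit)
        remaining -= orbit
    return result
```

With the four permutations of Example 1 encoded on the labels a = 0, ..., e = 4, `orbits` returns the three orbits {0, 1}, {2, 3}, {4}; for `dihedral_group(4)` and `dihedral_group(5)` it returns a single orbit, as in Examples 2 and 3.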



2.6.3 COLOR PATTERNS AND INDUCED PERMUTATIONS 
Definitions: 

A coloring of a set S from a set of n colors is a function from S to the set {1, ..., n},
whose elements are regarded as “colors”. The set of all such colorings is denoted C(S, n).

A corner coloring of a (polygonal or polyhedral) geometric figure is a coloring of its
set of corners.

An edge coloring of a geometric figure is a coloring of its set of edges.

Let $c_1$ and $c_2$ be colorings of the set S and let π be a permutation of S. Write $\pi(c_1) = c_2$
if $c_1(a) = c_2(\pi(a))$ for every $a \in S$. The correspondence $c_1 \mapsto c_1 \circ \pi^{-1}$ is the map
induced by π on the colorings of S. (The composition $c_1 \circ \pi^{-1}$ assigns a color to
every object $a \in S$, namely the color $c_1(\pi^{-1}(a))$.)

Two corner colorings of a figure are equivalent if one can be mapped to the other by
a symmetry. Similar definitions apply to edge colorings and to face colorings.

Two colorings $c_1$ and $c_2$ of a set S are equivalent under a group $\mathcal{G} = (P, S)$ if there
is a permutation $\pi \in P$ such that $\pi(c_1) = c_2$.

A corner coloring pattern of a figure with respect to a set of symmetries is
a set of mutually equivalent colorings of the figure.

Facts: 

1. Let $\mathcal{G} = (P, S)$ be a permutation group. Then the induced action of P on the
set C(S, n) of colorings with n colors is a permutation group action.

2. When P acts on the set C(S, n) of colorings of S, the numbers of permuted objects 
and orbits, and the cycle index polynomial, are different from when P acts on S itself. 

3. In permuting the set S of corners of a figure, a symmetry of a figure simultaneously 
induces a permutation of the set of all its corner colorings. An analogous fact holds for 
edge colorings. 

Examples: 

1. In Example 2 of §2.6.2, a permutation group of 8 elements acts on the four corners
of a square. There is only one orbit, and the cycle index is $\frac{1}{8}\left[x_1^4 + 2x_4 + 3x_2^2 + 2x_1^2x_2\right]$.
The following figure shows what happens when the same group acts on the set of black-white
colorings. The permuted set has 16 colorings, there are 6 orbits, and the cycle
index polynomial is $\frac{1}{8}\left[x_1^{16} + 2x_1^2x_2x_4^3 + 3x_1^4x_2^6 + 2x_1^8x_2^4\right]$.


[Figure: the 16 black-white colorings of the corners of the square, grouped into 6 orbits.]
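The induced action in Example 1 can be computed explicitly. The Python sketch below is an illustration under stated assumptions: corners a, b, c, d are labeled 0 to 3, the eight symmetries are written out as tuples with `perm[i]` the image of `i`, and `induced` and `cycle_lengths` are hypothetical helper names. It builds, for each symmetry π, the permutation $c \mapsto c \circ \pi^{-1}$ of the 16 colorings and tallies its cycle lengths.

```python
from collections import Counter
from itertools import product

# Symmetries of the square on corners a, b, c, d = 0, 1, 2, 3.
SQUARE = [
    (0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2),  # rotations
    (3, 2, 1, 0), (1, 0, 3, 2), (0, 3, 2, 1), (2, 1, 0, 3),  # reflections
]


def induced(perm, colorings):
    """Permutation of the list of colorings induced by perm (c -> c o perm^-1),
    returned as a tuple of indices into `colorings`."""
    index = {c: i for i, c in enumerate(colorings)}
    images = []
    for c in colorings:
        image = [None] * len(perm)
        for a, pa in enumerate(perm):
            image[pa] = c[a]  # corner perm(a) receives the color c had at a
        images.append(index[tuple(image)])
    return tuple(images)


def cycle_lengths(perm):
    """Counter mapping cycle length -> number of cycles of that length."""
    seen, lengths = set(), Counter()
    for start in range(len(perm)):
        if start in seen:
            continue
        x, length = start, 0
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths[length] += 1
    return lengths
```

With `colorings = list(product("bw", repeat=4))`, a 90° rotation induces 2 fixed colorings, one 2-cycle, and three 4-cycles, which is the term $x_1^2x_2x_4^3$ of the cycle index stated in the example.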


2.6.4 FIXED POINTS AND BURNSIDE’S LEMMA 
Definition: 

An element $a \in S$ is a fixed point of the permutation π if π(a) = a. The set of all
fixed points of π is denoted fix(π).


Facts:

1. The number of fixed points of a permutation π equals the number of 1-cycles in its
cycle decomposition.

2. Burnside’s Lemma: Let $\mathcal{G}$ be a group of permutations acting on a set S. Then the
number of orbits induced on S is given by

$$\frac{1}{|\mathcal{G}|} \sum_{\pi \in \mathcal{G}} |\mathrm{fix}(\pi)|$$

where $\mathrm{fix}(\pi) = \{x \in S \mid \pi(x) = x\}$.


Note: The theorem commonly called “Burnside’s Lemma” originated with Georg Frobenius
(1848-1917). A widely available book by William Burnside (1852-1927) published
in 1911 stated and proved the same result, without mentioning its prior discovery.


3. Evaluation of the sum in Burnside’s Lemma is simplified by using the cycle index
polynomial and Fact 1. For each term in the polynomial, multiply the coefficient by the
exponent of $x_1$, and then sum these products.


4. Special Burnside’s Lemma (for colorings): Let $\mathcal{G}$ be a group of permutations acting
on a set S. Then the number of orbits induced on C(S, n) (the set of colorings of S
from a set of n colors) is given by substituting n for each variable in the cycle index
polynomial.


5. The following table gives information on the number of corner coloring patterns of
selected figures.

   figure        number of colors
                   2        3        4      m
   triangle        4       10       20      $\frac{1}{6}[m^3 + 3m^2 + 2m]$
   square          6       21       55      $\frac{1}{8}[m^4 + 2m^3 + 3m^2 + 2m]$
   pentagon        8       39      136      $\frac{1}{10}[m^5 + 5m^3 + 4m]$
   hexagon        13       92      430      $\frac{1}{12}[m^6 + 3m^4 + 4m^3 + 2m^2 + 2m]$
   heptagon       18      198    1,300      $\frac{1}{14}[m^7 + 7m^4 + 6m]$
   octagon        30      498    4,435      $\frac{1}{16}[m^8 + 4m^5 + 5m^4 + 2m^2 + 4m]$
   nonagon        46    1,219   15,084      $\frac{1}{18}[m^9 + 9m^5 + 2m^3 + 6m]$
   decagon        78    3,210   53,764      $\frac{1}{20}[m^{10} + 5m^6 + 6m^5 + 4m^2 + 4m]$
   tetrahedron     5       15       36      $\frac{1}{12}[m^4 + 11m^2]$
   cube           23      333    2,916      $\frac{1}{24}[m^8 + 17m^4 + 6m^2]$


Examples: 

1. In Example 1 of §2.6.2, the permutation group is

   {(a)(b)(c)(d)(e), (ab)(c)(d)(e), (a)(b)(cd)(e), (ab)(cd)(e)}.

The cycle index is $\frac{1}{4}\left[x_1^5 + 2x_1^3x_2 + x_1x_2^2\right]$. By Burnside’s Lemma and Fact 3 there are
$\frac{1}{4}[1 \cdot 5 + 2 \cdot 3 + 1 \cdot 1] = \frac{12}{4} = 3$ orbits. The orbits are {a, b}, {c, d}, {e}.

2. Example 1 of §2.6.3 shows 16 colorings of the corners of the square with colors
black or white. There are 6 orbits, and the cycle index for the action on the colorings
is $\frac{1}{8}\left[x_1^{16} + 2x_1^2x_2x_4^3 + 3x_1^4x_2^6 + 2x_1^8x_2^4\right]$. By Burnside’s Lemma and Fact 3, there are
$\frac{1}{8}[1 \cdot 16 + 2 \cdot 2 + 3 \cdot 4 + 2 \cdot 8] = \frac{48}{8} = 6$ orbits.

It is simpler to apply Special Burnside’s Lemma to the cycle index for the action
on the square (from Example 2 of §2.6.2), $\frac{1}{8}\left[x_1^4 + 2x_4 + 3x_2^2 + 2x_1^2x_2\right]$, which yields
$\frac{1}{8}\left[1 \cdot 2^4 + 2 \cdot 2 + 3 \cdot 2^2 + 2 \cdot 2^2 \cdot 2\right] = 6$ orbits.

3. (Continuing Example 3 of §2.6.2): The cycle index of the group of symmetries
of the pentagon is $\frac{1}{10}\left[x_1^5 + 4x_5 + 5x_1x_2^2\right]$. By Special Burnside’s Lemma, the number
of m-colorings of the corners of an unoriented pentagon is $\frac{1}{10}\left[m^5 + 4m + 5m^3\right]$. For
m = 3, the formula gives $\frac{1}{10}(243 + 12 + 135) = 39$ 3-coloring patterns of a pentagon.

4. A cube has 24 rotational symmetries, which act on the corners. The identity symmetry
has cycle structure $x_1^8$. There are three additional classes of symmetries, as follows:

(a) Rotations of 90°, 180°, or 270° about an axis line through the middles of opposite
faces, for example, through faces abcd and efgh in part (a) of the following figure. A 90°
rotation, such as (abcd)(efgh), has cycle structure $x_4^2$. All 270° rotations have that
same structure. A 180° rotation, such as (ac)(bd)(eg)(fh), has cycle structure $x_2^4$.
There are three pairs of opposite faces, and so the total contribution to the cycle index
of opposite-face rotations is $6x_4^2 + 3x_2^4$.

[Figure: (a) rotation about opposite faces; (b) rotation about opposite edges; (c) rotation about opposite corners.]

(b) Rotating 180° about an axis line through the middles of opposite edges, for example,
through edges ad and fg in part (b) of the figure. This rotation, (ad)(bh)(ce)(fg), has
cycle structure $x_2^4$. There are six pairs of opposite edges, and so the total contribution
of opposite-edge rotations is $6x_2^4$.

(c) Rotating 120° or 240° about an axis line through opposite corners, for example,
about the line through corners a and g in part (c) of the figure. Any 120° rotation, such
as (a)(bde)(chf)(g), has cycle structure $x_1^2x_3^2$. A 240° rotation has the same structure.
There are four pairs of opposite corners, and so the contribution of opposite-corner
rotations is $8x_1^2x_3^2$.

Collect terms to obtain the cycle index $\frac{1}{24}\left[x_1^8 + 6x_4^2 + 9x_2^4 + 8x_1^2x_3^2\right]$. Thus, the number
of m-colorings of the corners of an unoriented cube is $\frac{1}{24}\left[m^8 + 6m^2 + 9m^4 + 8m^4\right]$. For
m = 2 and 3, the formula gives 23 2-coloring patterns and 333 3-coloring patterns.
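Special Burnside’s Lemma amounts to averaging $m^{c(\pi)}$ over the group, where $c(\pi)$ is the number of cycles of π. The Python sketch below is an illustration with hypothetical helper names. It applies this to the symmetries of a regular polygon (corners labeled 0 to n − 1, an assumption of the sketch) and to the cube's cycle index from Example 4, entered by hand as a table of cycle types.

```python
def count_cycles(perm):
    """Number of cycles (including 1-cycles) of perm, where perm[i] = image of i."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return cycles


def dihedral_group(n):
    """The 2n symmetries of a regular n-gon acting on corners 0..n-1."""
    rotations = [tuple((i + r) % n for i in range(n)) for r in range(n)]
    reflections = [tuple((r - i) % n for i in range(n)) for r in range(n)]
    return rotations + reflections


def coloring_patterns(group, m):
    """Substitute m for every variable of the cycle index (Fact 4)."""
    return sum(m ** count_cycles(p) for p in group) // len(group)


# Cycle index of the cube's rotations on corners: cycle type -> count.
CUBE = {(1,) * 8: 1, (4, 4): 6, (2, 2, 2, 2): 9, (1, 1, 3, 3): 8}


def patterns_from_cycle_index(index, m):
    """Same substitution, starting from a tabulated cycle index."""
    order = sum(index.values())
    return sum(count * m ** len(ctype) for ctype, count in index.items()) // order
```

For example, `coloring_patterns(dihedral_group(5), 3)` gives the 39 patterns of Example 3, and `patterns_from_cycle_index(CUBE, 2)` gives the 23 patterns of Example 4.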


2.6.5 POLYA’S ENUMERATION FORMULA 
Definition: 

A pattern inventory is a generating function (§3.2) that enumerates the numbers of 
coloring patterns of a given figure. 

Facts: 

1. Polya’s enumeration formula: Let $\mathcal{G} = (P, S)$ be a permutation group and let
$\{c_1, \ldots, c_n\}$ be a set of names for n colors for the objects of S. Then the pattern
inventory with respect to $\mathcal{G}$ for the set of all n-colorings of S is given by substituting
$(c_1^j + \cdots + c_n^j)$ for $x_j$ in the cycle index $P_{\mathcal{G}}(x_1, \ldots, x_m)$.

Note: This theorem was published in 1937. Essentially the same result was derived by
H. Redfield in 1927.

2. Polya’s enumeration formula has many applications in enumerating various families
of graphs. This approach was pioneered by F. Harary. (See [HaPa73].)

3. Polya’s enumeration formula has many applications in which some practical question
is modeled as a graph coloring problem.

Examples: 

1. The pattern inventory of black-white colorings of the corners of a triangle is

$$1b^3 + 1b^2w + 1bw^2 + 1w^3.$$

This means there is one coloring pattern with all 3 corners black, one with 2 black
corners and 1 white corner, etc.

2. (Continuing Example 2 of §2.6.4): For corner colorings of the square, the cycle
index is

$$P_{D_4}(x_1, x_2, x_3, x_4) = \frac{1}{8}\left[x_1^4 + 2x_1^2x_2 + 3x_2^2 + 2x_4\right].$$

By Polya’s enumeration formula, the pattern inventory for black-white colorings of the
corners of the square is

$$P_{D_4}\left((b + w), (b^2 + w^2), (b^3 + w^3), (b^4 + w^4)\right)$$
$$= \frac{1}{8}\left[(b + w)^4 + 2(b + w)^2(b^2 + w^2) + 3(b^2 + w^2)^2 + 2(b^4 + w^4)\right]$$
$$= \frac{1}{8}\left[8b^4 + 8b^3w + 16b^2w^2 + 8bw^3 + 8w^4\right] = 1b^4 + 1b^3w + 2b^2w^2 + 1bw^3 + 1w^4.$$

This pattern inventory may be confirmed by examining the drawing in Example 1
of §2.6.3.

3. (Continuing Example 3 of §2.6.4): For corner colorings of the pentagon, the cycle
index is

$$P_{D_5}(x_1, x_2, \ldots, x_5) = \frac{1}{10}\left[x_1^5 + 4x_5 + 5x_1x_2^2\right].$$

By Polya’s enumeration formula, the pattern inventory for black-white colorings of the
corners of the pentagon (confirmable by drawing pictures) is

$$P_{D_5}\left((b + w), (b^2 + w^2), (b^3 + w^3), (b^4 + w^4), (b^5 + w^5)\right)$$
$$= \frac{1}{10}\left[(b + w)^5 + 4(b^5 + w^5) + 5(b + w)(b^2 + w^2)^2\right]$$
$$= \frac{1}{10}\left[10b^5 + 10b^4w + 20b^3w^2 + 20b^2w^3 + 10bw^4 + 10w^5\right]$$
$$= 1b^5 + 1b^4w + 2b^3w^2 + 2b^2w^3 + 1bw^4 + 1w^5.$$

4. (Continuing Example 4 of §2.6.4): For corner colorings of the cube, the cycle index
is

$$P_{\mathcal{G}}(x_1, \ldots, x_4) = \frac{1}{24}\left[x_1^8 + 6x_4^2 + 9x_2^4 + 8x_1^2x_3^2\right].$$

By Polya’s enumeration formula, the pattern inventory for black-white colorings of the
corners of the cube is

$$P_{\mathcal{G}}\left((b + w), (b^2 + w^2), (b^3 + w^3), (b^4 + w^4)\right)$$
$$= \frac{1}{24}\left[(b + w)^8 + 6(b^4 + w^4)^2 + 9(b^2 + w^2)^4 + 8(b + w)^2(b^3 + w^3)^2\right]$$
$$= b^8 + b^7w + 3b^6w^2 + 3b^5w^3 + 7b^4w^4 + 3b^3w^5 + 3b^2w^6 + bw^7 + w^8.$$

5. Organic chemistry: Two structurally different compounds with the same chemical
formula are called isomers. For instance, to two of the six carbons (C) in a ring there
might be attached a hydrogen (H), and to each of the four other carbons some other
radical (R), thereby yielding the chemical formula $C_6H_2R_4$. The number of different
isomers (structurally different arrangements of the radicals) is the same as the number
of coloring patterns of a hexagon when two of the corners are “colored” H and four
“colored” R. The cycle index for the symmetries of a hexagon, in terms of corner
permutations, is

$$P_{D_6}(x_1, \ldots, x_6) = \frac{1}{12}\left[x_1^6 + 2x_6 + 2x_3^2 + 4x_2^3 + 3x_1^2x_2^2\right].$$

Substituting $(H^j + R^j)$ for $x_j$ yields a pattern inventory listing the number of isomers
of $C_6H_iR_{6-i}$:

$$\frac{1}{12}\left[(H + R)^6 + 2(H^6 + R^6) + 2(H^3 + R^3)^2 + 4(H^2 + R^2)^3 + 3(H + R)^2(H^2 + R^2)^2\right]$$
$$= \frac{1}{12}\left[12H^6 + 12H^5R + 36H^4R^2 + 36H^3R^3 + 36H^2R^4 + 12HR^5 + 12R^6\right]$$
$$= 1H^6 + 1H^5R + 3H^4R^2 + 3H^3R^3 + 3H^2R^4 + 1HR^5 + 1R^6.$$

The three possible coloring patterns corresponding to $3H^2R^4$ are shown in the following
figure:
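For two colors, the substitution in Polya’s enumeration formula can be carried out with one-variable polynomial arithmetic. The Python sketch below is a simplification chosen for this illustration: it sets w = 1 and tracks only the exponent of b, so $x_j$ becomes $1 + b^j$. The cycle index is entered as a dictionary from cycle types to counts; `poly_mul` and `pattern_inventory` are hypothetical names.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def pattern_inventory(cycle_index, n_objects):
    """Two-color pattern inventory: entry i is the number of coloring
    patterns with exactly i objects colored b (the rest colored w)."""
    order = sum(cycle_index.values())
    total = [0] * (n_objects + 1)
    for ctype, count in cycle_index.items():
        term = [1]
        for j in ctype:
            factor = [0] * (j + 1)
            factor[0] = factor[j] = 1  # x_j -> 1 + b^j  (w set to 1)
            term = poly_mul(term, factor)
        for i, c in enumerate(term):
            total[i] += count * c
    return [c // order for c in total]


SQUARE = {(1, 1, 1, 1): 1, (1, 1, 2): 2, (2, 2): 3, (4,): 2}
HEXAGON = {(1,) * 6: 1, (6,): 2, (3, 3): 2, (2, 2, 2): 4, (1, 1, 2, 2): 3}
CUBE = {(1,) * 8: 1, (4, 4): 6, (2, 2, 2, 2): 9, (1, 1, 3, 3): 8}
```

Here `pattern_inventory(HEXAGON, 6)` returns the coefficients 1, 1, 3, 3, 3, 1, 1 of Example 5, so the entry for two corners of one color and four of the other is 3, the number of isomers of $C_6H_2R_4$.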



2.7 MOBIUS INVERSION COUNTING 


Mobius inversion is an important tool used to solve a variety of counting problems such 
as counting how many numbers are relatively prime to some given number (without 
individually checking each smaller number) and counting certain types of circular ar- 
rangements. It generalizes the principle of inclusion/exclusion. (Augustus Ferdinand 
Mobius, 1790-1868) 


2.7.1 MOBIUS INVERSION 


Definitions: 


The Kronecker delta function δ(x, y) is defined by the rule

$$\delta(x, y) = \begin{cases} 1 & \text{if } x = y \\ 0 & \text{otherwise.} \end{cases}$$

The Mobius function is the function μ from the set of positive integers to the set of
integers where

$$\mu(m) = \begin{cases} 1 & \text{if } m = 1 \\ (-1)^k & \text{if } m = p_1p_2 \cdots p_k \text{ (the product of } k \text{ distinct primes)} \\ 0 & \text{if } m \text{ is divisible by the square of a prime.} \end{cases}$$

Note: See Chapter 11 for Mobius functions defined on partially ordered sets.


Facts: 

1. For the Mobius function μ defined on the set of positive integers:

• μ is multiplicative: if gcd(m, n) = 1, then μ(mn) = μ(m)μ(n);

• μ is not completely multiplicative: μ(mn) = μ(m)μ(n) is not always true;

• $\sum_{d|n} \mu(d) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n > 1 \end{cases}$ where the sum is taken over all positive divisors of n.

2. Mobius inversion formula: If f(n) and g(n) are defined for all positive integers and
$f(n) = \sum_{d|n} g(d)$, then $g(n) = \sum_{d|n} \mu(d) f(n/d)$.

3. For every positive integer n, $n = \sum_{d|n} \phi(d)$. (For example, 6 = φ(1) + φ(2) + φ(3) +
φ(6) = 1 + 1 + 2 + 2.)
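The facts above are easy to test numerically. The Python sketch below is illustrative; `mobius`, `divisors`, and `mobius_invert` are hypothetical helper names. It computes μ by trial division and applies the inversion formula of Fact 2. Applied to f(n) = n, Fact 3 implies that the inverted function is Euler's φ.

```python
def mobius(n):
    """Mobius function mu(n) for a positive integer n, by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # divisible by the square of a prime
            result = -result
        p += 1
    # Any remaining factor n > 1 is a single prime.
    return -result if n > 1 else result


def divisors(n):
    """All positive divisors of n, in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_invert(f, n):
    """Return g(n), where f(n) = sum of g(d) over the divisors d of n."""
    return sum(mobius(d) * f(n // d) for d in divisors(n))
```

For instance, `mobius_invert(lambda k: k, 6)` evaluates to 2 = φ(6), consistent with the example in Fact 3.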

Examples: 

1. Circular permutations with repetitions: Given an alphabet of m letters, how many
circular permutations of length n are possible, if repeated letters are allowed and two
permutations are the same if the second can be obtained from the first by rotation?
The problem was first solved by Percy A. MacMahon in 1892.

A circular permutation of length n has a period d, where d | n. (The period of a
circular permutation, viewed as a circular string, is the length of the shortest substring
that repeats end-to-end to give the entire string.) Let g(d) be the number of length d
circular permutations that have period d. A circular permutation of length n can be
constructed from one of length d (where d | n) by concatenating it with itself n/d times.
For example, the circular permutation aabbaabb (where beginning and end are joined) of
period four can be obtained by taking the circular permutation aabb and opening it up
at one of four spots between the letters, to obtain any of four linear strings aabb, abba,
bbaa, and baab. Join one of these to itself, obtaining aabbaabb, abbaabba, bbaabbaa, and
baabbaab, and then join the beginning and the end to form the circular permutation
aabbaabb.

For any positive integer k, there are d g(d) linear strings of length k obtained by
taking k/d repetitions of the linear strings of length d that have period d, where d | k.
Therefore, the total number of linear strings of length k where the objects are chosen
from m types is $\sum_{d|k} d\, g(d) = m^k$. Applying the Mobius inversion formula to $m^k$ and $k\, g(k)$
yields $g(k) = \frac{1}{k} \sum_{d|k} \mu(d)\, m^{k/d}$. Therefore, the total number of circular permutations of
length n where the elements are chosen from an alphabet of size m is $\sum_{d|n} g(d)$, which
is equal to

$$\sum_{k|n} \left( \frac{1}{k} \sum_{d|k} \mu(d)\, m^{k/d} \right).$$



2. Circular permutations with repetitions with specified numbers of each type of object:
Suppose there are a total of $n$ objects of $t$ types, with $a_i$ of type $i$ ($i = 1, \ldots, t$), where
$a_1 + \cdots + a_t = n$. If $a = \gcd(a_1, \ldots, a_t)$, these circular permutations can be generated
as in Example 1 by taking a circular permutation of period $n/d$ (where $d|a$) with $a_i/d$
objects of type $i$ ($i = 1, \ldots, t$), breaking it open, and laying it end-to-end $d$ times.
Let $g(k)$ be the number of such circular permutations of length $k$ that have period $k$.
Then the total number of linear strings of length $n$ with $a_i$ objects of type $i$ is given by

$\sum_{d|a} \frac{n}{d}\, g\!\left(\frac{n}{d}\right) = \frac{n!}{a_1! \cdots a_t!}$.

By the Möbius inversion formula, for each divisor $k$ of $a$,

$g\!\left(\frac{n}{k}\right) = \frac{k}{n} \sum_{d|(a/k)} \mu(d) \frac{(n/(kd))!}{(a_1/(kd))! \cdots (a_t/(kd))!}$.

Summing $g(n/k)$ over all divisors $k$ of $a$ gives the desired total number of circular permutations:

$\sum_{k|a} g\!\left(\frac{n}{k}\right) = \sum_{k|a} \left( \frac{k}{n} \sum_{d|(a/k)} \mu(d) \frac{(n/(kd))!}{(a_1/(kd))! \cdots (a_t/(kd))!} \right)$.
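The count in Example 1 can be compared against direct enumeration. The sketch below is illustrative only: the function names are hypothetical, the Möbius function is recomputed locally so that the block is self-contained, and the brute-force check is practical only for small $n$ and $m$.

```python
from itertools import product


def mobius(m: int) -> int:
    """Möbius function, computed by trial division."""
    result, p = 1, 2
    while p * p <= m:
        if m % p == 0:
            m //= p
            if m % p == 0:
                return 0
            result = -result
        p += 1
    return -result if m > 1 else result


def aperiodic_count(k: int, m: int) -> int:
    """g(k): circular permutations of length k and period k over m letters."""
    total = sum(mobius(d) * m ** (k // d) for d in range(1, k + 1) if k % d == 0)
    return total // k


def circular_count(n: int, m: int) -> int:
    """Sum of g(k) over all divisors k of n."""
    return sum(aperiodic_count(k, m) for k in range(1, n + 1) if n % k == 0)


def circular_count_brute(n: int, m: int) -> int:
    """Count rotation classes directly by choosing a canonical rotation."""
    classes = set()
    for s in product(range(m), repeat=n):
        classes.add(min(s[i:] + s[:i] for i in range(n)))
    return len(classes)
```

For a two-letter alphabet and length 4, `circular_count(4, 2)` gives 6 (namely aaaa, aaab, aabb, abab, abbb, bbbb).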


2.8 YOUNG TABLEAUX 


Arrays called Young tableaux were introduced by the Reverend Alfred Young (1873- 
1940). These arrays are used in combinatorics and the theories of symmetric functions, 
which are the subject of this section. Young tableaux are also used in the analysis 
of representations of the symmetric group. They make it possible to approach many 
results about representation theory from a concrete combinatorial viewpoint. 


2.8.1 TABLEAUX COUNTING FORMULAS 
Definitions: 

The hook $H_{i,j}$ of cell $(i, j)$ in the Ferrers diagram for a partition $\lambda$ is the set
$\{ (k, j) \in \lambda \mid k \ge i \} \cup \{ (i, k) \in \lambda \mid k \ge j \}$,

that is, the set consisting of the cell $(i, j)$, all cells in its row to its right, and all cells in
its column below it.

The hooklength $h_{i,j}$ of cell $(i, j)$ is the number $|H_{i,j}|$ of cells in its hook.

A Young tableau is an array obtained by replacing each cell of the Ferrers diagram
by a positive integer.

The shape of a Young tableau is the partition corresponding to the underlying Ferrers
diagram. The notation $\lambda \vdash n$ indicates that $\lambda$ partitions the number $n$.

A Young tableau is semistandard (an SSYT) if the entries in each row are weakly
increasing and the entries in each column are strictly increasing.

A semistandard Young tableau of shape $\lambda \vdash n$ is standard (an SYT) if each number
$1, \ldots, n$ occurs exactly once as an entry. The number of SYT of shape $\lambda$ is denoted $f_\lambda$.

If $G$ is a group (see §5.2) then an involution is an element $g \in G$ such that $g^2$ is the
identity. The number of involutions in the symmetric group $S_n$ (or $\Sigma_n$) (the group of
all permutations on the set $\{1, 2, \ldots, n\}$) is denoted inv($n$).

The following table summarizes notation for Young tableaux:

notation                                      meaning
$\lambda = (\lambda_1, \ldots, \lambda_l)$    partition (with parts $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_l$)
$\lambda \vdash n$                            $\lambda$ partitions the number $n$
$(i, j)$                                      cell in a Ferrers diagram
$H_{i,j}$                                     hook of cell $(i, j)$
$h_{i,j}$                                     hooklength of hook $H_{i,j}$
$f_\lambda$                                   number of SYT of shape $\lambda$
inv($n$)                                      number of involutions in $S_n$


Facts: 

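1. Frame-Robinson-Thrall hook formula [1954]: The number of SYT of fixed shape $\lambda \vdash n$ is

$f_\lambda = \frac{n!}{\prod_{(i,j) \in \lambda} h_{i,j}}$.

2. Frobenius determinantal formula [1900]: The number of SYT of fixed shape $\lambda = (\lambda_1, \ldots, \lambda_l)$
is given by the determinant

$f_\lambda = n! \left| \frac{1}{(\lambda_i + j - i)!} \right|_{1 \le i, j \le l}$,

where $1/r! = 0$ if $r < 0$.

3. Summations involving the number of SYT:

$\sum_{\lambda \vdash n} f_\lambda = \text{inv}(n)$,    $\sum_{\lambda \vdash n} f_\lambda^2 = n!$.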

4 . Young tableaux can be used to find the number of permutations with specified 
lengths of their longest increasing subsequences and longest decreasing subsequences. 
[Be71] 
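Facts 1 and 3 lend themselves to a quick computational check. The following is a minimal sketch under the assumption that a shape is given as a weakly decreasing tuple of positive integers; the helper names `hook_lengths`, `num_syt`, and `partitions` are illustrative, not taken from the Handbook.

```python
from math import factorial


def hook_lengths(shape):
    """Hooklength of every cell of a partition, row by row."""
    if not shape:
        return []
    # cols[j] = number of cells in column j (the conjugate partition)
    cols = [sum(1 for part in shape if part > j) for j in range(shape[0])]
    return [[(shape[i] - j) + (cols[j] - i) - 1 for j in range(shape[i])]
            for i in range(len(shape))]


def num_syt(shape):
    """Number of standard Young tableaux of the given shape (hook formula)."""
    product = 1
    for row in hook_lengths(shape):
        for h in row:
            product *= h
    return factorial(sum(shape)) // product


def partitions(n, max_part=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest
```

For example, `hook_lengths((3, 2))` returns `[[4, 3, 1], [2, 1]]` and `num_syt((3, 2))` returns 5; summing `num_syt(p) ** 2` over `partitions(5)` gives 120 = 5!, in agreement with Fact 3.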


Examples: 

1. If A = (3, 2) then a complete list of SYT is: 

123 124 125 134 135 

45 35 34 25 24 

2 . If A = (2, 2) then a complete list of SSYT with entries at most 3 is: 

11 11 22 11 12 12 

22 33 33 23 23 33 

3. For the partition (3, 2), $H_{1,1} = \{(1,1), (2,1), (1,2), (1,3)\}$. In the following diagram
each cell of (3, 2) is replaced with its hooklength.

4 3 1
2 1

The hook formula (Fact 1) gives the number of SYT of shape (3, 2): $f_{(3,2)} = \frac{5!}{4 \cdot 3 \cdot 1 \cdot 2 \cdot 1} = 5$.
The determinantal formula (Fact 2) gives the same result:

$f_{(3,2)} = 5! \begin{vmatrix} \frac{1}{3!} & \frac{1}{4!} \\ \frac{1}{1!} & \frac{1}{2!} \end{vmatrix} = 5$.


4. For the partitions of $n = 3$, $f_{(3)} = 1$, $f_{(2,1)} = 2$, $f_{(1,1,1)} = 1$, so the summation
formulas become:

$\sum_{\lambda \vdash 3} f_\lambda = 4 = \text{inv}(3)$;

$\sum_{\lambda \vdash 3} f_\lambda^2 = 6 = 3!$.


2.8.2 TABLEAUX ALGORITHMS 

Definitions: 

An inner corner of a partition $\lambda$ is a cell $(i, j) \in \lambda$ such that $(i + 1, j), (i, j + 1) \notin \lambda$.
An outer corner of a partition $\lambda$ is a cell $(i, j) \notin \lambda$ such that $(i - 1, j), (i, j - 1) \in \lambda$.


Algorithm 1: Generate at random a standard tableau of given shape.

input: a shape $\lambda'$ such that $\lambda' \vdash n$ {For a summary of notation, see §2.5.1.}
output: a standard Young tableau of shape $\lambda'$, chosen uniformly at random

$\lambda := \lambda'$
while $\lambda$ is nonempty
    {find an inner corner $(i, j) \in \lambda$}
    choose (with probability $1/n$) any cell $(i, j) \in \lambda$
    while the current cell $(i, j)$ is not an inner corner
        choose (with probability $1/(h_{i,j} - 1)$) a cell $(i', j') \in H_{i,j} - \{(i, j)\}$
        $(i, j) := (i', j')$
    assign label $n$ to inner corner $(i, j)$
    $\lambda := \lambda - \{(i, j)\}$; $n := n - 1$
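Algorithm 1 can be sketched in Python as follows. This is an illustrative rendering rather than a reference implementation: cells are 0-indexed, the shape is assumed to be a weakly decreasing tuple of positive integers, and the names `random_syt` and `is_standard` (a helper for checking the output) are hypothetical.

```python
import random


def random_syt(shape, rng=random):
    """Hook walk of Algorithm 1; returns the tableau as a list of rows."""
    rows = list(shape)                        # current shape; shrinks as labels are placed
    tableau = [[0] * part for part in shape]
    for label in range(sum(shape), 0, -1):
        # choose a cell of the current shape uniformly at random
        cells = [(i, j) for i, part in enumerate(rows) for j in range(part)]
        i, j = rng.choice(cells)
        while True:
            # remaining cells of the hook: right of (i, j) in its row, below it in its column
            hook = [(i, c) for c in range(j + 1, rows[i])]
            hook += [(r, j) for r in range(i + 1, len(rows)) if rows[r] > j]
            if not hook:                      # nothing right or below: inner corner reached
                break
            i, j = rng.choice(hook)
        tableau[i][j] = label
        rows[i] -= 1                          # delete the corner from the current shape
    return tableau


def is_standard(tableau):
    """Check that entries are 1..n and increase along rows and down columns."""
    n = sum(len(row) for row in tableau)
    if sorted(x for row in tableau for x in row) != list(range(1, n + 1)):
        return False
    rows_ok = all(row[k] < row[k + 1] for row in tableau for k in range(len(row) - 1))
    cols_ok = all(tableau[i][j] < tableau[i + 1][j]
                  for i in range(len(tableau) - 1)
                  for j in range(len(tableau[i + 1])))
    return rows_ok and cols_ok
```

Passing a seeded generator, for example `random_syt((5, 5, 5, 2), random.Random(1))`, makes a run reproducible; `is_standard` can then confirm that the result is a standard tableau of the requested shape.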


Examples: 

1. The diagrams in the following figure illustrate a plausible sequence of current cells 
chosen as the first step of the Greene-Nijenhuis-Wilf algorithm [1979] (Algorithm 1) 
finds an inner corner of a tableau of shape A = (5, 5, 5, 2). 


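[Figure: successive current cells $c$ in the Ferrers diagram of $\lambda = (5, 5, 5, 2)$, $n = 17$; the first cell is chosen with probability pr($c$) = 1/17.]
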
Algorithm 2: Robinson-Schensted.

input: a permutation $\pi \in S_n$ where $\pi = \begin{pmatrix} 1 & 2 & \cdots & n \\ \pi_1 & \pi_2 & \cdots & \pi_n \end{pmatrix}$
output: a pair $(P, Q)$ of standard Young tableaux of the same shape $\lambda \vdash n$

$P_0 := \emptyset$; $Q_0 := \emptyset$
for $k := 1$ to $n$
    $r := 1$; $b := \pi_k$; $P_k := P_{k-1}$; $Q_k := Q_{k-1}$; exit := FALSE
    while exit = FALSE
        {find the insertion column $c$ in row $r$ of tableau $P_k$}
        $c := 1$
        while $P_k[r, c] \ne \emptyset$ and $b > P_k[r, c]$
            $c := c + 1$
        {insert $b$}
        if $P_k[r, c] = \emptyset$ then
            $P_k[r, c] := b$; exit := TRUE
        else
            {bump the displaced entry into the next row}
            $bb := P_k[r, c]$; $P_k[r, c] := b$; $b := bb$; $r := r + 1$
    $Q_k[r, c] := k$
$P := P_n$; $Q := Q_n$

2. The permutation $\pi = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 6 & 2 & 3 & 1 & 7 & 5 & 4 \end{pmatrix}$ yields a sequence of tableaux
pairs $(P_k, Q_k)$, $k = 1, \ldots, 7$, under the Robinson-Schensted algorithm [1938, 1961] (Algorithm 2).
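Row insertion as used by the Robinson-Schensted algorithm can be sketched in Python as follows. This is a minimal illustration, assuming the permutation is supplied in one-line notation as a list of distinct integers; the function name is hypothetical, and binary search (`bisect_right`) stands in for the linear column scan of the pseudocode.

```python
from bisect import bisect_right


def robinson_schensted(perm):
    """Return the pair (P, Q) of tableaux, each as a list of rows."""
    P, Q = [], []
    for k, b in enumerate(perm, start=1):
        r = 0
        while True:
            if r == len(P):                   # ran off the bottom: start a new row
                P.append([b])
                Q.append([k])
                break
            row = P[r]
            c = bisect_right(row, b)          # first column whose entry exceeds b
            if c == len(row):                 # b is larger than the whole row: append
                row.append(b)
                Q[r].append(k)
                break
            row[c], b = b, row[c]             # bump the displaced entry to the next row
            r += 1
    return P, Q
```

Applied to the permutation of Example 2, `robinson_schensted([6, 2, 3, 1, 7, 5, 4])` returns $P$ with rows 1 3 4 / 2 5 / 6 7 and $Q$ with rows 1 3 5 / 2 6 / 4 7, both of shape (3, 2, 2).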
INTRODUCTION


Sequences of integers occur regularly in combinatorial applications. For example, the
solution to a counting problem that depends on a parameter $k$ can be viewed as the $k$th
term of a sequence. This chapter provides a guide to particular sequences that arise in
applied settings. Such (infinite) sequences can often be represented in a finite
form. Specifically, sequences can be expressed using generating functions, recurrence
relations, or an explicit formula for the $k$th term of the sequence.


GLOSSARY 

antidifference (of a function $f$): any function $g$ such that $\Delta g = f$. It is the discrete
analogue of antidifferentiation.

ascent (in a permutation $\pi$): any index $i$ such that $\pi_i < \pi_{i+1}$.

asymptotic equality (of functions): the function $f(n)$ is asymptotic to $g(n)$, written
$f(n) \sim g(n)$, if $f(n) \ne 0$ for sufficiently large $n$ and $\lim_{n \to \infty} g(n)/f(n) = 1$.

Bernoulli numbers: the numbers $B_n$ produced by the recursive definition $B_0 = 1$,
$\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$ for $n \ge 1$.

Bernoulli polynomials: the polynomial $B_m(x) = \sum_{k=0}^{m} \binom{m}{k} B_k x^{m-k}$ where $B_k$ is the
$k$th Bernoulli number.

big-oh (of the function $f$): the set of all functions that do not grow faster than some
constant multiple of $f$, written $O(f(n))$.

big omega (of the function $f$): the set of all functions that grow at least as fast as
some constant multiple of $f$, written $\Omega(f(n))$.

big theta (of the function $f$): the set of all functions that grow roughly as fast as some
constant multiple of $f$, written $\Theta(f(n))$.

binomial convolution (of the sequences $\{a_n\}$ and $\{b_n\}$): the sequence whose $r$th
term is formed by summing products of the form $\binom{r}{k} a_k b_{r-k}$.

Catalan number: the number $C_n = \frac{1}{n+1} \binom{2n}{n}$.

characteristic equation: an equation derived from a linear recurrence relation with 
constant coefficients, whose roots are used to construct solutions to the recurrence 
relation. 

closed form (for a sum) : an algebraic expression for the value of a sum with variable 
limits, which has a fixed number of terms; hence the time needed to calculate it does 
not grow with the size of the set or interval of summation. 

convolution (of the sequences $\{a_n\}$ and $\{b_n\}$): the sequence whose $r$th term is formed
by summing products of the form $a_k b_{r-k}$ where $0 \le k \le r$.

deBruijn sequence: a circular ordering of letters from a fixed alphabet with $p$ letters
such that each $n$ consecutive letters (wrapping around from the end of the sequence
to the beginning, if necessary) forms a different word.

difference operator: the operator $\Delta$ where $\Delta f(x) = f(x + 1) - f(x)$ on integer or
real-valued functions. It is the discrete analogue of the differentiation operator.

difference sequence (for the sequence $A = \{ a_j \mid j = 0, 1, \ldots \}$): the sequence
$\Delta A = \{ a_{j+1} - a_j \mid j = 0, 1, \ldots \}$.

difference table (for a function $f$): a table whose $k$th row is the $k$th difference
sequence for $f$.

discordant permutation : a permutation that assigns to every element an image dif- 
ferent from those assigned by all other members of a given set of permutations. 

dissimilar hypergeometric terms: terms in two hypergeometric series such that 
their ratio is not a rational function. 

divide-and-conquer algorithm: a recursive procedure that solves a given problem by 
first breaking it into smaller subproblems (of nearly equal size) and then combining 
their respective solutions. 

doubly hypergeometric: property of a function $F(n, k)$ that $F(n+1, k)/F(n, k)$ and
$F(n, k+1)/F(n, k)$ are rational functions of $n$ and $k$.

Eulerian number: the number of permutations of $\{1, 2, \ldots, n\}$ with exactly $k$ ascents.

excedance (of a permutation $\pi$): any index $i$ such that $\pi_i > i$.

exponential generating function (for the sequence $a_0, a_1, a_2, \ldots$): the function
$f(x) = a_0 + a_1 x + a_2 \frac{x^2}{2!} + \cdots$ or any equivalent closed form expression.

falling power (of $x$): the product $x^{\underline{n}} = x(x - 1)(x - 2) \cdots (x - n + 1)$ of $n$ successive
descending factors, starting with $x$; the discrete analogue of exponentiation.

Fibonacci numbers: the numbers $F_n$ produced by the recursive definition $F_0 = 0$,
$F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$ if $n \ge 2$.

figurate number: the number of cells in an array of cells bounded by some regular 
geometrical figure. 

first-order linear recurrence relation with constant coefficients: an equation
of the form $C_{n+1} a_{n+1} + C_n a_n = f(n)$, $n \ge 0$, with $C_{n+1}$, $C_n$ nonzero real constants.

generating function (for the sequence $a_0, a_1, a_2, \ldots$): the function
$f(x) = a_0 + a_1 x + a_2 x^2 + \cdots$ or any equivalent closed form expression; sometimes called the ordinary
generating function for the sequence.

geometric series: an infinite series where the ratio between two consecutive terms is 
a constant. 

Gray code (of size n): a circular ordering of all binary strings of length n in which 
adjacent strings differ in exactly one bit. 

harmonic number: the sum $H_n = \sum_{i=1}^{n} \frac{1}{i}$, which is the discrete analogue of the
natural logarithm.

homogeneous recurrence relation: a recurrence relation satisfied by the identically
zero sequence.

hypergeometric series: a series where the ratio of two consecutive terms is a
rational function.

indefinite sum (of the function $f$): the family of all antidifferences of $f$.

Lah coefficients: the coefficients resulting from expressing the rising factorial in terms
of the falling factorials.

linear recurrence relation with constant coefficients: an equation of the form
$C_{n+k} a_{n+k} + C_{n+k-1} a_{n+k-1} + \cdots + C_n a_n = f(n)$, $n \ge 0$, where the $C_{n+i}$ are real constants
with $C_{n+k}$ and $C_n$ nonzero.

little-oh (of the function $f$): the set of all functions that grow slower than every
constant multiple of $f$, written $o(f(n))$.

little omega (of the function $f$): the set of all functions that grow faster than every
constant multiple of $f$, written $\omega(f(n))$.

Lucas numbers: the numbers $L_n$ produced by the recursive definition $L_0 = 2$, $L_1 = 1$,
$L_n = L_{n-1} + L_{n-2}$ if $n \ge 2$.

nonhomogeneous recurrence relation: a recurrence relation that is not homoge- 
neous. 

polyomino: a connected configuration of regular polygons (for example, triangles, 
squares, or hexagons) in the plane, generalizing a domino. 

power sum: the sum of the $k$th powers of the integers $1, 2, \ldots, n$.

radius of convergence (for the series $\sum a_n x^n$): the number $r$ ($0 \le r \le \infty$) such that
the series converges for all $|x| < r$ and diverges for all $|x| > r$.

Ramsey number: the number $R(m, n)$ defined as the smallest positive integer $k$ with
the following property: if $S$ is a set of size $k$ and the 2-element subsets of $S$ are
partitioned into 2 collections, $C_1$ and $C_2$, then there is a subset of $S$ of size $m$ such
that each of its 2-element subsets belongs to $C_1$ or there is a subset of $S$ of size $n$
such that each of its 2-element subsets belongs to $C_2$.

recurrence relation: an equation expressing a term of a sequence as a function of
prior terms in the sequence.

rising power (of $x$): the product $x^{\overline{n}} = x(x + 1)(x + 2) \cdots (x + n - 1)$ of $n$ successive
ascending terms, starting with $x$.

second-order linear recurrence relation with constant coefficients: an equation
of the form $C_{n+2} a_{n+2} + C_{n+1} a_{n+1} + C_n a_n = f(n)$, $n \ge 0$, where $C_{n+2}$, $C_{n+1}$, $C_n$
are real constants with $C_{n+2}$ and $C_n$ nonzero.

sequence: a function from $\{0, 1, 2, \ldots\}$ to the real numbers (often the integers).

shift operator: the operator $E$ defined by $Ef(x) = f(x + 1)$ on integer or real-valued
functions.

similar hypergeometric terms: terms in two hypergeometric series such that their
ratio is a rational function.

standardized form for a sum: a sum over an integer interval, in which the lower limit
of the summation is zero.

Stirling's approximation formula: the asymptotic estimate $\sqrt{2 \pi n}\,(n/e)^n$ for $n!$.

tangent numbers: numbers generated by the exponential generating function $\tan x$.


3.1 SPECIAL SEQUENCES 


3.1.1 REPRESENTATIONS OF SEQUENCES

A given infinite sequence $a_0, a_1, a_2, \ldots$ can often be represented in a more useful or more
compact form. Namely, there may be a closed form expression for $a_n$ as a function of $n$,
the terms of the sequence may appear as coefficients in a simple generating function,
or the sequence may be specified by a recurrence relation. Each representation has
advantages, in either defining the sequence or establishing information about its terms.
Definitions:


A sequence $\{ a_n \mid n \ge 0 \}$ is a function from the set of nonnegative integers to the real
numbers (often the integers). The terms of the sequence $\{ a_n \mid n \ge 0 \}$ are the values
$a_0, a_1, a_2, \ldots$.

A closed form for the sequence $\{a_n\}$ is an algebraic expression for $a_n$ as a function
of $n$.

A recurrence relation is an equation expressing a term of a sequence as a function of
prior terms in the sequence.

A solution of a recurrence relation is a sequence whose terms satisfy the relation.

The generating function for the sequence $\{a_n\}$ is the function $f(x) = \sum_{i=0}^{\infty} a_i x^i$ or any
equivalent closed form expression.

The exponential generating function for the sequence $\{a_n\}$ is the function $g(x) = \sum_{i=0}^{\infty} a_i \frac{x^i}{i!}$
or any equivalent closed form expression.

Facts: 

1. An important way in which many sequences are represented is by using a recurrence 
relation (§3.3). Although not all sequences can be represented by useful recurrence 
relations, many sequences that arise in the solution of counting problems can be so 
represented. 

2. An important way to study a sequence is by using its generating function (§3.2). 
Information about terms of the sequence can often be obtained by manipulating the 
generating function. 


Examples: 

1. The Fibonacci numbers $F_n$ (§3.1.2) arise in many applications and are given by the
sequence 0, 1, 1, 2, 3, 5, 8, 13, ... . This infinite sequence can be finitely encoded by means
of the recurrence relation

$F_n = F_{n-1} + F_{n-2}$, $n \ge 2$, with $F_0 = 0$ and $F_1 = 1$.

Alternatively, a closed form expression for this sequence is given by

$F_n = \frac{1}{\sqrt{5}} \left[ \left( \frac{1 + \sqrt{5}}{2} \right)^n - \left( \frac{1 - \sqrt{5}}{2} \right)^n \right]$.

The Fibonacci numbers can be represented in a third way, via the generating function
$f(x) = \frac{x}{1 - x - x^2}$. Namely, when this rational function is expanded in powers of $x$, the
resulting coefficients generate the sequence values $F_n$:

$\frac{x}{1 - x - x^2} = 0x^0 + 1x^1 + 1x^2 + 2x^3 + 3x^4 + 5x^5 + 8x^6 + 13x^7 + \cdots$.

2. Table 1 gives closed form expressions for the generating functions of several com- 
binatorial sequences discussed in this Handbook. In this table, r is any real number. 
Generating functions for other sequences can be found in §3.2.1, Tables 1 and 2. 

3. Table 2 gives closed form expressions for the exponential generating functions of 
several combinatorial sequences discussed in this Handbook. Generating functions for 
other sequences can be found in §3.2.2, Table 3. 

4. Table 3 gives recurrence relations defining particular combinatorial sequences dis- 
cussed in this Handbook. 
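The expansion of a rational generating function, as in Example 1, can be carried out mechanically. The following sketch is illustrative only: polynomials are assumed to be given as coefficient lists in increasing powers of $x$, the denominator is assumed to have constant term 1, and the function name is hypothetical.

```python
def series_coefficients(numerator, denominator, terms):
    """First `terms` power-series coefficients of numerator(x) / denominator(x)."""
    if denominator[0] != 1:
        raise ValueError("this sketch assumes the denominator has constant term 1")
    coeffs = []
    for n in range(terms):
        # compare coefficients of x^n in (series) * denominator = numerator
        c = numerator[n] if n < len(numerator) else 0
        for k in range(1, min(n, len(denominator) - 1) + 1):
            c -= denominator[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs
```

For instance, `series_coefficients([0, 1], [1, -1, -1], 8)` expands $x/(1 - x - x^2)$ and returns `[0, 1, 1, 2, 3, 5, 8, 13]`, while `series_coefficients([2, -1], [1, -1, -1], 6)` returns the Lucas numbers `[2, 1, 3, 4, 7, 11]`.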
Table 1 Generating functions for particular sequences.

sequence                              notation          reference   closed form
1, 2, 3, 4, 5, ...                    $\{n\}$                       $\frac{1}{(1-x)^2}$
$1^2, 2^2, 3^2, 4^2, 5^2, \ldots$     $\{n^2\}$                     $\frac{1+x}{(1-x)^3}$
$1^3, 2^3, 3^3, 4^3, 5^3, \ldots$     $\{n^3\}$                     $\frac{1+4x+x^2}{(1-x)^4}$
$1, r, r^2, r^3, r^4, \ldots$         $\{r^n\}$                     $\frac{1}{1-rx}$
Fibonacci                             $F_n$             §3.1.2      $\frac{x}{1-x-x^2}$
Lucas                                 $L_n$             §3.1.2      $\frac{2-x}{1-x-x^2}$
Catalan                               $C_n$             §3.1.3      $\frac{1-\sqrt{1-4x}}{2x}$
Harmonic                              $H_n$             §3.1.7      $\frac{1}{1-x} \ln \frac{1}{1-x}$
Binomial                              $\binom{m}{n}$    §2.3.2      $(1+x)^m$



Table 2 Exponential generating functions for particular sequences.

sequence                          notation          reference   closed form
1, 1, 1, 1, 1, ...                $\{1\}$                       $e^x$
$1, r, r^2, r^3, r^4, \ldots$     $\{r^n\}$                     $e^{rx}$
Derangements                      $D_n$             §2.4.2      $\frac{e^{-x}}{1-x}$
Bernoulli                         $B_n$             §3.1.4      $\frac{x}{e^x - 1}$
Tangent                           $T_n$             §3.1.7      $\tan x$
Euler                             $E_n$             §3.1.7      $\text{sech}\, x$
Euler                             $|E_n|$           §3.1.7      $\sec x$
Stirling cycle number             ${n \brack k}$    §2.5.2      $\frac{1}{k!} \left[ \ln \frac{1}{1-x} \right]^k$
Stirling subset number            ${n \brace k}$    §2.5.2      $\frac{1}{k!} \left[ e^x - 1 \right]^k$




3.1.2 FIBONACCI NUMBERS 

Fibonacci numbers form an important sequence encountered in biology, physics, number 
theory, computer science, and combinatorics. [BePhHo88] , [PhBeHo86] , [Va89] 

Definitions: 

The Fibonacci numbers $F_0, F_1, F_2, \ldots$ are produced by the recursive definition $F_0 = 0$,
$F_1 = 1$, $F_n = F_{n-1} + F_{n-2}$, $n \ge 2$.

A generalized Fibonacci sequence is any sequence $G_0, G_1, G_2, \ldots$ such that
$G_n = G_{n-1} + G_{n-2}$ for $n \ge 2$.

The Lucas numbers $L_0, L_1, L_2, \ldots$ are produced by the recursive definition $L_0 = 2$,
$L_1 = 1$, $L_n = L_{n-1} + L_{n-2}$, $n \ge 2$. (François Lucas, 1842-1891)
Table 3 Recurrence relations for particular sequences.

sequence                  notation          reference   recurrence relation
Derangements              $D_n$             §2.4.2      $D_n = (n-1)(D_{n-1} + D_{n-2})$, $D_0 = 1$, $D_1 = 0$
Fibonacci                 $F_n$             §3.1.2      $F_n = F_{n-1} + F_{n-2}$, $F_0 = 0$, $F_1 = 1$
Lucas                     $L_n$             §3.1.2      $L_n = L_{n-1} + L_{n-2}$, $L_0 = 2$, $L_1 = 1$
Catalan                   $C_n$             §3.1.3      $C_n = C_0 C_{n-1} + C_1 C_{n-2} + \cdots + C_{n-1} C_0$, $C_0 = 1$
Bernoulli                 $B_n$             §3.1.4      $\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$, $B_0 = 1$
Eulerian                  $E(n, k)$         §3.1.5      $E(n, k) = (k+1) E(n-1, k) + (n-k) E(n-1, k-1)$, $E(n, 0) = 1$, $n \ge 1$
Binomial                  $\binom{n}{k}$    §2.3.2      $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$, $\binom{n}{0} = 1$, $n \ge 0$
Stirling cycle number     ${n \brack k}$    §2.5.2      ${n \brack k} = (n-1) {n-1 \brack k} + {n-1 \brack k-1}$, ${0 \brack 0} = 1$; ${n \brack 0} = 0$, $n \ge 1$
Stirling subset number    ${n \brace k}$    §2.5.2      ${n \brace k} = k {n-1 \brace k} + {n-1 \brace k-1}$, ${0 \brace 0} = 1$; ${n \brace 0} = 0$, $n \ge 1$

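Several rows of Table 3 translate directly into memoized recursive functions. The sketch below is illustrative: the function names are hypothetical, and terms outside the natural index range (for example $E(n, k)$ with $k \ge n$) are assumed to be 0.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def derangements(n):
    """D_n = (n - 1)(D_{n-1} + D_{n-2}), D_0 = 1, D_1 = 0."""
    if n == 0:
        return 1
    if n == 1:
        return 0
    return (n - 1) * (derangements(n - 1) + derangements(n - 2))


@lru_cache(maxsize=None)
def eulerian(n, k):
    """E(n, k): permutations of {1, ..., n} with exactly k ascents (n >= 1)."""
    if k < 0 or k >= n:
        return 0
    if k == 0:
        return 1
    return (k + 1) * eulerian(n - 1, k) + (n - k) * eulerian(n - 1, k - 1)


@lru_cache(maxsize=None)
def stirling_cycle(n, k):
    """Stirling cycle number: [n, k] = (n - 1)[n - 1, k] + [n - 1, k - 1]."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0:
        return 0
    return (n - 1) * stirling_cycle(n - 1, k) + stirling_cycle(n - 1, k - 1)


@lru_cache(maxsize=None)
def stirling_subset(n, k):
    """Stirling subset number: {n, k} = k{n - 1, k} + {n - 1, k - 1}."""
    if n == 0:
        return 1 if k == 0 else 0
    if k <= 0:
        return 0
    return k * stirling_subset(n - 1, k) + stirling_subset(n - 1, k - 1)
```

As a sanity check, the Eulerian numbers and the Stirling cycle numbers with first argument $n$ each sum to $n!$; for example `[eulerian(4, k) for k in range(4)]` is `[1, 11, 11, 1]`.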


Facts: 

1. The Fibonacci numbers F n and Lucas numbers L n for n = 0, 1, 2, . . . , 50 are shown 
in Table 4. 

2. The Fibonacci numbers were initially studied by Leonardo of Pisa (c. 1170-1250), 
who was the son of Bonaccio; consequently these numbers have been called Fibonacci 
numbers after Leonardo, the son of Bonaccio (Filius Bonaccii). 

3. $\lim_{n \to \infty} \frac{F_{n+1}}{F_n} = \lim_{n \to \infty} \frac{L_{n+1}}{L_n} = \frac{1}{2}(1 + \sqrt{5}) \approx 1.61803$, the golden ratio.

4. Fibonacci numbers arise in numerous applications in many different areas. For 
example, they occur in models of population growth of rabbits (Example 3), in modeling 
plant growth (Example 8), in counting the number of bit strings of length n without 
consecutive 0s (Example 13), in counting the number of spanning trees of wheel graphs 
of length n (Example 12), and in a vast number of other contexts. See [Va89] or other 
books concerning the Fibonacci numbers. There is a journal, the Fibonacci Quarterly, 
devoted to the study of the Fibonacci numbers and related topics. This is a tribute 
to how widely the Fibonacci numbers arise in mathematics and its applications to 
other areas. There are also a large number of books, available through the Fibonacci 
Association, devoted to the Fibonacci numbers and their use. This list can be found on 
the World Wide Web at 

www.sdstate.edu/~wcsc/http/fibbooks.html
Table 4 Fibonacci and Lucas numbers.

 n  $F_n$  $L_n$     n      $F_n$      $L_n$     n           $F_n$           $L_n$
 0      0      2    17      1,597      3,571    34       5,702,887      12,752,043
 1      1      1    18      2,584      5,778    35       9,227,465      20,633,239
 2      1      3    19      4,181      9,349    36      14,930,352      33,385,282
 3      2      4    20      6,765     15,127    37      24,157,817      54,018,521
 4      3      7    21     10,946     24,476    38      39,088,169      87,403,803
 5      5     11    22     17,711     39,603    39      63,245,986     141,422,324
 6      8     18    23     28,657     64,079    40     102,334,155     228,826,127
 7     13     29    24     46,368    103,682    41     165,580,141     370,248,451
 8     21     47    25     75,025    167,761    42     267,914,296     599,074,578
 9     34     76    26    121,393    271,443    43     433,494,437     969,323,029
10     55    123    27    196,418    439,204    44     701,408,733   1,568,397,607
11     89    199    28    317,811    710,647    45   1,134,903,170   2,537,720,636
12    144    322    29    514,229  1,149,851    46   1,836,311,903   4,106,118,243
13    233    521    30    832,040  1,860,498    47   2,971,215,073   6,643,838,879
14    377    843    31  1,346,269  3,010,349    48   4,807,526,976  10,749,957,122
15    610  1,364    32  2,178,309  4,870,847    49   7,778,742,049  17,393,796,001
16    987  2,207    33  3,524,578  7,881,196    50  12,586,269,025  28,143,753,123



5. Many properties of the Fibonacci numbers were derived by F. Lucas, who also is 
responsible for naming them the “Fibonacci” numbers. 

6. Binet form (Jacques Binet, 1786-1856): If $\alpha = \frac{1}{2}(1 + \sqrt{5})$ and $\beta = \frac{1}{2}(1 - \sqrt{5})$ then

$F_n = \frac{\alpha^n - \beta^n}{\sqrt{5}} = \frac{\alpha^n - \beta^n}{\alpha - \beta}$,    $F_n \sim \frac{\alpha^n}{\sqrt{5}}$.

Also,

$L_n = \alpha^n + \beta^n$,    $L_n \sim \alpha^n$.

7. $F_n = \frac{1}{2}(F_{n-2} + F_{n+1})$ for all $n \ge 2$. That is, each Fibonacci number is the average
of the terms occurring two places before and one place after it in the sequence.

8. $L_n = \frac{1}{2}(L_{n-2} + L_{n+1})$ for all $n \ge 2$. That is, each Lucas number is the average of
the terms occurring two places before and one place after it in the sequence.

9. $F_0 + F_1 + F_2 + \cdots + F_n = F_{n+2} - 1$ for all $n \ge 0$.

10. $F_0 - F_1 + F_2 - \cdots + (-1)^n F_n = (-1)^n F_{n-1} - 1$ for all $n \ge 1$.

11. $F_1 + F_3 + F_5 + \cdots + F_{2n-1} = F_{2n}$ for all $n \ge 1$.

12. $F_0 + F_2 + F_4 + \cdots + F_{2n} = F_{2n+1} - 1$ for all $n \ge 0$.

13. $F_0^2 + F_1^2 + F_2^2 + \cdots + F_n^2 = F_n F_{n+1}$ for all $n \ge 0$.

14. $F_1 F_2 + F_2 F_3 + F_3 F_4 + \cdots + F_{2n-1} F_{2n} = F_{2n}^2$ for all $n \ge 1$.

15. $F_1 F_2 + F_2 F_3 + F_3 F_4 + \cdots + F_{2n} F_{2n+1} = F_{2n+1}^2 - 1$ for all $n \ge 1$.

16. If $k \ge 1$ then $F_{n+k} = F_k F_{n+1} + F_{k-1} F_n$ for all $n \ge 0$.

17. Cassini's Identity: $F_{n+1} F_{n-1} - F_n^2 = (-1)^n$ for all $n \ge 1$. (Jean Dominique Cassini,
1625-1712)

18. $F_{n+1}^2 + F_n^2 = F_{2n+1}$ for all $n \ge 0$.

19. $F_{n+2}^2 - F_{n+1}^2 = F_n F_{n+3}$ for all $n \ge 0$.

20. $F_{n+2}^2 - F_n^2 = F_{2n+2}$ for all $n \ge 0$.

21. $F_{n+2}^3 + F_{n+1}^3 - F_n^3 = F_{3n+3}$ for all $n \ge 0$.


© 2000 by CRC Press LLC 




22. $\gcd(F_n, F_m) = F_{\gcd(n,m)}$. This implies that $F_n$ and $F_{n+1}$ are relatively prime, and
that $F_k$ divides $F_{nk}$.

23. Fibonacci numbers arise as sums of diagonals in Pascal's triangle (§2.3.2):

$F_{n+1} = \sum_{j=0}^{\lfloor n/2 \rfloor} \binom{n-j}{j}$ for all $n \ge 0$.

24. $F_{3n} = \sum_{j=0}^{n} \binom{n}{j} 2^j F_j$ for all $n \ge 0$.

25. The Fibonacci sequence has the generating function $\frac{x}{1 - x - x^2}$. (See §3.1.1.)

26. Fibonacci numbers with negative indices can be defined using the recursive definition
$F_{n-2} = F_n - F_{n-1}$. Then $F_{-n} = (-1)^{n-1} F_n$, $n \ge 1$.

27. The units digits of the Fibonacci numbers form a sequence that repeats after 60
terms. (Joseph Lagrange, 1736-1813)

28. The number of binary strings of length $n$ that contain no consecutive 0s is counted
by $F_{n+2}$. (See §3.3.2, Example 12.)

29. $L_0 + L_1 + L_2 + \cdots + L_n = L_{n+2} - 1$ for all $n \ge 0$.

30. $L_0^2 + L_1^2 + L_2^2 + \cdots + L_n^2 = L_n L_{n+1} + 2$ for all $n \ge 0$.

31. $L_n = F_{n-1} + F_{n+1}$, $n \ge 1$. Hence, any formula containing Lucas numbers can be
translated into a formula involving Fibonacci numbers.

32. The Lucas sequence $L_0, L_1, L_2, \ldots$ has generating function $\frac{2 - x}{1 - x - x^2}$. (See §3.1.1.)

33. Lucas numbers with negative indices can be defined by extending the recursive
definition. Then $L_{-n} = (-1)^n L_n$ for all $n \ge 1$.

34. $F_n = \frac{L_{n-1} + L_{n+1}}{5}$, $n \ge 1$. Hence, any formula involving Fibonacci numbers can be
translated into a formula involving Lucas numbers.

35. If $G_0, G_1, \ldots$ is a sequence of generalized Fibonacci numbers, then $G_n = F_{n-1} G_0 + F_n G_1$
for all $n \ge 1$.
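A selection of the identities above can be spot-checked by machine. The following sketch is illustrative, with hypothetical function names; it verifies the identities only up to a chosen limit and is of course no substitute for a proof.

```python
from math import comb, gcd


def fib(n):
    """F_n by iterating the recurrence from F_0 = 0, F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def lucas(n):
    """L_n by iterating the recurrence from L_0 = 2, L_1 = 1."""
    a, b = 2, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def check_identities(limit=25):
    """Assert Facts 9, 13, 17, 18, 21-24, 27, 31, and 34 for 1 <= n < limit."""
    for n in range(1, limit):
        F = [fib(i) for i in range(3 * n + 4)]
        assert sum(F[: n + 1]) == F[n + 2] - 1                                # Fact 9
        assert sum(x * x for x in F[: n + 1]) == F[n] * F[n + 1]              # Fact 13
        assert F[n + 1] * F[n - 1] - F[n] ** 2 == (-1) ** n                   # Fact 17
        assert F[n + 1] ** 2 + F[n] ** 2 == F[2 * n + 1]                      # Fact 18
        assert F[n + 2] ** 3 + F[n + 1] ** 3 - F[n] ** 3 == F[3 * n + 3]      # Fact 21
        assert sum(comb(n - j, j) for j in range(n // 2 + 1)) == F[n + 1]     # Fact 23
        assert sum(comb(n, j) * 2 ** j * F[j] for j in range(n + 1)) == F[3 * n]  # Fact 24
        assert fib(n) % 10 == fib(n + 60) % 10                                # Fact 27
        assert lucas(n) == F[n - 1] + F[n + 1]                                # Fact 31
        assert 5 * F[n] == lucas(n - 1) + lucas(n + 1)                        # Fact 34
        for m in range(1, limit):
            assert gcd(F[n], fib(m)) == fib(gcd(n, m))                        # Fact 22
    return True
```

Calling `check_identities()` returns `True` when every listed identity holds on the tested range and raises `AssertionError` otherwise.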

Examples: 

1. The Fibonacci number $F_8$ can be computed, using the initial values $F_0 = 0$ and
$F_1 = 1$ and the recurrence relation $F_n = F_{n-1} + F_{n-2}$ repeatedly: $F_2 = F_1 + F_0 = 1 + 0 = 1$,
$F_3 = F_2 + F_1 = 1 + 1 = 2$, $F_4 = F_3 + F_2 = 2 + 1 = 3$, $F_5 = F_4 + F_3 = 3 + 2 = 5$,
$F_6 = F_5 + F_4 = 5 + 3 = 8$, $F_7 = F_6 + F_5 = 8 + 5 = 13$, $F_8 = F_7 + F_6 = 13 + 8 = 21$.

2. Each male bee (drone) is produced asexually from a female, whereas each female
bee is produced from both a male and female. The ancestral tree for a single male bee
is shown below. This male has one parent, two grandparents, three great grandparents,
and in general $F_{k+2}$ $k$th-order grandparents, $k \ge 0$.


[Figure: ancestral tree of a single male bee; M denotes a male and F a female.]

3. Rabbit breeding: This problem was originally posed by Fibonacci. A single pair
of immature rabbits is introduced into a habitat. It takes two months before a pair of
rabbits can breed; each month thereafter each pair of breeding rabbits produces another
pair. At the start of months 1 and 2, only the original pair A is present. In the third
month, A as well as their newly born pair B are present; in the fourth month, A, B
as well as the new pair C (progeny of A) are present; in the fifth month, A, B, C as
well as the new pairs D (progeny of A) and E (progeny of B) are present. If $P_n$ is the
number of pairs present in month $n$, then $P_1 = 1$, $P_2 = 1$, $P_3 = 2$, $P_4 = 3$, $P_5 = 5$. In
general, $P_n$ equals the number present in the previous month $P_{n-1}$ plus the number of
breeding pairs in the previous month (which is $P_{n-2}$, the number present two months
earlier). Thus $P_n = F_n$ for $n \ge 1$.

4. Let $S_n$ denote the number of subsets of $\{1, 2, \ldots, n\}$ that do not contain consecutive
elements. For example, when $n = 3$ the allowable subsets are $\emptyset$, {1}, {2}, {3}, {1, 3}.
Therefore, $S_3 = 5$. In general, $S_n = F_{n+2}$ for $n \ge 1$.

5. Draw $n$ dots in a line. If each domino can cover exactly two such dots, in how
many ways can (nonoverlapping) dominoes be placed? The following figure shows the
possible solutions for $n = 2, 3, 4$. To find a general expression for $D_n$, the
number of possible placements of dominoes with $n$ dots, consider the rightmost dot in
any such placement $P$. If this dot is not covered by a domino, then $P$ minus the last
dot determines a solution counted by $D_{n-1}$. If the last dot is covered by a domino, then
the last two dots in $P$ are covered by this domino. Removing this rightmost domino
then gives a solution counted by $D_{n-2}$. Taking into account these two possibilities,
$D_n = D_{n-1} + D_{n-2}$ for $n \ge 3$ with $D_1 = 1$, $D_2 = 2$. Thus $D_n = F_{n+1}$ for $n \ge 1$.

n = 2:   • •       [• •]
n = 3:   • • •     [• •] •     • [• •]
n = 4:   • • • •   [• •] • •   • [• •] •   • • [• •]   [• •] [• •]

(Here [• •] denotes two adjacent dots covered by a domino.)

6. Compositions: Let $T_n$ be the number of ordered compositions (§2.5.1) of the positive
integer $n$ into summands that are odd. For example, 4 = 1 + 3 = 3 + 1 = 1 + 1 + 1 + 1
and 5 = 5 = 1 + 1 + 3 = 1 + 3 + 1 = 3 + 1 + 1 = 1 + 1 + 1 + 1 + 1. Therefore, $T_4 = 3$
and $T_5 = 5$. In general, $T_n = F_n$ for $n \ge 1$.

7. Compositions: Let $B_n$ be the number of ordered compositions (§2.5.1) of the positive
integer $n$ into summands that are either 1 or 2. For example, 3 = 1 + 2 = 2 + 1 = 1 + 1 + 1
and 4 = 2 + 2 = 1 + 1 + 2 = 1 + 2 + 1 = 2 + 1 + 1 = 1 + 1 + 1 + 1. Therefore, $B_3 = 3$
and $B_4 = 5$. In general, $B_n = F_{n+1}$ for $n \ge 1$.

8. Botany: It has been observed in pine cones (and other botanical structures) that 
the number of rows of scales winding in one direction is a Fibonacci number while the 
number of rows of scales winding in the other direction is an adjacent Fibonacci number. 

9. Continued fractions: The continued fraction $1 + \frac{1}{1} = \frac{2}{1}$, the continued fraction
$1 + \frac{1}{1 + \frac{1}{1}} = \frac{3}{2}$, and the continued fraction $1 + \frac{1}{1 + \frac{1}{1 + \frac{1}{1}}} = \frac{5}{3}$. In general, a continued fraction
composed entirely of 1s equals the ratio of successive Fibonacci numbers.

10. Independent sets on a path: Consider a path graph on vertices $1, 2, \ldots, n$, with
edges joining vertices $i$ and $i + 1$ for $i = 1, 2, \ldots, n - 1$. An independent set of vertices
(§8.6.3) consists of vertices no two of which are joined by an edge. By an analysis similar
to that in Example 5, the number of independent sets in a path graph on $n$ vertices
equals $F_{n+2}$.
11. Independent sets on a cycle: Consider a cycle graph on vertices $1, 2, \ldots, n$, with
edges joining vertices $i$ and $i + 1$ for $i = 1, 2, \ldots, n - 1$ as well as vertices $n$ and 1. Then
the number of independent sets (§8.6.3) in a cycle graph on $n$ vertices equals $L_n$.

12. Spanning trees: The number of spanning trees of the wheel graph $W_n$ (§8.1.3)
equals $L_{2n} - 2$.

13. If $A$ is the $2 \times 2$ matrix $\begin{pmatrix} 1 & 1 \\ 1 & 0 \end{pmatrix}$, then $A^n = \begin{pmatrix} F_{n+1} & F_n \\ F_n & F_{n-1} \end{pmatrix}$ for $n \ge 1$.
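Several of the examples above can be confirmed for small $n$. The sketch below is illustrative, with hypothetical function names: it uses the matrix of Example 13 with repeated squaring, a brute-force count for Example 4, and a small dynamic program for the compositions of Examples 6 and 7.

```python
from itertools import combinations


def mat_mul(X, Y):
    """Product of two 2 x 2 integer matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def fib_matrix(n):
    """F_n read off the n-th power of [[1, 1], [1, 0]], by repeated squaring."""
    result = [[1, 0], [0, 1]]
    base = [[1, 1], [1, 0]]
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result[0][1]


def count_sparse_subsets(n):
    """Subsets of {1, ..., n} with no two consecutive elements (Example 4)."""
    count = 0
    for size in range(n + 1):
        for subset in combinations(range(1, n + 1), size):
            if all(b - a > 1 for a, b in zip(subset, subset[1:])):
                count += 1
    return count


def count_compositions(n, allowed):
    """Ordered compositions of n whose parts all satisfy allowed(part)."""
    ways = [1] + [0] * n
    for total in range(1, n + 1):
        ways[total] = sum(ways[total - part]
                          for part in range(1, total + 1) if allowed(part))
    return ways[n]
```

For example, `count_sparse_subsets(3)` is 5 = $F_5$, `count_compositions(5, lambda p: p % 2 == 1)` is 5 = $F_5$, and `count_compositions(4, lambda p: p <= 2)` is 5 = $F_5$.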


3.1.3 CATALAN NUMBERS 

The sequence of integers called the Catalan numbers arises in counting a variety of 
combinatorial structures, such as voting sequences, certain types of binary trees, paths 
in the plane, and triangulations of polygons. 

Definitions: 

The Catalan numbers $C_0, C_1, C_2, \ldots$ satisfy the nonlinear recurrence relation
$C_n = C_0 C_{n-1} + C_1 C_{n-2} + \cdots + C_{n-1} C_0$, $n \ge 1$, with $C_0 = 1$. (See §3.3.1, Example 9.) (Eugène
Catalan, 1814-1894)

Well-formed (or balanced) sequences of parentheses of length $2n$ are defined
recursively as follows: the empty sequence is well-formed; if sequence $A$ is well-formed
so is $(A)$; if sequences $A$ and $B$ are well-formed so is $AB$.

Well-parenthesized products of variables are defined recursively as follows: single
variables are well-parenthesized; if $A$ and $B$ are well-parenthesized so is $(AB)$.


Facts: 

1. The first 12 Catalan numbers $C_n$ are given in the following table.

n      0  1  2  3  4   5   6    7    8      9      10      11
C_n    1  1  2  5  14  42  132  429  1,430  4,862  16,796  58,786


2. $\lim_{n \to \infty} C_{n+1}/C_n = 4$.

3. Catalan numbers arise in a variety of applications, such as when binary trees on $n$
vertices, triangulations of a convex $n$-gon, and well-formed sequences of $n$ left and $n$
right parentheses are counted. See the examples below as well as [MiRo91].

4. $C_n = \frac{1}{n+1} \binom{2n}{n}$ for all $n \ge 0$.

5. The Catalan numbers $C_0, C_1, C_2, \ldots$ have the generating function
$\sum_{n=0}^{\infty} C_n x^n = \frac{1 - \sqrt{1 - 4x}}{2x}$.

6. $C_n \sim \frac{4^n}{\sqrt{\pi n^3}}$.

7. $C_n = \binom{2n}{n} - \binom{2n}{n-1} = \binom{2n-1}{n} - \binom{2n-1}{n+1}$ for all $n \ge 1$.

8. $C_{n+1} = \frac{2(2n+1)}{n+2} C_n$ for all $n \ge 0$.
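
The recurrence in the definition and the closed forms in Facts 4, 7, and 8 can be compared numerically. The following is a minimal sketch, not a reference implementation; the function names are hypothetical, and the binomial formulas are the standard ones $C_n = \frac{1}{n+1}\binom{2n}{n}$ and $C_n = \binom{2n}{n} - \binom{2n}{n-1}$.

```python
from math import comb


def catalan_recurrence(n_max):
    """C_0..C_{n_max} from C_n = C_0 C_{n-1} + C_1 C_{n-2} + ... + C_{n-1} C_0, C_0 = 1."""
    c = [1]
    for n in range(1, n_max + 1):
        c.append(sum(c[i] * c[n - 1 - i] for i in range(n)))
    return c


def catalan_closed(n):
    """C_n = binom(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


def catalan_difference(n):
    """C_n = binom(2n, n) - binom(2n, n - 1), valid for n >= 1."""
    return comb(2 * n, n) - comb(2 * n, n - 1)
```

Calling `catalan_recurrence(11)` reproduces the twelve values tabulated in Fact 1, which gives a quick consistency check on the table.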

Examples: 

1. The number of binary trees (§9.1.2) on $n$ vertices is $C_n$.

2. The number of left-right binary trees (§9.3.3) on $2n + 1$ vertices is $C_n$.

3. The number of ordered trees (§9.1.2) on $n$ vertices is $C_{n-1}$.

4. Suppose that a coin is tossed $2n$ times, coming up heads exactly $n$ times and tails
exactly $n$ times. The number of sequences of tosses in which the cumulative number
of heads is always at least as large as the cumulative number of tails is $C_n$. For example,
when $n = 3$ there are $C_3 = 5$ such sequences of 6 tosses: HTHTHT, HTHHTT,
HHTTHT, HHTHTT, HHHTTT.

5. In Example 4, the number of sequences of tosses in which the cumulative number of
heads always exceeds the cumulative number of tails (until the very last toss) is $C_{n-1}$.
For example, when $n = 3$ there are $C_2 = 2$ such sequences of 6 tosses: HHTHTT,
HHHTTT.

6. Triangulations: Let $T_n$ be the number of triangulations of a convex $n$-gon, using
$n - 3$ nonintersecting diagonals. For instance, the following figure shows the $T_5 = 5$
triangulations of a pentagon. In general, $T_n = C_{n-2}$ for $n \ge 3$.



7. Suppose that $2n$ points are placed in fixed positions, evenly distributed on the
circumference of a circle. Then there are $C_n$ ways to join $n$ pairs of the points so that
the resulting chords do not intersect. The following figure shows the $C_3 = 5$ solutions
for $n = 3$.

8. Well-formed sequences of parentheses: The sequence of parentheses (()())
involving three left and three right parentheses is well-formed, whereas the sequence
())(() is not syntactically meaningful. There are five such well-formed sequences in
this case:

()()(), ()(()), (())(), (()()), ((())).

Notice that if each left parenthesis is replaced by an H and each right parenthesis by a T,
then these five balanced sequences correspond exactly to the five coin tossing sequences
listed in Example 4. In general, the number of balanced sequences involving $n$ left and $n$
right parentheses is $C_n$.

9. Consider the following procedure composed of $n$ nested for loops:

count := 0
for i_1 := 1 to 1
  for i_2 := 1 to i_1 + 1
    for i_3 := 1 to i_2 + 1
      ...
        for i_n := 1 to i_{n-1} + 1
          count := count + 1

Then the value of count upon exit from this procedure is $C_n$.

10. Well-parenthesized products: The product $x_1 x_2 x_3$ (relative to some binary
“multiplication” operation) can be evaluated as either $(x_1 x_2) x_3$ or $x_1 (x_2 x_3)$. In the former,
$x_1$ and $x_2$ are first combined and then the result is combined with $x_3$. In the latter,
$x_2$ and $x_3$ are first combined and then the result is combined with $x_1$. Let $P_n$ indicate
the number of ways to evaluate the product $x_1 x_2 \cdots x_n$ of $n$ variables, using a binary
operation. Note that $P_3 = 2$. In general, $P_n = C_{n-1}$. This was the problem originally
studied by Catalan. (See §3.3.1, Example 9.)

11. The numbers $1, 2, \ldots, 2n$ are to be placed in the $2n$ positions of a $2 \times n$ array
$A = (a_{ij})$. Such an arrangement is monotone if the values increase within each row and
within each column. Then there are $C_n$ ways to form a monotone $2 \times n$ array containing
the entries $1, 2, \ldots, 2n$. For instance, the following is one of the $C_4 = 14$ monotone $2 \times 4$
arrays:

$A = \begin{pmatrix} 1 & 3 & 5 & 6 \\ 2 & 4 & 7 & 8 \end{pmatrix}$.
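
Two of the examples above lend themselves to direct simulation: the nested-loop procedure of Example 9 and the well-formed parenthesis sequences of Example 8. The sketch below is one possible way to do this under stated assumptions: the variable depth of the loop nest is handled by recursion, and the generator only ever appends a right parenthesis when an unmatched left one is available. Function names are hypothetical.

```python
def nested_loop_count(n):
    """Final value of count for the n nested loops: i_1 in 1..1, i_k in 1..i_{k-1}+1."""
    def run(depth, prev):
        if depth == n:              # innermost body: count := count + 1
            return 1
        # loop variable at this depth ranges over 1 .. prev + 1
        return sum(run(depth + 1, i) for i in range(1, prev + 2))
    return run(0, 0)                # prev = 0 makes the outermost loop run "1 to 1"


def well_formed(n):
    """All well-formed sequences with n left and n right parentheses."""
    out = []

    def extend(prefix, opened, closed):
        if opened == n and closed == n:
            out.append(prefix)
            return
        if opened < n:
            extend(prefix + "(", opened + 1, closed)
        if closed < opened:         # never close more than has been opened
            extend(prefix + ")", opened, closed + 1)

    extend("", 0, 0)
    return out
```

Replacing "(" by H and ")" by T in the output of `well_formed(3)` gives the five coin-tossing sequences of Example 4.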


3.1.4 BERNOULLI NUMBERS AND POLYNOMIALS

The Bernoulli numbers are important in obtaining closed form expressions for the sums 
of powers of integers. These numbers also arise in expansions involving other combina- 
torial sequences. 

Definitions: 

The Bernoulli numbers $B_n$ satisfy the recurrence relation $\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$ for all
$n \ge 1$, with $B_0 = 1$. (Jakob Bernoulli, 1654-1705)

The Bernoulli polynomials $B_m(x)$ are given by $B_m(x) = \sum_{k=0}^{m} \binom{m}{k} B_k x^{m-k}$.


Facts: 

1. The first 14 Bernoulli numbers $B_n$ are shown in the following table.

n      0  1     2    3  4      5  6     7  8      9  10    11  12         13
B_n    1  -1/2  1/6  0  -1/30  0  1/42  0  -1/30  0  5/66  0   -691/2730  0


2. $B_{2k+1} = 0$ for all $k \ge 1$.

3. The nonzero Bernoulli numbers alternate in sign.

4. $B_n = B_n(0)$.

5. The Bernoulli numbers have the exponential generating function
$\sum_{n=0}^{\infty} B_n \frac{x^n}{n!} = \frac{x}{e^x - 1}$.

6. The Bernoulli numbers can be expressed in terms of the Stirling subset numbers
(§2.5.2): $B_n = \sum_{j=0}^{n} (-1)^j \left\{ {n \atop j} \right\} \frac{j!}{j+1}$ for all $n \ge 0$.

7. The Bernoulli numbers appear as coefficients in the Maclaurin expansion of $\tan x$,
$\cot x$, $\csc x$, $\tanh x$, $\coth x$, and $\operatorname{csch} x$.

8. The Bernoulli polynomials can be used to obtain closed form expressions for the
sum of powers of the first $n$ positive integers. (See §3.5.4.)

9. The first 14 Bernoulli polynomials $B_m(x)$ are shown in Table 5.

10. $\int_0^1 B_m(x)\,dx = 0$ for all $m \ge 1$.

Table 5 Bernoulli polynomials.

11. $\frac{dB_m(x)}{dx} = m B_{m-1}(x)$ for all $m \ge 1$.

12. $B_{m+1}(x + 1) - B_{m+1}(x) = (m + 1) x^m$ for all $m \ge 0$.

13. The Bernoulli polynomials have the following exponential generating function:
$\sum_{m=0}^{\infty} B_m(x) \frac{t^m}{m!} = \frac{t e^{xt}}{e^t - 1}$.
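
The defining recurrence determines each $B_n$ from the earlier values, since the coefficient of $B_n$ in $\sum_{j=0}^{n} \binom{n+1}{j} B_j = 0$ is $n + 1$. The sketch below, which uses exact rational arithmetic, is an illustration under that reading of the definition (so $B_1 = -1/2$); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(n_max):
    """B_0..B_{n_max} from sum_{j=0}^{n} binom(n+1, j) B_j = 0 (n >= 1), B_0 = 1."""
    B = [Fraction(1)]
    for n in range(1, n_max + 1):
        partial = sum(comb(n + 1, j) * B[j] for j in range(n))
        # coefficient of B_n is binom(n+1, n) = n + 1, so solve for B_n
        B.append(-partial / (n + 1))
    return B


def bernoulli_poly(m, x, B):
    """B_m(x) = sum_{k=0}^{m} binom(m, k) B_k x^(m-k), evaluated exactly at rational x."""
    x = Fraction(x)
    return sum(comb(m, k) * B[k] * x ** (m - k) for k in range(m + 1))
```

With `B = bernoulli_numbers(13)`, the difference `bernoulli_poly(4, 3, B) - bernoulli_poly(4, 2, B)` can be compared with $(m + 1)x^m$ for $m = 3$, $x = 2$, as in Fact 12.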

3.1.5 EULERIAN NUMBERS 

Eulerian numbers are important in counting numbers of permutations with certain 
numbers of increases and decreases. 

Definitions: 

Let $\pi = (\pi_1, \pi_2, \ldots, \pi_n)$ be a permutation of $\{1, 2, \ldots, n\}$.

An ascent of the permutation $\pi$ is any index $i$ ($1 \le i < n$) such that $\pi_i < \pi_{i+1}$. A
descent of the permutation $\pi$ is any index $i$ ($1 \le i < n$) such that $\pi_i > \pi_{i+1}$.

An excedance of the permutation $\pi$ is any index $i$ ($1 \le i \le n$) such that $\pi_i > i$. A
weak excedance of the permutation $\pi$ is any index $i$ ($1 \le i \le n$) such that $\pi_i \ge i$.

The Eulerian number $E(n, k)$ (also written $\left\langle {n \atop k} \right\rangle$) is the number of permutations of
$\{1, 2, \ldots, n\}$ with exactly $k$ ascents.

Facts: 

1. $E(n, k)$ is the number of permutations of $\{1, 2, \ldots, n\}$ with exactly $k$ descents.

2. $E(n, k)$ is the number of permutations of $\{1, 2, \ldots, n\}$ with exactly $k$ excedances.

3. $E(n, k)$ is the number of permutations of $\{1, 2, \ldots, n\}$ with exactly $k + 1$ weak
excedances.

4. The Eulerian numbers can be used to obtain closed form expressions for the sum of
powers of the first $n$ positive integers (§3.5.4).

5. Eulerian numbers $E(n, k)$ ($1 \le n \le 10$, $0 \le k \le 8$) are given in the following table.



6. $E(n, 0) = E(n, n - 1) = 1$ for all $n \ge 1$.

7. Symmetry: $E(n, k) = E(n, n - 1 - k)$ for all $n \ge 1$.

8. $E(n, k) = (k + 1) E(n - 1, k) + (n - k) E(n - 1, k - 1)$ for all $n \ge 2$.

9. $\sum_{k=0}^{n-1} E(n, k) = n!$ for all $n \ge 1$.

10. Worpitzky’s identity: $x^n = \sum_{k=0}^{n-1} E(n, k) \binom{x+k}{n}$ for all $n \ge 1$. (Julius Worpitzky,
1835-1895)

11. $E(n, k) = \sum_{j=0}^{k} (-1)^j \binom{n+1}{j} (k + 1 - j)^n$ for all $n \ge 1$.

12. The Bernoulli numbers (§3.1.4) can be expressed as alternating sums of Eulerian
numbers: $B_m = \frac{m}{2^m (2^m - 1)} \sum_{k=0}^{m-2} (-1)^k E(m - 1, k)$ for $m \ge 2$.

13. The Stirling subset numbers (§2.5.2) can be expressed in terms of the Eulerian
numbers: $\left\{ {n \atop m} \right\} = \frac{1}{m!} \sum_{k=0}^{n-1} E(n, k) \binom{k}{n-m}$ for $n \ge m$ and $n \ge 1$.

14. The Eulerian numbers have the following (bivariate) generating function in variables
$x, t$: $\sum_{m=0}^{\infty} \sum_{n=0}^{\infty} E(n, m) x^m \frac{t^n}{n!} = \frac{1 - x}{e^{(x-1)t} - x}$.
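
The recurrence of Fact 8, the explicit sum of Fact 11, and the definition by ascents can be cross-checked for small $n$. The sketch below is illustrative; it assumes $E(1, 0) = 1$ as the starting value and treats $E(n - 1, k)$ as 0 when $k$ falls outside $0, \ldots, n - 2$. All names are hypothetical.

```python
from itertools import permutations
from math import comb


def eulerian_table(n_max):
    """Rows E[n] = [E(n,0), ..., E(n,n-1)] for 1 <= n <= n_max via Fact 8."""
    E = {1: [1]}
    for n in range(2, n_max + 1):
        prev = E[n - 1]
        row = []
        for k in range(n):
            same = prev[k] if k < n - 1 else 0      # E(n-1, k)
            lower = prev[k - 1] if k >= 1 else 0    # E(n-1, k-1)
            row.append((k + 1) * same + (n - k) * lower)
        E[n] = row
    return E


def ascents(p):
    """Number of indices i with p_i < p_{i+1}."""
    return sum(1 for i in range(len(p) - 1) if p[i] < p[i + 1])


def eulerian_brute(n, k):
    """Count permutations of {1..n} with exactly k ascents by enumeration."""
    return sum(1 for p in permutations(range(1, n + 1)) if ascents(p) == k)


def eulerian_explicit(n, k):
    """Fact 11: sum_{j=0}^{k} (-1)^j binom(n+1, j) (k+1-j)^n."""
    return sum((-1) ** j * comb(n + 1, j) * (k + 1 - j) ** n for j in range(k + 1))


def worpitzky_rhs(n, x, E):
    """Right-hand side of Worpitzky's identity at a nonnegative integer x."""
    return sum(E[n][k] * comb(x + k, n) for k in range(n))
```

For example, `eulerian_table(4)[4]` gives the row for $n = 4$, whose entries sum to $4! = 24$ as stated in Fact 9.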


Examples: 


1. The permutation $\pi = (\pi_1, \pi_2, \pi_3, \pi_4) = (1, 2, 3, 4)$ has three ascents since $1 < 2 < 3 < 4$,
and it is the only permutation in $S_4$ with three ascents; note that $E(4, 3) = 1$.
There are $E(4, 1) = 11$ permutations in $S_4$ with one ascent: (1, 4, 3, 2), (2, 1, 4, 3),
(2, 4, 3, 1), (3, 1, 4, 2), (3, 2, 1, 4), (3, 2, 4, 1), (3, 4, 2, 1), (4, 1, 3, 2), (4, 2, 1, 3), (4, 2, 3, 1),
and (4, 3, 1, 2).

2. The permutation $\pi = (2, 4, 3, 1)$ has two excedances since $2 > 1$ and $4 > 2$. There
are $E(4, 2) = 11$ such permutations in $S_4$.

3. The permutation $\pi = (1, 3, 2)$ has two weak excedances since $1 \ge 1$ and $3 \ge 2$.
There are $E(3, 1) = 4$ such permutations in $S_3$: (1, 3, 2), (2, 1, 3), (2, 3, 1), and (3, 2, 1).

4. When $n = 3$ Worpitzky’s identity (Fact 10) states that

$x^3 = E(3, 0) \binom{x}{3} + E(3, 1) \binom{x+1}{3} + E(3, 2) \binom{x+2}{3} = \binom{x}{3} + 4 \binom{x+1}{3} + \binom{x+2}{3}$.

This is verified algebraically since
$\binom{x}{3} + 4 \binom{x+1}{3} + \binom{x+2}{3} = \frac{1}{6} \left( x(x-1)(x-2) + 4(x+1)x(x-1) + (x+2)(x+1)x \right)$
$= \frac{x}{6} \left( x^2 - 3x + 2 + 4x^2 - 4 + x^2 + 3x + 2 \right) = \frac{x}{6} (6x^2) = x^3$.


3.1.6 RAMSEY NUMBERS 

The Ramsey numbers arise from the work of Frank P. Ramsey (1903-1930), who in 1930 
published a paper [Ra30] dealing with set theory that generalized the pigeonhole prin- 
ciple. (Also see §8.11.2.) [GrRoSp80], [MiRo91], [Ro84] 

Definitions: 

The Ramsey number $R(m, n)$ is the smallest positive integer $k$ with the following
property: if $S$ is a set of size $k$ and the 2-element subsets of $S$ are partitioned into 2
collections, $C_1$ and $C_2$, then there is a subset of $S$ of size $m$ such that each of its
2-element subsets belongs to $C_1$ or there is a subset of $S$ of size $n$ such that each of its
2-element subsets belongs to $C_2$.

The Ramsey number $R(m_1, \ldots, m_n; r)$ is the smallest positive integer $k$ with the
following property: if $S$ is a set of size $k$ and the $r$-element subsets of $S$ are partitioned
into $n$ collections $C_1, \ldots, C_n$, then for some $j$ there is a subset of $S$ of size $m_j$ such that
each of its $r$-element subsets belongs to $C_j$.

The Schur number $S(n)$ is the smallest integer $k$ with the following property: if
$\{1, \ldots, k\}$ is partitioned into $n$ subsets $A_1, \ldots, A_n$, then there is a subset $A_i$ such that
the equation $x + y = z$ has a solution where $x, y, z \in A_i$. (Issai Schur, 1875-1941)

Facts: 

1. Ramsey’s theorem: The Ramsey numbers $R(m, n)$ and $R(m_1, \ldots, m_n; r)$ are well-defined
for all $m, n \ge 1$ and for all $m_1, \ldots, m_n \ge 1$, $r \ge 1$.

2. Ramsey numbers $R(m, n)$ can be phrased in terms of coloring edges of the complete
graphs $K_n$: the Ramsey number $R(m, n)$ is the smallest positive integer $k$ such that, if
each edge of $K_k$ is colored red or blue, then either the red subgraph contains a copy of
$K_m$ or else the blue subgraph contains a copy of $K_n$. (See §8.11.2.)

3. Symmetry: $R(m, n) = R(n, m)$.

4. $R(m, 1) = R(1, m) = 1$ for every $m \ge 1$.

5. $R(m, 2) = R(2, m) = m$ for every $m \ge 1$.

6. The values of few Ramsey numbers are known. What is currently known about
Ramsey numbers $R(m, n)$, for $3 \le m \le 10$ and $3 \le n \le 10$, and bounds on other
Ramsey numbers are displayed in Table 6.

7. If $m_1 \le m_2$ and $n_1 \le n_2$, then $R(m_1, n_1) \le R(m_2, n_2)$.

8. $R(m, n) \le R(m, n - 1) + R(m - 1, n)$ for all $m, n \ge 2$.

9. If $m \ge 3$, $n \ge 3$, and if $R(m, n - 1)$ and $R(m - 1, n)$ are even, then
$R(m, n) \le R(m, n - 1) + R(m - 1, n) - 1$.

10. $R(m, n) \le \binom{m+n-2}{m-1}$. (Erdős and Szekeres, 1935)

Table 6 Some classical Ramsey numbers.

The entries in the body of this table are $R(m, n)$ ($m, n \le 10$) when known, or the
best known range $r_1 \le R(m, n) \le r_2$ when not known. The Ramsey numbers $R(3, 3)$,
$R(3, 4)$, $R(3, 5)$, and $R(4, 4)$ were found by A. M. Gleason and R. E. Greenwood in 1955;
$R(3, 6)$ was found by J. G. Kalbfleisch in 1966; $R(3, 7)$ was found by J. E. Graver and
J. Yackel in 1968; $R(3, 8)$ was found by B. McKay and Z. Ke Min; $R(3, 9)$ was found
by C. M. Grinstead and S. M. Roberts in 1982; $R(4, 5)$ was found by B. McKay and
S. Radziszowski in 1993.

m\n   3    4    5      6        7         8          9          10
3     6    9    14     18       23        28         36         40-43
4     -    18   25     35-41    49-61     55-84      69-115     80-149
5     -    -    43-49  58-87    80-143    95-216     116-316    141-442
6     -    -    -      102-165  109-298   122-495    153-780    167-1,171
7     -    -    -      -        205-540   216-1,031  227-1,713  238-2,826
8     -    -    -      -        -         282-1,870  295-3,583  308-6,090
9     -    -    -      -        -         -          565-6,625  580-12,715
10    -    -    -      -        -         -          -          798-23,854

Bounds for $R(m, n)$ for $m = 3$ and 4, with $11 \le n \le 15$:

$46 \le R(3, 11) \le 51$        $96 \le R(4, 11) \le 191$
$52 \le R(3, 12) \le 60$        $128 \le R(4, 12) \le 238$
$59 \le R(3, 13) \le 69$        $131 \le R(4, 13) \le 291$
$66 \le R(3, 14) \le 78$        $136 \le R(4, 14) \le 349$
$73 \le R(3, 15) \le 89$        $145 \le R(4, 15) \le 417$


11. The Ramsey numbers $R(m, n)$ satisfy the following asymptotic relationship:

$\frac{\sqrt{2}}{e} (1 + o(1)) \, m \, 2^{m/2} \le R(m, m) \le \binom{2m+2}{m+1} \cdot O((\log m)^{-1})$.

12. There exist constants $c_1$ and $c_2$ such that $c_1 m^2 / \ln m \le R(3, m) \le c_2 m^2 / \ln m$.

13. The problem of finding the Ramsey numbers $R(m_1, \ldots, m_n; 2)$ can be phrased in
terms of coloring edges of the complete graphs $K_n$. $R(m_1, \ldots, m_n; 2)$ is equal to the
smallest positive integer $k$ with the following property: no matter how the edges of $K_k$
are colored with the $n$ colors $1, 2, \ldots, n$, there is some $j$ such that $K_k$ has a subgraph $K_{m_j}$
of color $j$. (The edges of $K_k$ are the 2-element subsets; $C_j$ is the set of edges of color $j$.)

14. $R(m_1, m_2; 2) = R(m_1, m_2)$.

15. Very little is known about the numbers $R(m_1, \ldots, m_n; 2)$ if $n \ge 3$.

16. $R(2, \ldots, 2; 2) = 2$.

17. If each $m_i \ge 3$, the only Ramsey number whose value is known is $R(3, 3, 3; 2) = 17$.

18. $R(m, r, r, \ldots, r; r) = m$ if $m \ge r$.

19. $R(m_1, \ldots, m_n; 1) = m_1 + \cdots + m_n - (n - 1)$.

20. Ramsey theory is a generalization of the pigeonhole principle. In the terminology
of Ramsey numbers, the fact that $R(2, \ldots, 2; 1) = n + 1$ means that $n + 1$ is the smallest
positive integer with the property that if $S$ has size $n + 1$ and the 1-element subsets of $S$ are
partitioned into $n$ sets $C_1, \ldots, C_n$, then for some $j$ there is a subset of $S$ of size 2 such
that each of its elements belongs to $C_j$. Hence, some $C_j$ has at least 2 elements. If $S$
is a set of $n + 1$ pigeons and the subset $C_j$ ($j = 1, \ldots, n$) is the set of pigeons roosting
in pigeonhole $j$, then some pigeonhole must have at least 2 pigeons in it. The Ramsey
numbers $R(2, \ldots, 2; 1)$ give the smallest number of pigeons that force at least 2 to roost
in the same pigeonhole.

21. Schur’s theorem: $S(k) \le R(3, \ldots, 3; 2)$ (where there are $k$ 3s in the notation for
the Ramsey number).

22. The following Schur numbers are known: $S(1) = 2$, $S(2) = 5$, $S(3) = 14$.

23. The equation $x + y = z$ in the definition of Schur numbers has been generalized to
equations of the form $x_1 + \cdots + x_{n-1} = x_n$, $n \ge 4$ [BeBr82].

24. Convex sets: Ramsey numbers play a role in constructing convex polygons. Suppose
$m$ is a positive integer and there are $n$ given points, no three of which are collinear.
If $n \ge R(m, 5; 4)$, then a convex $m$-gon can be obtained from $m$ of the $n$ points [ErSz35].
This paper provided the impetus for the study of Ramsey numbers and suggested the
possibility of its wide applicability in mathematics.

25. It remains an unsolved problem to find the smallest integer $x$ (which depends on $m$)
such that if $n \ge x$, then a convex $m$-gon can be obtained from $m$ of the $n$ points.

26. Extensive information on Ramsey number theory, including bounds on Ramsey
numbers, can be found at S. Radziszowski’s web site:

http://www.cs.rit.edu/~spr/homepage.html

Examples: 

1. If six people are at a party, then either three of these six are mutual friends or three 
are mutual strangers. If six is replaced by five, the result is not true. These facts follow 
since R( 3, 3) = 6. (See Fact 2. The six people can be regarded as vertices, with a red 
edge joining friends and a blue edge joining strangers.) 

2. If the set $\{1, \ldots, k\}$ is partitioned into two subsets $A_1$ and $A_2$, then the equation
$x + y = z$ may or may not have a solution where $x, y, z \in A_1$ or $x, y, z \in A_2$. If $k \ge 5$,
a solution is guaranteed since $S(2) = 5$. If $k < 5$, no solution is guaranteed: take
$A_1 = \{1, 4\}$ and $A_2 = \{2, 3\}$.
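
Both examples are small enough to confirm by exhaustive search. The sketch below is a brute-force illustration, not an efficient method: it tries every red/blue coloring of the edges of $K_5$ and $K_6$ (at most $2^{15}$ colorings) and every split of $\{1, \ldots, k\}$ into two parts. Solutions with $x = y$ are allowed, matching the partition $\{1, 4\}$, $\{2, 3\}$ in Example 2. Function names are hypothetical.

```python
from itertools import combinations, product


def has_mono_triangle(n, color):
    """color maps each edge (u, v), u < v, of K_n to 0 or 1."""
    return any(color[(a, b)] == color[(a, c)] == color[(b, c)]
               for a, b, c in combinations(range(n), 3))


def every_coloring_has_mono_triangle(n):
    """True if every 2-coloring of the edges of K_n contains a one-colored triangle."""
    edges = list(combinations(range(n), 2))
    for bits in product((0, 1), repeat=len(edges)):
        if not has_mono_triangle(n, dict(zip(edges, bits))):
            return False
    return True


def every_split_has_schur_triple(k):
    """True if every split of {1..k} into two parts has x + y = z within one part."""
    for bits in product((0, 1), repeat=k):
        part = {i + 1: bits[i] for i in range(k)}
        found = any(x + y <= k and part[x] == part[y] == part[x + y]
                    for x in range(1, k + 1) for y in range(x, k + 1))
        if not found:
            return False
    return True
```

The first function returning `False` for 5 and `True` for 6 corresponds to $R(3, 3) = 6$; the second returning `False` for 4 and `True` for 5 corresponds to $S(2) = 5$.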


3.1.7 OTHER SEQUENCES 

Additional sequences that regularly arise in discrete mathematics are described in this 
section. 


> Euler Polynomials
Definition:

The Euler polynomials $E_n(x)$ have the exponential generating function
$\sum_{n=0}^{\infty} E_n(x) \frac{t^n}{n!} = \frac{2 e^{xt}}{e^t + 1}$.

Facts:

1. The first 14 Euler polynomials $E_n(x)$ are shown in Table 7.

2. $E_n(x + 1) + E_n(x) = 2 x^n$ for all $n \ge 0$.

3. The Euler polynomials can be expressed in terms of the Bernoulli numbers (§3.1.4):
$E_{n-1}(x) = \frac{1}{n} \sum_{k=1}^{n} (2 - 2^{k+1}) \binom{n}{k} B_k x^{n-k}$ for all $n \ge 1$.

Table 7 Euler polynomials.

n     E_n(x)
0     1
1     x - 1/2
2     x^2 - x
3     x^3 - (3/2)x^2 + 1/4
4     x^4 - 2x^3 + x
5     x^5 - (5/2)x^4 + (5/2)x^2 - 1/2
6     x^6 - 3x^5 + 5x^3 - 3x
7     x^7 - (7/2)x^6 + (35/4)x^4 - (21/2)x^2 + 17/8
8     x^8 - 4x^7 + 14x^5 - 28x^3 + 17x
9     x^9 - (9/2)x^8 + 21x^6 - 63x^4 + (153/2)x^2 - 31/2
10    x^10 - 5x^9 + 30x^7 - 126x^5 + 255x^3 - 155x
11    x^11 - (11/2)x^10 + (165/4)x^8 - 231x^6 + (2805/4)x^4 - (1705/2)x^2 + 691/4
12    x^12 - 6x^11 + 55x^9 - 396x^7 + 1683x^5 - 3410x^3 + 2073x
13    x^13 - (13/2)x^12 + (143/2)x^10 - (1287/2)x^8 + (7293/2)x^6 - (22165/2)x^4 + (26949/2)x^2 - 5461/2

4. The alternating sum of powers of the first $n$ integers can be expressed in terms of
the Euler polynomials: $\sum_{j=1}^{n} (-1)^{n-j} j^k = \frac{1}{2} \left[ E_k(n + 1) + (-1)^n E_k(0) \right]$.


▻ Euler and Tangent Numbers 
Definitions: 

The Euler numbers $E_n$ are given by $E_n = 2^n E_n(\tfrac{1}{2})$, where $E_n(x)$ is an Euler polynomial.

The tangent numbers $T_n$ have the exponential generating function $\tan x$:
$\sum_{n=0}^{\infty} T_n \frac{x^n}{n!} = \tan x$.

Facts: 

1. The first twelve Euler numbers $E_n$ and tangent numbers $T_n$ are shown in the following table.

n      0  1  2   3  4  5   6    7    8      9      10       11
E_n    1  0  -1  0  5  0   -61  0    1,385  0      -50,521  0
T_n    0  1  0   2  0  16  0    272  0      7,936  0        353,792


2. $E_{2k+1} = T_{2k} = 0$ for all $k \ge 0$.

3. The nonzero Euler numbers alternate in sign.

4. The Euler numbers have the exponential generating function $\frac{2}{e^t + e^{-t}} = \operatorname{sech} t$.

5. The exponential generating function for $|E_n|$ is $\frac{1}{\cos t} = \sec t$.

6. The tangent numbers can be expressed in terms of the Bernoulli numbers (§3.1.4):
$T_{2n-1} = (-1)^{n-1} \frac{4^n (4^n - 1)}{2n} B_{2n}$ for all $n \ge 1$.

7. The tangent numbers can be expressed as an alternating sum of Eulerian numbers
(§3.1.5): $T_{2n+1} = \sum_{k=0}^{2n} (-1)^{n-k} E(2n + 1, k)$ for all $n \ge 0$.

8. $(-1)^n E_{2n}$ counts the number of alternating permutations in $S_{2n}$: that is, the number
of permutations $\pi = (\pi_1, \pi_2, \ldots, \pi_{2n})$ on $\{1, 2, \ldots, 2n\}$ with
$\pi_1 > \pi_2 < \pi_3 > \pi_4 < \cdots > \pi_{2n}$.

9. $T_{2n+1}$ counts the number of alternating permutations in $S_{2n+1}$.

Examples: 

1. The permutation $\pi = (\pi_1, \pi_2, \pi_3, \pi_4) = (2, 1, 4, 3)$ is alternating since $2 > 1 < 4 > 3$.
In all there are $(-1)^2 E_4 = 5$ alternating permutations in $S_4$: (2, 1, 4, 3), (3, 1, 4, 2),
(3, 2, 4, 1), (4, 1, 3, 2), (4, 2, 3, 1).

2. The permutation $\pi = (\pi_1, \pi_2, \pi_3, \pi_4, \pi_5) = (4, 1, 3, 2, 5)$ is alternating since
$4 > 1 < 3 > 2 < 5$. In all there are $T_5 = 16$ alternating permutations in $S_5$.
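
The counts in these two examples can be reproduced by enumerating permutations and keeping those with the pattern $\pi_1 > \pi_2 < \pi_3 > \cdots$. The sketch below is a brute-force illustration suitable only for small $n$; the function names are hypothetical.

```python
from itertools import permutations


def is_alternating(p):
    """True if p_1 > p_2 < p_3 > p_4 < ... (the pattern starts with a descent)."""
    return all((p[i] > p[i + 1]) if i % 2 == 0 else (p[i] < p[i + 1])
               for i in range(len(p) - 1))


def alternating_permutations(n):
    """All alternating permutations of {1..n}, in lexicographic order."""
    return [p for p in permutations(range(1, n + 1)) if is_alternating(p)]


def count_alternating(n):
    """Number of alternating permutations of {1..n}."""
    return len(alternating_permutations(n))
```

The values of `count_alternating(n)` for $n = 4, 5, 6, 7$ can be compared with $|E_4|$, $T_5$, $|E_6|$, and $T_7$ in the table of Fact 1.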


> Harmonic Numbers 


Definition: 

The harmonic numbers $H_n$ are given by $H_n = \sum_{i=1}^{n} \frac{1}{i}$ for $n \ge 1$, with $H_0 = 0$.

Facts: 

1. H n is the discrete analogue of the natural logarithm (§3.4.1). 

2. The first twelve harmonic numbers $H_n$ are shown in the following table.

n      0  1  2    3     4      5       6      7        8        9            10           11
H_n    0  1  3/2  11/6  25/12  137/60  49/20  363/140  761/280  7,129/2,520  7,381/2,520  83,711/27,720


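3. The harmonic numbers can be expressed in terms of the Stirling cycle numbers
(§2.5.2): $H_n = \frac{1}{n!} \left[ {n+1 \atop 2} \right]$, $n \ge 1$.

4. $\sum_{i=1}^{n} H_i = (n + 1) \left[ H_{n+1} - 1 \right]$ for all $n \ge 1$.

5. $\sum_{i=1}^{n} i H_i = \binom{n+1}{2} \left[ H_{n+1} - \frac{1}{2} \right]$ for all $n \ge 1$.

6. $\sum_{i=1}^{n} \binom{i}{k} H_i = \binom{n+1}{k+1} \left[ H_{n+1} - \frac{1}{k+1} \right]$ for all $n \ge 1$.

7. $H_n \to \infty$ as $n \to \infty$.

8. $H_n \sim \ln n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4}$, where $\gamma \approx 0.57721\,56649\,01533$ denotes Euler’s
constant.

9. The harmonic numbers have the generating function $\frac{1}{1-x} \ln \frac{1}{1-x}$.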

Example:

1. Fact 8 yields the approximation $H_{10} \approx 2.928968257896$. The actual value is $H_{10} =
2.928968253968\ldots$, so the approximation is accurate to 9 significant digits. The
approximation $H_{20} \approx 3.597739657206$ is accurate to 10 digits, and the approximation
$H_{40} \approx 4.27854303893$ is accurate to 12 digits.
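
The comparison in this example can be repeated with exact rational arithmetic for $H_n$ and floating point for the approximation. The sketch below assumes the five-term form of Fact 8, $\ln n + \gamma + \frac{1}{2n} - \frac{1}{12n^2} + \frac{1}{120n^4}$, and hard-codes a double-precision value of Euler's constant; the names are hypothetical.

```python
from fractions import Fraction
from math import comb, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, rounded to double precision


def harmonic(n):
    """Exact H_n = 1 + 1/2 + ... + 1/n, with H_0 = 0."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


def harmonic_approx(n):
    """Approximation of H_n from the asymptotic expansion (n >= 1)."""
    return log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n ** 2) + 1 / (120 * n ** 4)


def sum_weighted_harmonic(n):
    """Left-hand side of Fact 5: sum_{i=1}^{n} i * H_i, computed exactly."""
    return sum(i * harmonic(i) for i in range(1, n + 1))
```

Evaluating `float(harmonic(10))` next to `harmonic_approx(10)` shows where the two values start to differ.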


> Gray Codes 
Definition: 

A Gray code of size $n$ is an ordering $G_n = (g_1, g_2, \ldots, g_{2^n})$ of the $2^n$ binary strings of
length $n$ such that $g_k$ and $g_{k+1}$ differ in exactly one bit, for $1 \le k < 2^n$. Usually it is
required that $g_{2^n}$ and $g_1$ also differ in exactly one bit.

Facts: 

1. Gray codes exist for all $n \ge 1$. Sample Gray codes $G_n$ are shown in this table.

n    G_n
1    0 1
2    00 10 11 01
3    000 100 110 010 011 111 101 001
4    0000 1000 1100 0100 0110 1110 1010 0010 0011
     1011 1111 0111 0101 1101 1001 0001
5    00000 10000 11000 01000 01100 11100 10100 00100 00110
     10110 11110 01110 01010 11010 10010 00010 00011 10011
     11011 01011 01111 11111 10111 00111 00101 10101 11101
     01101 01001 11001 10001 00001






2. A Gray code of size $n \ge 2$ corresponds to a Hamilton cycle in the $n$-cube (§8.4.4).

3. Gray codes correspond to an ordering of all subsets of $\{1, 2, \ldots, n\}$ such that adjacent
subsets differ by the insertion or deletion of exactly one element. Each subset $A$
corresponds to a binary string $a_1 a_2 \ldots a_n$ where $a_i = 1$ if $i \in A$, $a_i = 0$ if $i \notin A$.

4. A Gray code $G_n$ can be recursively obtained in the following way:

• first half of $G_n$: Add a 0 to the end of each string in $G_{n-1}$.

• second half of $G_n$: Add a 1 to the end of each string in the reversal of the sequence $G_{n-1}$.
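
The recursive construction in Fact 4 translates almost directly into code. The sketch below is one possible rendering, assuming the base case $G_1 = (0, 1)$ and that the new bit is appended at the end of each string in both halves; the function names are hypothetical.

```python
def gray_code(n):
    """Gray code G_n built from G_{n-1}: append 0, then append 1 to the reversal."""
    if n == 1:
        return ["0", "1"]
    prev = gray_code(n - 1)
    first_half = [s + "0" for s in prev]
    second_half = [s + "1" for s in reversed(prev)]
    return first_half + second_half


def is_cyclic_gray_code(codes):
    """Check that consecutive strings, and the last and first, differ in exactly one bit."""
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))
    m = len(codes)
    return (len(set(codes)) == m and
            all(distance(codes[i], codes[(i + 1) % m]) == 1 for i in range(m)))
```

With these assumptions, `gray_code(3)` yields the row for $n = 3$ in the table of Fact 1.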


> deBruijn Sequences 
Definitions: 

A $(p, n)$ deBruijn sequence on the alphabet $\Sigma = \{0, 1, \ldots, p - 1\}$ is a sequence
$(s_0, s_1, \ldots, s_{L-1})$ of $L = p^n$ elements $s_i \in \Sigma$ such that each consecutive subsequence
$(s_i, s_{i+1}, \ldots, s_{i+n-1})$ of length $n$ is distinct. Here the addition of subscripts is done
modulo $L$ so that the sequence is considered as a circular ordering. (Nicolaas G. deBruijn,
born 1918)

The deBruijn diagram $D_{p,n}$ is a directed graph whose vertices correspond to all
possible strings $s_1 s_2 \ldots s_{n-1}$ of $n - 1$ symbols from $\Sigma$. There are $p$ arcs leaving the
vertex $s_1 s_2 \ldots s_{n-1}$, each labeled with a distinct symbol $\sigma \in \Sigma$ and leading to the
adjacent node $s_2 s_3 \ldots s_{n-1} \sigma$.

Facts:

1. The deBruijn diagram $D_{p,n}$ has $p^{n-1}$ vertices and $p^n$ arcs.

2. $D_{p,n}$ is a strongly connected digraph (§11.3.2).

3. $D_{p,n}$ is an Eulerian digraph (§11.3.2).

4. Any Euler circuit in $D_{p,n}$ produces a $(p, n)$ deBruijn sequence.

5. deBruijn sequences exist for all $p$ (with $n \ge 1$). Sample deBruijn sequences are
shown in the following table.

(p, n)    a deBruijn sequence
(2, 1)    01
(2, 2)    0110
(2, 3)    01110100
(2, 4)    0101001101111000
(3, 2)    012202110
(3, 3)    012001110100022212202112102
(4, 2)    0113102212033230


6. A deBruijn sequence can be generated from an alphabet $\Sigma = \{0, 1, \ldots, p - 1\}$ of $p$
symbols using Algorithm 1.


Algorithm 1: Generating a $(p, n)$ deBruijn sequence.

1. Start with the sequence $S$ containing $n$ zeros.

2. Append the largest symbol from $\Sigma$ to $S$ so that the newly formed sequence
$S'$ of $n$ symbols does not already appear as a subsequence of $S$. Let $S = S'$.

3. Repeat Step 2 as long as possible.

4. When Step 2 cannot be applied, remove the last $n - 1$ symbols from $S$.
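
Algorithm 1 can be sketched in a few lines. The version below rests on one reading of Step 2 that should be treated as an assumption: the "newly formed sequence of $n$ symbols" is taken to be the last $n$ symbols after the append, and "already appears" is taken to mean "already occurs as a block of $n$ consecutive symbols". The function names are hypothetical.

```python
def de_bruijn_greedy(p, n):
    """Greedy construction following Algorithm 1; returns a list of p**n symbols."""
    seq = [0] * n                          # Step 1: n zeros
    seen = {tuple(seq)}                    # length-n blocks that have occurred so far
    while True:
        for symbol in range(p - 1, -1, -1):            # Step 2: try the largest symbol first
            window = tuple(seq[len(seq) - n + 1:]) + (symbol,)
            if window not in seen:
                seen.add(window)
                seq.append(symbol)
                break
        else:                              # Step 3 ends: no symbol can be appended
            break
    return seq[:len(seq) - (n - 1)]        # Step 4: drop the last n - 1 symbols


def is_de_bruijn(seq, p, n):
    """Check that the p**n circular windows of length n are all distinct."""
    L = len(seq)
    windows = {tuple(seq[(i + j) % L] for j in range(n)) for i in range(L)}
    return L == p ** n and len(windows) == L
```

The output need not coincide with the sample sequences tabulated in Fact 5, since a $(p, n)$ deBruijn sequence is not unique; `is_de_bruijn` checks the defining property instead.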


Example: 

1. The deBruijn diagram $D_{2,3}$ is shown in the following figure. An Eulerian circuit is
obtained by visiting in order the vertices 11, 10, 01, 10, 00, 00, 01, 11, 11. The deBruijn
sequence 01000111 is obtained by reading off the edge labels $\sigma$ as this circuit is traversed.



> Self-generating Sequences 
Definition: 

Some unusual sequences defined by simple recurrence relations or rules are informally 
called self-generating sequences. 

Examples:

1. Hofstadter G-sequence: This sequence is defined by $a(n) = n - a(a(n - 1))$, with
initial condition $a(0) = 0$. The initial terms of this sequence are 0, 1, 1, 2, 3, 3, 4, 4, 5,
6, 6, 7, 8, 8, 9, 9, 10, .... It is easy to show this sequence is well-defined. A formula for
the $n$th term of this sequence is $a(n) = \lfloor (n + 1) \mu \rfloor$, where $\mu = (-1 + \sqrt{5})/2$. [Ho79]

2. Variations of the Hofstadter G-sequence about which little is known: These include
the sequence defined by $a(n) = n - a(a(a(n - 1)))$ with $a(0) = 0$, whose initial terms are
0, 1, 1, 2, 3, 4, 4, 5, 5, 6, 7, 7, 8, 9, 10, 10, 11, 12, 13, ... and the sequence defined by
$a(n) = n - a(a(a(a(n - 1))))$ with $a(0) = 0$, whose initial terms are 0, 1, 1, 2, 3, 4, 5, 5, 6, 6, 7,
8, 8, 9, 10, 11, 11, 12, 13, 14, ....

3. The sequence $a(n) = a(n - a(n - 1)) + a(n - a(n - 2))$, with $a(0) = a(1) = 1$, was
also defined by Hofstadter. The initial terms of this sequence are 1, 1, 2, 3, 3, 4, 5, 5, 6,
6, 6, 8, 8, 8, 10, 9, 10, 11, ....

4. The intertwined sequences $F(n)$ and $M(n)$ are defined by $F(n) = n - M(F(n - 1))$
and $M(n) = n - F(M(n - 1))$, with initial conditions $F(0) = 1$ and $M(0) = 0$. The
sequence $F(n)$ (sometimes called the “female” sequence of the pair)
begins with the terms 1, 1, 2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8, 9, 9, 10, ... and
the sequence $M(n)$ (sometimes called the “male” sequence of the pair) begins with the
terms 0, 0, 1, 2, 2, 3, 4, 4, 5, 6, 6, 7, 7, 8, 9, 9, 10, ....

5. Golomb’s self-generating sequence: This sequence is the unique nondecreasing
sequence $a_1, a_2, a_3, \ldots$ with the property that it contains exactly $a_k$ occurrences of the
integer $k$ for each integer $k$. The initial terms of this sequence are 1, 2, 2, 3, 3, 4, 4, 4,
5, 5, 5, 6, 6, 6, 6, ....

6. If $f(n)$ is the largest integer $m$ such that $a_m = n$, where $a_k$ is the $k$th term of
Golomb’s self-generating sequence, then $f(n) = \sum_{k=1}^{n} a_k$ and $f(f(n)) = \sum_{k=1}^{n} k a_k$.
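
Initial terms of these sequences are easy to regenerate, which is a useful check on any printed listing. The sketch below makes several assumptions that should be read as such: the nested Hofstadter sequences start from $a(0) = 0$; the sequence of Example 3 starts from $a(0) = a(1) = 1$; the intertwined pair is computed in the form $F(n) = n - M(F(n-1))$, $M(n) = n - F(M(n-1))$ with $F(0) = 1$, $M(0) = 0$; and Golomb's sequence is built directly from its defining property. All function names are hypothetical.

```python
def hofstadter_nested(depth, n_terms):
    """a(n) = n - a(a(...a(n-1)...)) with `depth` nested applications, a(0) = 0."""
    a = [0]
    for n in range(1, n_terms):
        v = n - 1
        for _ in range(depth):
            v = a[v]
        a.append(n - v)
    return a


def hofstadter_q(n_terms):
    """a(n) = a(n - a(n-1)) + a(n - a(n-2)) with a(0) = a(1) = 1."""
    a = [1, 1]
    for n in range(2, n_terms):
        a.append(a[n - a[n - 1]] + a[n - a[n - 2]])
    return a[:n_terms]


def female_male(n_terms):
    """Intertwined pair F(n) = n - M(F(n-1)), M(n) = n - F(M(n-1)), F(0)=1, M(0)=0."""
    F, M = [1], [0]
    for n in range(1, n_terms):
        M.append(n - F[M[n - 1]])   # uses F only up to index n - 1
        F.append(n - M[F[n - 1]])   # may use M(n), which was just computed
    return F, M


def golomb(n_terms):
    """Nondecreasing sequence containing exactly a_k copies of k (1-indexed terms)."""
    a = [1, 2, 2]
    k = 3
    while len(a) < n_terms:
        a.extend([k] * a[k - 1])    # append a_k copies of k
        k += 1
    return a[:n_terms]
```

Here `hofstadter_nested(2, 17)` gives the G-sequence of Example 1, and depths 3 and 4 give the two variations of Example 2.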


3.1.8 MINIGUIDE TO SEQUENCES

This section lists the numerical values of various integer sequences, classified according
to the type of combinatorial structure that produces the terms. This listing supplements
many of the tables presented in this Handbook. A comprehensive tabulation of
over 5,400 integer sequences is provided in [SlPl95], arranged in lexicographic order.
(See Fact 4.)

Definitions: 

The power sum $S^k(n) = \sum_{j=1}^{n} j^k$ is the sum of the $k$th powers of the first $n$ positive
integers. The sum of the $k$th powers of the first $n$ odd integers is denoted
$O^k(n) = \sum_{j=1}^{n} (2j - 1)^k$.

The associated Stirling number of the first kind $d(n, k)$ is the number of $k$-cycle
permutations of an $n$-element set with all cycles of length $\ge 2$.

The associated Stirling number of the second kind $b(n, k)$ is the number of $k$-block
partitions of an $n$-element set with all blocks of size $\ge 2$.

The double factorial $n!!$ is the product $n(n - 2) \cdots 6 \cdot 4 \cdot 2$ if $n$ is an even positive
integer and $n(n - 2) \cdots 5 \cdot 3 \cdot 1$ if $n$ is an odd positive integer.

The Lah coefficients L(n, k) are the coefficients of x k (§3.4.2) resulting from the 
expansion of x n (§3.4.2): 

x n = J2 L(n, k) x-. 

fc= l 

A permutation $\pi$ is discordant from a set $A$ of permutations when $\pi(i) \ne a(i)$ for
all $i$ and all $a \in A$. Usually $A$ consists of the identity permutation 1 and powers of the
$n$-cycle $\sigma_n = (1\ 2\ \ldots\ n)$ (see §5.3.1).

A necklace with $n$ beads in $c$ colors corresponds to an equivalence class of functions
from an $n$-set to a $c$-set, under cyclic or dihedral equivalence.

A figurate number is the number of cells in an array of cells bounded by some regular
geometrical figure.

A polyomino with $p$ polygons (cells) is a connected configuration of $p$ regular polygons
in the plane. The polygons usually considered are either triangles, squares, or hexagons.
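
Several of the definitions above are direct formulas and can be evaluated as written. The sketch below covers only the power sums and the double factorial, reading the power sum as $\sum_{j=1}^{n} j^k$ and the odd power sum as $\sum_{j=1}^{n} (2j-1)^k$; the function names are hypothetical and no attempt is made at efficiency.

```python
def power_sum(k, n):
    """Sum of the k-th powers of the first n positive integers."""
    return sum(j ** k for j in range(1, n + 1))


def odd_power_sum(k, n):
    """Sum of the k-th powers of the first n odd positive integers."""
    return sum((2 * j - 1) ** k for j in range(1, n + 1))


def double_factorial(n):
    """n!! = n (n-2) (n-4) ... down to 2 (n even) or 1 (n odd), for positive n."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result
```

For instance, `[power_sum(2, n) for n in range(1, 6)]` gives the opening terms of the sequence of sums of squares.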

Facts: 

1. Each entry in the following miniguide lists initial terms of the sequence, provides a
brief description, and gives the reference number used in [SlPl95].

2. On-line sequence server: Sequences can be submitted for identification by e-mail to

sequences@research.att.com

for lookup on N. J. A. Sloane’s The On-Line Encyclopedia of Integer Sequences. Sending
the word lookup followed by several initial terms of the sequence, each separated by a
space but with no commas, will return up to ten matches together with references.

3. A more powerful sequence server is located at superseeker@research.att.com.
It tries several algorithms to explain a sequence not found in the table. Requests are
limited to one per person per hour.

4. World Wide Web page: Sequences can also be accessed and identified using Sloane’s
web page:

http://www.research.att.com/~njas/sequences
The entire table of sequences is also accessible from this web page. 

Examples: 

1. The following initial five terms of an unknown sequence were sent to the e-mail
sequence server at sequences@research.att.com:

lookup 1 2 6 20 70

In this case one matching sequence M1645 was identified, corresponding to the central
binomial coefficients $\binom{2n}{n}$.

2. After connecting to the web site in Fact 4 and selecting the option “to look up a
sequence in the table,” a data entry box appears. The initial terms 1 1 2 3 5 8 13 21
were entered into this field and the request was submitted, producing in this case six
matching sequences. One of these was the Fibonacci sequence (M0692), another was
the sequence $a_n = \lceil e^{(n-1)/2} \rceil$ (M0693).

Miniguide to Sequences from Discrete Mathematics 

The following miniguide contains a selection of important sequences, grouped by func- 
tional problem area (such as graph theory, algebra, number theory). The sequences are 
listed in a logical, rather than lexicographic, order within each identifiable grouping. 
This listing supplements existing tables within the Handbook. References to appropri- 
ate sections of the Handbook are also provided. The notation “Mxxxx” is the reference 
number used in [SlPl95].
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Powers of Integers (§3.1.1, §3.5.4) 

1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 131072 

2 n [Ml 129] 

1, 3, 9, 27, 81, 243, 729, 2187, 6561, 19683, 59049, 177147, 531441, 1594323, 4782969 

3 n [M2807] 

1, 4, 16, 64, 256, 1024, 4096, 16384, 65536, 262144, 1048576, 4194304, 16777216, 67108864 

4 n [M3518] 

1, 5, 25, 125, 625, 3125, 15625, 78125, 390625, 1953125, 9765625, 48828125, 244140625 

5 n [M3937] 

1, 6, 36, 216, 1296, 7776, 46656, 279936, 1679616, 10077696, 60466176, 362797056 

6 n [M4224] 

1, 7, 49, 343, 2401, 16807, 117649, 823543, 5764801, 40353607, 282475249, 1977326743 

7 n [M4431] 

1, 8, 64, 512, 4096, 32768, 262144, 2097152, 16777216, 134217728, 1073741824, 8589934592 

8 n [M4555] 


1, 9, 81, 729, 6561, 59049, 531441, 4782969, 43046721, 387420489, 3486784401 


9 n [M4653] 


1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484 

$n^2$ [M3356]

1, 8, 27, 64, 125, 216, 343, 512, 729, 1000, 1331, 1728, 2197, 2744, 3375, 4096, 4913, 5832 

$n^3$ [M4499]

1, 16, 81, 256, 625, 1296, 2401, 4096, 6561, 10000, 14641, 20736, 28561, 38416, 50625, 65536

$n^4$ [M5004]

1, 32, 243, 1024, 3125, 7776, 16807, 32768, 59049, 100000, 161051, 248832, 371293, 537824 

$n^5$ [M5231]

1, 64, 729, 4096, 15625, 46656, 117649, 262144, 531441, 1000000, 1771561, 2985984 

$n^6$ [M5330]

1, 128, 2187, 16384, 78125, 279936, 823543, 2097152, 4782969, 10000000, 19487171 

$n^7$ [M5392]

1, 256, 6561, 65536, 390625, 1679616, 5764801, 16777216, 43046721, 100000000, 214358881 

$n^8$ [M5426]

1, 512, 19683, 262144, 1953125, 10077696, 40353607, 134217728, 387420489, 1000000000 

$n^9$ [M5459]

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231, 253, 276 

$S_1(n)$ [M2535]

1, 5, 14, 30, 55, 91, 140, 204, 285, 385, 506, 650, 819, 1015, 1240, 1496, 1785, 2109, 2470, 2870

$S_2(n)$ [M3844]

1, 9, 36, 100, 225, 441, 784, 1296, 2025, 3025, 4356, 6084, 8281, 11025, 14400, 18496, 23409 

$S_3(n)$ [M4619]

1, 17, 98, 354, 979, 2275, 4676, 8772, 15333, 25333, 39974, 60710, 89271, 127687, 178312 

$S_4(n)$ [M5043]

1, 33, 276, 1300, 4425, 12201, 29008, 61776, 120825, 220825, 381876, 630708, 1002001 

$S_5(n)$ [M5241]

1, 65, 794, 4890, 20515, 67171, 184820, 446964, 978405, 1978405, 3749966, 6735950 

$S_6(n)$ [M5335]

1, 129, 2316, 18700, 96825, 376761, 1200304, 3297456, 8080425, 18080425, 37567596 

$S_7(n)$ [M5394]

1, 257, 6818, 72354, 462979, 2142595, 7907396, 24684612, 67731333, 167731333, 382090214 

$S_8(n)$ [M5427]

1, 513, 20196, 282340, 2235465, 12313161, 52666768, 186884496, 574304985, 1574304985

$S_9(n)$ [M5459]

3, 6, 14, 36, 98, 276, 794, 2316, 6818, 20196, 60074, 179196, 535538, 1602516, 4799354 

$S_n(3)$ [M2580]

4, 10, 30, 100, 354, 1300, 4890, 18700, 72354, 282340, 1108650, 4373500, 17312754 

$S_n(4)$ [M3397]

5, 15, 55, 225, 979, 4425, 20515, 96825, 462979, 2235465, 10874275, 53201625, 261453379 

$S_n(5)$ [M3863]

6, 21, 91, 441, 2275, 12201, 67171, 376761, 2142595, 12313161, 71340451, 415998681 

$S_n(6)$ [M4149]

7, 28, 140, 784, 4676, 29008, 184820, 1200304, 7907396, 52666768, 353815700, 2393325424 

$S_n(7)$ [M4393]

8, 36, 204, 1296, 8772, 61776, 446964, 3297456, 24684612, 186884496, 1427557524 

$S_n(8)$ [M4520]

9, 45, 285, 2025, 15333, 120825, 978405, 8080425, 67731333, 574304985, 4914341925 

$S_n(9)$ [M4627]

1, 5, 32, 288, 3413, 50069, 873612, 17650828, 405071317, 10405071317, 295716741928 

$S_n(n)$ [M3968]

1, 28, 153, 496, 1225, 2556, 4753, 8128, 13041, 19900, 29161, 41328, 56953, 76636, 101025 

$O_3(n)$ [M5199]

1, 82, 707, 3108, 9669, 24310, 52871, 103496, 187017, 317338, 511819, 791660, 1182285

$O_4(n)$ [M5359]

1, 244, 3369, 20176, 79225, 240276, 611569, 1370944, 2790801, 5266900, 9351001, 15787344 

$O_5(n)$ [M5421]


Factorial Numbers 

1, 1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600, 6227020800 

n! [M1675] 

1, 4, 36, 576, 14400, 518400, 25401600, 1625702400, 131681894400, 13168189440000 

$(n!)^2$ [M3666]

2, 3, 8, 30, 144, 840, 5760, 45360, 403200, 3991680, 43545600, 518918400, 6706022400 

$n! + (n-1)!$ [M0890]

1, 2, 8, 48, 384, 3840, 46080, 645120, 10321920, 185794560, 3715891200, 81749606400 

$n!!$, n even [M1878]

1, 1, 3, 15, 105, 945, 10395, 135135, 2027025, 34459425, 654729075, 13749310575 

$n!!$, n odd [M3002]


1, 1, 2, 12, 288, 34560, 24883200, 125411328000, 5056584744960000 

product of n factorials [M2049] 

1, 2, 6, 30, 210, 2310, 30030, 510510, 9699690, 223092870, 6469693230, 200560490130 

product of first n primes [M1691] 


Binomial Coefficients (§2.3.2) 

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231, 253, 276 

$\binom{n}{2}$ [M2535]

1, 4, 10, 20, 35, 56, 84, 120, 165, 220, 286, 364, 455, 560, 680, 816, 969, 1140, 1330, 1540, 1771 

$\binom{n}{3}$ [M3382]

1, 5, 15, 35, 70, 126, 210, 330, 495, 715, 1001, 1365, 1820, 2380, 3060, 3876, 4845, 5985, 7315 

$\binom{n}{4}$ [M3853]

1, 6, 21, 56, 126, 252, 462, 792, 1287, 2002, 3003, 4368, 6188, 8568, 11628, 15504, 20349 

$\binom{n}{5}$ [M4142]

1, 7, 28, 84, 210, 462, 924, 1716, 3003, 5005, 8008, 12376, 18564, 27132, 38760, 54264, 74613 

$\binom{n}{6}$ [M4390]

1, 8, 36, 120, 330, 792, 1716, 3432, 6435, 11440, 19448, 31824, 50388, 77520, 116280, 170544 

$\binom{n}{7}$ [M4517]

1, 9, 45, 165, 495, 1287, 3003, 6435, 12870, 24310, 43758, 75582, 125970, 203490, 319770 

$\binom{n}{8}$ [M4626]

1, 10, 55, 220, 715, 2002, 5005, 11440, 24310, 48620, 92378, 167960, 293930, 497420, 817190

$\binom{n}{9}$ [M4712]

1, 11, 66, 286, 1001, 3003, 8008, 19448, 43758, 92378, 184756, 352716, 646646, 1144066 

$\binom{n}{10}$ [M4794]

1, 2, 3, 6, 10, 20, 35, 70, 126, 252, 462, 924, 1716, 3432, 6435, 12870, 24310, 48620, 92378 

central binomial coefficients $\binom{n}{\lfloor n/2 \rfloor}$ [M0769]

1, 2, 6, 20, 70, 252, 924, 3432, 12870, 48620, 184756, 705432, 2704156, 10400600, 40116600 

central binomial coefficients $\binom{2n}{n}$ [M1645]

1, 3, 10, 35, 126, 462, 1716, 6435, 24310, 92378, 352716, 1352078, 5200300, 20058300 

$\binom{2n+1}{n+1}$ [M2848]

Stirling Cycle Numbers/Stirling Numbers of the First Kind (§2.5.2) 

1, 3, 11, 50, 274, 1764, 13068, 109584, 1026576, 10628640, 120543840, 1486442880 

${n \brack 2}$ [M2902]

1, 6, 35, 225, 1624, 13132, 118124, 1172700, 12753576, 150917976, 1931559552 

${n \brack 3}$ [M4218]

1, 10, 85, 735, 6769, 67284, 723680, 8409500, 105258076, 1414014888, 20313753096 

${n \brack 4}$ [M4730]

1, 15, 175, 1960, 22449, 269325, 3416930, 45995730, 657206836, 9957703756, 159721605680 

${n \brack 5}$ [M4983]

1, 21, 322, 4536, 63273, 902055, 13339535, 206070150, 3336118786, 56663366760 

${n \brack 6}$ [M5114]

1, 28, 546, 9450, 157773, 2637558, 44990231, 790943153, 14409322928, 272803210680 

${n \brack 7}$ [M5202]

2, 11, 35, 85, 175, 322, 546, 870, 1320, 1925, 2717, 3731, 5005, 6580, 8500, 10812, 13566 

${n \brack n-2}$ [M1998]

6, 50, 225, 735, 1960, 4536, 9450, 18150, 32670, 55770, 91091, 143325, 218400, 323680 

${n \brack n-3}$ [M4258]

24, 274, 1624, 6769, 22449, 63273, 157773, 357423, 749463, 1474473, 2749747, 4899622 

${n \brack n-4}$ [M5155]


Stirling Subset Numbers/Stirling Numbers of the Second Kind (§2.5.2) 

1, 6, 25, 90, 301, 966, 3025, 9330, 28501, 86526, 261625, 788970, 2375101, 7141686 

${n \brace 3}$ [M4167]

1, 10, 65, 350, 1701, 7770, 34105, 145750, 611501, 2532530, 10391745, 42355950, 171798901 

${n \brace 4}$ [M4722]

1, 15, 140, 1050, 6951, 42525, 246730, 1379400, 7508501, 40075035, 210766920, 1096190550

${n \brace 5}$ [M4981]

1, 21, 266, 2646, 22827, 179487, 1323652, 9321312, 63436373, 420693273, 2734926558 

${n \brace 6}$ [M5112]

1, 28, 462, 5880, 63987, 627396, 5715424, 49329280, 408741333, 3281882604, 25708104786 

${n \brace 7}$ [M5201]

1, 7, 25, 65, 140, 266, 462, 750, 1155, 1705, 2431, 3367, 4550, 6020, 7820, 9996, 12597, 15675 

${n \brace n-2}$ [M4385]

1, 15, 90, 350, 1050, 2646, 5880, 11880, 22275, 39325, 66066, 106470, 165620, 249900 

${n \brace n-3}$ [M4974]

1, 31, 301, 1701, 6951, 22827, 63987, 159027, 359502, 752752, 1479478, 2757118, 4910178 

${n \brace n-4}$ [M5222]

1, 1, 3, 7, 25, 90, 350, 1701, 7770, 42525, 246730, 1379400, 9321312, 63436373, 420693273 

$\max_k {n \brace k}$ [M2690]

Associated Stirling Numbers of the First Kind (§3.1.8) 

3, 20, 130, 924, 7308, 64224, 623376, 6636960, 76998240, 967524480, 13096736640 

d(n, 2) [M3075] 

15, 210, 2380, 26432, 303660, 3678840, 47324376, 647536032, 9418945536, 145410580224 

d(n, 3) [M4988] 

2, 20, 210, 2520, 34650, 540540, 9459450, 183783600, 3928374450, 91662070500 

d(n, n — 3) [M2124] 

6, 130, 2380, 44100, 866250, 18288270, 416215800, 10199989800, 268438920750 

d(n, n - 4) [M4298] 

1, 120, 7308, 303660, 11098780, 389449060, 13642629000, 486591585480, 17856935296200 

d(2n, n — 2) [M5382] 

1, 24, 924, 26432, 705320, 18858840, 520059540, 14980405440, 453247114320 

d(2n + 1, n — 1) [M5169] 

Associated Stirling Numbers of the Second Kind (§3.1.8) 

3, 10, 25, 56, 119, 246, 501, 1012, 2035, 4082, 8177, 16368, 32751, 65518, 131053, 262124 

b(n, 2) [M2836] 

15, 105, 490, 1918, 6825, 22935, 74316, 235092, 731731, 2252341, 6879678, 20900922 

b(n, 3) [M4978] 

1, 25, 490, 9450, 190575, 4099095, 94594500, 2343240900 

b(2n, n - 1) [M5186]

1, 56, 1918, 56980, 1636635, 47507460, 1422280860


b(2n + 1, n — 1) [M5315] 


Lah Coefficients (§3.1.8) 

1, 6, 36, 240, 1800, 15120, 141120, 1451520, 16329600, 199584000, 2634508800 

L(n, 2) [M4225] 

1, 12, 120, 1200, 12600, 141120, 1693440, 21772800, 299376000, 4390848000, 68497228800 

L(n, 3) [M4863] 

1, 20, 300, 4200, 58800, 846720, 12700800, 199584000, 3293136000, 57081024000 

L(n, 4) [M5096] 

1, 30, 630, 11760, 211680, 3810240, 69854400, 1317254400, 25686460800, 519437318400 

L(n, 5) [M5213] 

1, 42, 1176, 28224, 635040, 13970880, 307359360, 6849722880, 155831195520 

L(n, 6) [M5279] 


Eulerian Numbers (§3.1.5) 

1, 4, 11, 26, 57, 120, 247, 502, 1013, 2036, 4083, 8178, 16369, 32752, 65519, 131054, 262125 

E(n, 1) [M3416] 

1, 11, 66, 302, 1191, 4293, 14608, 47840, 152637, 478271, 1479726, 4537314, 13824739 

E(n, 2) [M4795] 

1, 26, 302, 2416, 15619, 88234, 455192, 2203488, 10187685, 45533450, 198410786 

E(n, 3) [M5188] 

1, 57, 1191, 15619, 156190, 1310354, 9738114, 66318474, 423281535, 2571742175 

E(n, 4) [M5317] 

1, 120, 4293, 88234, 1310354, 15724248, 162512286, 1505621508, 12843262863 

E(n, 5) [M5379] 

1, 247, 14608, 455192, 9738114, 162512286, 2275172004, 27971176092, 311387598411 

E(n, 6) [M5422] 

1, 502, 47840, 2203488, 66318474, 1505621508, 27971176092, 447538817472 

E(n, 7) [M5457] 


Other Special Sequences (§3.1) 

1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181, 6765, 10946, 17711

Fibonacci numbers, n ≥ 1 [M0692]

1, 3, 4, 7, 11, 18, 29, 47, 76, 123, 199, 322, 521, 843, 1364, 2207, 3571, 5778, 9349, 15127 

Lucas numbers, n ≥ 1 [M2341]

1, 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, 208012, 742900, 2674440, 9694845

Catalan numbers, n ≥ 0 [M1459]


1, 3, 11, 25, 137, 49, 363, 761, 7129, 7381, 83711, 86021, 1145993, 1171733, 1195757 

numerators of harmonic numbers, n ≥ 1 [M2885]

1, 2, 6, 12, 60, 20, 140, 280, 2520, 2520, 27720, 27720, 360360, 360360, 360360, 720720 

denominators of harmonic numbers, n ≥ 1 [M1589]

1, 1, 1, 1, 1, 5, 691, 7, 3617, 43867, 174611, 854513, 236364091, 8553103, 23749461029 

numerators of Bernoulli numbers $|B_{2n}|$, n ≥ 0 [M4039]


1, 6, 30, 42, 30, 66, 2730, 6, 510, 798, 330, 138, 2730, 6, 870, 14322, 510, 6, 1919190, 6, 13530 

denominators of Bernoulli numbers $|B_{2n}|$, n ≥ 0 [M4189]


1, 1, 5, 61, 1385, 50521, 2702765, 199360981, 19391512145, 2404879675441 

Euler numbers $|E_{2n}|$, n ≥ 0 [M4019]


1, 2, 16, 272, 7936, 353792, 22368256, 1903757312, 209865342976, 29088885112832 

tangent numbers $T_{2n+1}$, n ≥ 0 [M2096]


1, 1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975, 678570, 4213597, 27644437, 190899322 

Bell numbers, n ≥ 0 [M1484]


Numbers of Certain Algebraic Structures (§1.4, §5.2) 

1, 1, 1, 2, 1, 1, 1, 3, 2, 1, 1, 2, 1, 1, 1, 5, 1, 2, 1, 2, 1, 1, 1, 3, 2, 1, 3, 2, 1, 1, 1, 7, 1, 1, 1, 4, 1, 1, 1, 3

abelian groups of order n [M0064] 

1,1,1,2,1,2,1,5,2,2,1,5,1,2,1,14,1,5,1,5,2,2,1,15,2,2,5,4,1,4,1,51,1,2,1,14,1,2 

groups of order n [M0098] 

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 60, 61, 67, 71, 73, 79, 83, 89, 97, 101 

orders of simple groups [M0651] 

60, 168, 360, 504, 660, 1092, 2448, 2520, 3420, 4080, 5616, 6048, 6072, 7800, 7920, 9828 

orders of noncyclic simple groups [M5318] 

1, 1, 2, 5, 16, 63, 318, 2045, 16999, 183231, 2567284, 46749427, 1104891746, 33823827452 

partially ordered sets on n elements [M1495] 

1, 2, 13, 171, 3994, 154303, 9415189, 878222530 

transitive relations on n elements [M2065] 

1, 5, 52, 1522, 145984, 48464496, 56141454464, 229148550030864, 3333310786076963968 

relations on n unlabeled points [M4010] 


1, 2, 1, 2, 3, 6, 9, 18, 30, 56, 99, 186, 335, 630, 1161, 2182, 4080, 7710, 14532, 27594, 52377 

binary irreducible polynomials of degree n [M0116] 

Permutations (§5.3.1)

by cycles 

1,1,1, 3, 15, 75, 435, 3045, 24465, 220185, 2200905, 24209955, 290529855, 3776888115 

no 2-cycles [M2991] 

1,1,2, 4, 16, 80, 520, 3640, 29120, 259840, 2598400, 28582400, 343235200, 4462057600 

no 3-cycles [M1295] 

1,1,2, 6, 18, 90, 540, 3780, 31500, 283500, 2835000, 31185000, 372972600, 4848643800 

no 4-cycles [M1635] 

0, 1, 1, 3, 9, 45, 225, 1575, 11025, 99225, 893025, 9823275, 108056025, 1404728325 

no even length cycles [M2824] 


discordant (§2.4.2, §3.1.8) 

1, 0, 1, 2, 9, 44, 265, 1854, 14833, 133496, 1334961, 14684570, 176214841, 2290792932

derangements, discordant for $\iota$ [M1937]

1, 1, 0, 1, 2, 13, 80, 579, 4738, 43387, 439792, 4890741, 59216642, 775596313, 10927434464

menage numbers, discordant for $\iota$ and $\sigma_n$ [M2062]

0, 1, 2, 20, 144, 1265, 12072, 126565, 1445100, 17875140, 238282730, 3407118041

discordant for $\iota$, $\sigma_n$, $\sigma_n^2$ [M2121]


by order 

1, 2, 3, 4, 6, 6, 12, 15, 20, 30, 30, 60, 60, 84, 105, 140, 210, 210, 420, 420, 420, 420, 840, 840 

max order [M0537] 

1, 2, 3, 4, 6, 12, 15, 20, 30, 60, 84, 105, 140, 210, 420, 840, 1260, 1540, 2310, 2520, 4620, 5460 

max order [M0577] 

1, 2, 4, 16, 56, 256, 1072, 11264, 78976, 672256, 4653056, 49810432, 433429504, 4448608256 

order a power of 2 [M1293] 

0, 1, 3, 9, 25, 75, 231, 763, 2619, 9495, 35695, 140151, 568503, 2390479, 10349535, 46206735 

order 2 [M2801] 

0, 0, 2, 8, 20, 80, 350, 1232, 5768, 31040, 142010, 776600, 4874012, 27027728, 168369110 

order 3 [M1833] 

0, 0, 0, 6, 30, 180, 840, 5460, 30996, 209160, 1290960, 9753480, 69618120, 571627056 

order 4 [M4206] 


0, 0, 1, 3, 6, 10, 30, 126, 448, 1296, 4140, 17380, 76296, 296088, 1126216, 4940040, 23904000 

odd, order 2 [M2538] 


Necklaces (§2.6) 

1, 2, 3, 4, 6, 8, 14, 20, 36, 60, 108, 188, 352, 632, 1182, 2192, 4116, 7712, 14602, 27596, 52488 

2 colors, n beads [M0564] 

1, 3, 6, 11, 24, 51, 130, 315, 834, 2195, 5934, 16107, 44368, 122643, 341802, 956635, 2690844

3 colors, n beads [M2548] 


1, 4, 10, 24, 70, 208, 700, 2344, 8230, 29144, 104968, 381304, 1398500, 5162224, 19175140 

4 colors, n beads [M3390] 

1, 5, 15, 45, 165, 629, 2635, 11165, 48915, 217045, 976887, 4438925, 20346485, 93900245 

5 colors, n beads [M3860] 

Number Theory (§4.2, §4.3) 

2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97, 101, 103 

primes [M0652] 

0,1,2,2,3,3,4,4,4,4,5,5,6,6,6,6,7,7,8,8,8,8,9,9,9,9,9,9,10,10,11,11,11,11,11,11 

number of primes ≤ n [M0256]

1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 3, 1, 1, 2, 2, 2, 2, 1, 2, 2, 2

number of distinct primes dividing n [M0056] 

2, 3, 5, 7, 13, 17, 19, 31, 61, 89, 107, 127, 521, 607, 1279, 2203, 2281, 3217, 4253, 4423, 9689 

Mersenne primes [M0672] 

1, 1, 1, 2, 1, 2, 1, 3, 2, 2, 1, 4, 1, 2, 2, 5, 1, 4, 1, 4, 2, 2, 1, 7, 2, 2, 3, 4, 1, 5, 1, 7, 2, 2, 2, 9, 1, 2, 2, 7

number of ways of factoring n [M0095] 

1,1,2, 2, 4, 2, 6, 4, 6, 4, 10, 4, 12, 6, 8, 8, 16, 6, 18, 8, 12, 10, 22, 8, 20, 12, 18, 12, 28, 8, 30, 16 

Euler totient function [M0299] 

561, 1105, 1729, 2465, 2821, 6601, 8911, 10585, 15841, 29341, 41041, 46657, 52633, 62745 

Carmichael numbers [M5462] 

1, 2, 2, 3, 2, 4, 2, 4, 3, 4, 2, 6, 2, 4, 4, 5, 2, 6, 2, 6, 4, 4, 2, 8, 3, 4, 4, 6, 2, 8, 2, 6, 4, 4, 4, 9, 2, 4, 4, 8

number of divisors of n [M0246] 

1, 3, 4, 7, 6, 12, 8, 15, 13, 18, 12, 28, 14, 24, 24, 31, 18, 39, 20, 42, 32, 36, 24, 60, 31, 42, 40, 56 

sum of divisors of n [M2329] 

6, 28, 496, 8128, 33550336, 8589869056, 137438691328, 2305843008139952128 

perfect numbers [M4186] 

Partitions (§2.5.1) 

1, 1, 2, 3, 5, 7, 11, 15, 22, 30, 42, 56, 77, 101, 135, 176, 231, 297, 385, 490, 627, 792, 1002, 1255 

partitions of n [M0663] 

1,1,1, 2, 2, 3, 4, 5, 6, 8, 10, 12, 15, 18, 22, 27, 32, 38, 46, 54, 64, 76, 89, 104, 122, 142, 165, 192 

partitions of n into distinct parts [M0281] 

1, 3, 6, 13, 24, 48, 86, 160, 282, 500, 859, 1479, 2485, 4167, 6879, 11297, 18334, 29601, 47330 

planar partitions of n [M2566] 

Figurate Numbers (§3.1.8)
polygonal 

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, 66, 78, 91, 105, 120, 136, 153, 171, 190, 210, 231, 253, 276 

triangular [M2535] 


1, 5, 12, 22, 35, 51, 70, 92, 117, 145, 176, 210, 247, 287, 330, 376, 425, 477, 532, 590, 651, 715 

pentagonal [M3818] 


1, 6, 15, 28, 45, 66, 91, 120, 153, 190, 231, 276, 325, 378, 435, 496, 561, 630, 703, 780, 861, 946 

hexagonal [M4108] 


1, 7, 18, 34, 55, 81, 112, 148, 189, 235, 286, 342, 403, 469, 540, 616, 697, 783, 874, 970, 1071 

heptagonal [M4358] 


1, 8, 21, 40, 65, 96, 133, 176, 225, 280, 341, 408, 481, 560, 645, 736, 833, 936, 1045, 1160, 1281 

octagonal [M4493] 


pyramidal 

1, 4, 10, 20, 35, 56, 84, 120, 165, 220, 286, 364, 455, 560, 680, 816, 969, 1140, 1330, 1540, 1771 

3-dimensional triangular, height n [M3382] 

1, 5, 14, 30, 55, 91, 140, 204, 285, 385, 506, 650, 819, 1015, 1240, 1496, 1785, 2109, 2470, 2870 

3-dimensional square, height n [M3844]

1, 6, 18, 40, 75, 126, 196, 288, 405, 550, 726, 936, 1183, 1470, 1800, 2176, 2601, 3078, 3610 

3-dimensional pentagonal, height n [M4116] 

1, 7, 22, 50, 95, 161, 252, 372, 525, 715, 946, 1222, 1547, 1925, 2360, 2856, 3417, 4047, 4750 

3-dimensional hexagonal, height n [M4374]

1, 8, 26, 60, 115, 196, 308, 456, 645, 880, 1166, 1508, 1911, 2380, 2920, 3536, 4233, 5016, 5890 

3-dimensional heptagonal, height n [M4498]

1, 5, 15, 35, 70, 126, 210, 330, 495, 715, 1001, 1365, 1820, 2380, 3060, 3876, 4845, 5985, 7315 

4-dimensional triangular, height n [M3853]

1, 6, 20, 50, 105, 196, 336, 540, 825, 1210, 1716, 2366, 3185, 4200, 5440, 6936, 8721, 10830 

4-dimensional square, height n [M4135]

1, 7, 25, 65, 140, 266, 462, 750, 1155, 1705, 2431, 3367, 4550, 6020, 7820, 9996, 12597, 15675 

4-dimensional pentagonal, height n [M4385]

1, 8, 30, 80, 175, 336, 588, 960, 1485, 2200, 3146, 4368, 5915, 7840, 10200, 13056, 16473 

4-dimensional hexagonal, height n [M4506] 

1, 9, 35, 95, 210, 406, 714, 1170, 1815, 2695, 3861, 5369, 7280, 9660, 12580, 16116, 20349 

4-dimensional heptagonal, height n [M4617] 

Polyominoes (§3.1.8)

1,1,2, 5, 12, 35, 108, 369, 1285, 4655, 17073, 63600, 238591, 901971, 3426576, 13079255 

squares, n cells [M1425] 

1,1,1, 3, 4, 12, 24, 66, 160, 448, 1186, 3334, 9235, 26166, 73983, 211297 

triangles, n cells [M2374] 

1, 1, 3, 7, 22, 82, 333, 1448, 6572, 30490, 143552, 683101 

hexagons, n cells [M2682] 

1, 1, 2, 8, 29, 166, 1023, 6922, 48311, 346543, 2522572, 18598427 

cubes, n cells [M1845] 


Trees (§9.3) 

1,1,1, 2, 3, 6, 11, 23, 47, 106, 235, 551, 1301, 3159, 7741, 19320, 48629, 123867, 317955 

n unlabeled vertices [M0791] 

1,1,2, 4, 9, 20, 48, 115, 286, 719, 1842, 4766, 12486, 32973, 87811, 235381, 634847, 1721159 

rooted, n unlabeled vertices [M1180]

1,1,3, 16, 125, 1296, 16807, 262144, 4782969, 100000000, 2357947691, 61917364224 

n labeled vertices [M3027] 

1, 2, 9, 64, 625, 7776, 117649, 2097152, 43046721, 1000000000, 25937424601, 743008370688 

rooted, n labeled vertices [M1946] 


by diameter 


1, 2, 5, 8, 14, 21, 32, 45, 65, 88, 121, 161, 215, 280, 367, 471, 607, 771, 980, 1232, 1551, 1933

diameter 4, n ≥ 5 vertices [M1350]


1, 2, 7, 14, 32, 58, 110, 187, 322, 519, 839, 1302, 2015, 3032, 4542, 6668, 9738, 14006, 20036 

diameter 5, n ≥ 6 vertices [M1741]

1, 3, 11, 29, 74, 167, 367, 755, 1515, 2931, 5551, 10263, 18677, 33409, 59024, 102984, 177915 

diameter 6, n ≥ 7 vertices [M2887]


1, 3, 14, 42, 128, 334, 850, 2010, 4625, 10201, 21990, 46108, 94912, 191562, 380933, 746338 

diameter 7, n ≥ 8 vertices [M2969]

1, 4, 19, 66, 219, 645, 1813, 4802, 12265, 30198, 72396, 169231, 387707, 871989, 1930868 

diameter 8, n ≥ 9 vertices [M3552]


by height 

1, 3, 8, 18, 38, 76, 147, 277, 509, 924, 1648, 2912, 5088, 8823, 15170, 25935, 44042, 74427 

height 3, n ≥ 4 vertices [M2732]

1, 4, 13, 36, 93, 225, 528, 1198, 2666, 5815, 12517, 26587, 55933, 116564, 241151, 495417 

height 4, n ≥ 5 vertices [M3461]


series-reduced

1, 1, 0, 1, 1, 2, 2, 4, 5, 10, 14, 26, 42, 78, 132, 249, 445, 842, 1561, 2988, 5671, 10981, 21209

n vertices [M0320] 

1, 1, 0, 2, 4, 6, 12, 20, 39, 71, 137, 261, 511, 995, 1974, 3915, 7841, 15749, 31835, 64540

rooted, n vertices [M0327] 


0, 1, 0, 1, 1, 2, 3, 6, 10, 19, 35, 67, 127, 248, 482, 952, 1885, 3765, 7546, 15221, 30802, 62620 

planted, n vertices [M0768] 


Graphs (§8.1, §8.3, §8.4, §8.9) 

1, 2, 4, 11, 34, 156, 1044, 12346, 274668, 12005168, 1018997864, 165091172592 

n vertices [M1253] 


chromatic number 

4, 6, 7, 7, 8, 9, 9, 10, 10, 10, 11, 11, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 15, 15, 15, 15, 16, 16 

surface, connectivity n ≥ 1 [M3265]


4, 7, 8, 9, 10, 11, 12, 12, 13, 13, 14, 15, 15, 16, 16, 16, 17, 17, 18, 18, 19, 19, 19, 20, 20, 20, 21 

surface, genus n ≥ 0 [M3292]


genus 

0, 0, 0, 0, 1, 1, 1, 2, 3, 4, 5, 6, 8, 10, 11, 13, 16, 18, 20, 23, 26, 29, 32, 35, 39, 43, 46, 50, 55, 59, 63 

complete graphs, n vertices [M0503] 


connected 

1, 1, 2, 6, 21, 112, 853, 11117, 261080, 11716571, 1006700565, 164059830476 

n vertices [M1657] 

1,1,0, 2, 5, 32, 234, 3638, 106147, 6039504, 633754161, 120131932774, 41036773627286 

series-reduced, n vertices [M1548] 

1, 1, 3, 5, 12, 30, 79, 227, 710, 2322, 8071, 29503, 112822, 450141 

n edges [M2486] 

1, 1, 4, 38, 728, 26704, 1866256, 251548592, 66296291072, 34496488594816 

n labeled vertices [M3671] 


directed 

1, 3, 16, 218, 9608, 1540944, 882033440, 1793359192848, 13027956824399552 

n vertices [M3032] 


1,3,9,33,139,718,4535 


transitive, n vertices [M2817] 


1,1,2, 4, 12, 56, 456, 6880, 191536, 9733056, 903753248, 154108311168, 48542114686912 

tournaments, n vertices [M1262] 

1, 4, 29, 355, 6942, 209527, 9535241, 642779354, 63260289423, 8977053873043

transitive, n labeled vertices [M3631] 


various 

1, 2, 2, 4, 3, 8, 4, 14, 9, 22, 8, 74, 14, 56, 48, 286, 36, 380, 60, 1214, 240, 816, 188, 15506, 464 

transitive, n vertices [M0302] 

1, 1, 2, 3, 7, 16, 54, 243, 2038, 33120, 1182004, 87723296, 12886193064, 3633057074584 

all degrees even, n vertices [M0846] 

1, 0, 1, 1, 4, 8, 37, 184, 1782, 31026, 1148626, 86539128, 12798435868, 3620169692289 

Eulerian, n vertices [M3344] 


1, 0, 1, 3, 8, 48, 383, 6020

Hamiltonian, n vertices [M2764]

1, 2, 2, 4, 3, 8, 6, 22, 26, 176

regular, n vertices [M0303]


0, 1, 1, 3, 10, 56, 468, 7123, 194066, 9743542, 900969091, 153620333545, 48432939150704 

nonseparable, n vertices [M2873] 


1,2,4,11,33,142,822,6910 


planar, n vertices [M1252] 


3.2 GENERATING FUNCTIONS 

Generating functions express an infinite sequence as coefficients arising from a power 
series in an auxiliary variable. The closed form of a generating function is a concise way 
to represent such an infinite sequence. Properties of the sequence can be explored by 
analyzing the closed form of an associated generating function. Two types of generating 
functions are discussed in this section — ordinary generating functions and exponential 
generating functions. The former arise when counting configurations in which order is 
not important, while the latter are appropriate when order matters. 


3.2.1 ORDINARY GENERATING FUNCTIONS 
Definitions: 

The (ordinary) generating function for the sequence $a_0, a_1, a_2, \ldots$ of real numbers
is the formal power series $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots = \sum_{i=0}^{\infty} a_i x^i$, or any
equivalent closed form expression.

The convolution of the sequence $a_0, a_1, a_2, \ldots$ and the sequence $b_0, b_1, b_2, \ldots$ is the
sequence $c_0, c_1, c_2, \ldots$ in which
$c_t = a_0 b_t + a_1 b_{t-1} + a_2 b_{t-2} + \cdots + a_t b_0 = \sum_{k=0}^{t} a_k b_{t-k}$.
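
The convolution defined above can be computed directly from its summation formula. The following is a minimal Python sketch, assuming each sequence is supplied as a finite list of initial terms; the function name `convolve` is illustrative and does not come from the Handbook.

```python
def convolve(a, b):
    """Return [c_0, c_1, ...] with c_t = sum_{k=0}^{t} a[k] * b[t-k].

    Only terms t < min(len(a), len(b)) are returned, since later terms
    would need coefficients that the truncated inputs do not contain.
    """
    n = min(len(a), len(b))
    return [sum(a[k] * b[t - k] for k in range(t + 1)) for t in range(n)]


ones = [1] * 6
print(convolve(ones, ones))  # [1, 2, 3, 4, 5, 6]
```

Convolving the all-ones sequence with itself gives 1, 2, 3, ..., which agrees with squaring the series 1 + x + x^2 + ... term by term.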

Facts:

1. Generating functions are considered as algebraic forms and can be manipulated as 
such, without regard to actual convergence of the power series. 

2. A rational form (the ratio of two polynomials) is a concise expression for the generat- 
ing function of the sequence obtained by carrying out long division on the polynomials. 
(See Example 1.) 

3. Generating functions are often useful for constructing and verifying identities in- 
volving binomial coefficients and other special sequences. (See Example 10.) 

4. Generating functions can be used to derive formulas for the sums of powers of 
integers. (See Example 17.) 

5. Generating functions can be used to solve recurrence relations. (See §3.3.4.) 

6. Each sequence $\{a_n\}$ defines a unique generating function $f(x)$, and conversely.

7. Related generating functions: Suppose $f(x) = \sum_{k=0}^{\infty} a_k x^k$ and
$g(x) = \sum_{k=0}^{\infty} b_k x^k$ are generating functions for the sequences $a_0, a_1, a_2, \ldots$
and $b_0, b_1, b_2, \ldots$, respectively. Table 1 gives some related generating functions.

Table 1 Related generating functions.

generating function                    sequence

$x^n f(x)$                             $0, 0, \ldots, 0, a_0, a_1, a_2, \ldots$  (n leading zeros)

$f(x) - a_n x^n$                       $a_0, a_1, \ldots, a_{n-1}, 0, a_{n+1}, \ldots$

$a_0 + a_1 x + \cdots + a_n x^n$       $a_0, a_1, \ldots, a_n, 0, 0, \ldots$

$f(x^2)$                               $a_0, 0, a_1, 0, a_2, 0, a_3, \ldots$

$\frac{f(x) - a_0}{x}$                 $a_1, a_2, a_3, \ldots$

$f'(x)$                                $a_1, 2a_2, 3a_3, \ldots, k a_k, \ldots$

$\int_0^x f(t)\,dt$                    $0, a_0, \frac{a_1}{2}, \frac{a_2}{3}, \ldots, \frac{a_k}{k+1}, \ldots$

$\frac{f(x)}{1-x}$                     $a_0, a_0 + a_1, a_0 + a_1 + a_2, \ldots$

$r f(x) + s g(x)$                      $ra_0 + sb_0, ra_1 + sb_1, ra_2 + sb_2, \ldots$

$f(x) g(x)$                            $a_0 b_0, a_0 b_1 + a_1 b_0, a_0 b_2 + a_1 b_1 + a_2 b_0, \ldots$
                                       (convolution of $\{a_n\}$ and $\{b_n\}$)
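
Several rows of Table 1 correspond to simple operations on a list of initial coefficients. The sketch below is illustrative only: it assumes a sequence is stored as a truncated Python list `[a_0, a_1, ...]`, and the helper names are hypothetical.

```python
from itertools import accumulate


def shift(a, n):
    """Coefficients of x^n f(x): n leading zeros, then a_0, a_1, ..."""
    return [0] * n + list(a)


def derivative(a):
    """Coefficients of f'(x): a_1, 2 a_2, 3 a_3, ..."""
    return [k * a[k] for k in range(1, len(a))]


def partial_sums(a):
    """Coefficients of f(x) / (1 - x): a_0, a_0 + a_1, a_0 + a_1 + a_2, ..."""
    return list(accumulate(a))


squares = [k * k for k in range(6)]
print(partial_sums(squares))  # [0, 1, 5, 14, 30, 55]
```

The partial sums of the squares reproduce the initial terms 1, 5, 14, 30, 55 listed for the sums of squares in the miniguide of §3.1.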


Examples: 

1. The sequence 0, 1, 4, 9, 16, ... of squares of the nonnegative integers has the
generating function $0 + x + 4x^2 + 9x^3 + 16x^4 + \cdots$. However, this generating function has a
concise closed form expression, namely $\frac{x + x^2}{1 - 3x + 3x^2 - x^3}$. Verification is obtained by
carrying out long division on the indicated polynomials. This concise form can be used to
deduce properties involving the sequence, such as an explicit algebraic expression for
the sum of squares of the first n positive integers. (See Example 17.)
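
The long division mentioned in Fact 2 and Example 1 can be carried out mechanically. The following Python sketch assumes polynomials are given as coefficient lists in increasing powers of x and that the denominator has a nonzero constant term; the function name `series` is an illustrative choice.

```python
from fractions import Fraction


def series(num, den, terms):
    """First `terms` power series coefficients of num(x) / den(x).

    Uses the recurrence obtained from den(x) * c(x) = num(x):
    c_t = (num_t - sum_{j>=1} den_j * c_{t-j}) / den_0.
    """
    coeffs = []
    for t in range(terms):
        s = Fraction(num[t]) if t < len(num) else Fraction(0)
        for j in range(1, min(t, len(den) - 1) + 1):
            s -= den[j] * coeffs[t - j]
        coeffs.append(s / den[0])
    return coeffs


# (x + x^2) / (1 - 3x + 3x^2 - x^3)
print([int(c) for c in series([0, 1, 1], [1, -3, 3, -1], 6)])  # [0, 1, 4, 9, 16, 25]
```

Exact rational arithmetic is used so that denominators with a constant term other than 1 are handled without rounding.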

2. The generating function for the sequence 1, 1, 1, 1, 1, ... is
$1 + x + x^2 + x^3 + x^4 + \cdots = \frac{1}{1-x}$. Differentiating both sides of this expression
produces $1 + 2x + 3x^2 + 4x^3 + \cdots = \frac{1}{(1-x)^2}$. Thus, $\frac{1}{(1-x)^2}$ is a closed form
expression for the generating function of the sequence 1, 2, 3, 4, .... (See Table 2.)

Table 2 Generating functions for particular sequences.

sequence                                              closed form

$1, 1, 1, 1, 1, \ldots$                               $\frac{1}{1-x}$

$1, 1, \ldots, 1, 0, 0, \ldots$  (n 1s)               $\frac{1 - x^n}{1-x}$

$1, 1, \ldots, 1, 1, 0, 1, 1, \ldots$                 $\frac{1}{1-x} - x^n$
(0 following n 1s)

$1, -1, 1, -1, 1, -1, \ldots$                         $\frac{1}{1+x}$

$1, 0, 1, 0, 1, \ldots$                               $\frac{1}{1-x^2}$

$1, 2, 3, 4, 5, \ldots$                               $\frac{1}{(1-x)^2}$

$1, 4, 9, 16, 25, \ldots$                             $\frac{1+x}{(1-x)^3}$

$1, r, r^2, r^3, r^4, \ldots$                         $\frac{1}{1-rx}$

$0, r, 2r^2, 3r^3, 4r^4, \ldots$                      $\frac{rx}{(1-rx)^2}$

$0, 1, \frac12, \frac13, \frac14, \frac15, \ldots$    $\ln \frac{1}{1-x}$

$\frac{1}{0!}, \frac{1}{1!}, \frac{1}{2!}, \frac{1}{3!}, \frac{1}{4!}, \ldots$    $e^x$

$0, 1, -\frac12, \frac13, -\frac14, \frac15, \ldots$  $\ln(1+x)$

$F_0, F_1, F_2, F_3, F_4, \ldots$                     $\frac{x}{1-x-x^2}$

$L_0, L_1, L_2, L_3, L_4, \ldots$                     $\frac{2-x}{1-x-x^2}$

$C_0, C_1, C_2, C_3, C_4, \ldots$                     $\frac{1 - \sqrt{1-4x}}{2x}$

$H_0, H_1, H_2, H_3, H_4, \ldots$                     $\frac{1}{1-x} \ln \frac{1}{1-x}$


3. Table 2 gives closed form expressions for the generating functions of particular se- 
quences. In this table, r is an arbitrary real number, F n is the nth Fibonacci num- 
ber (§3.1.2), L n is the nth Lucas number (§3.1.2), C n is the nth Catalan number (§3.1.3), 
and H n is the nth harmonic number (§3.1.7). 


4. For every positive integer n, the binomial theorem (§2.3.4) states that

$(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n = \sum_{k=0}^{n} \binom{n}{k} x^k$,

so $(1+x)^n$ is a closed form for the generating function of
$\binom{n}{0}, \binom{n}{1}, \binom{n}{2}, \ldots, \binom{n}{n}, 0, 0, 0, \ldots$.


5. For every positive integer n, the Maclaurin series expansion for $(1+x)^{-n}$ is

$(1+x)^{-n} = 1 + (-n)x + \frac{(-n)(-n-1)}{2!}x^2 + \cdots = 1 + \sum_{k=1}^{\infty} \frac{(-n)(-n-1)(-n-2)\cdots(-n-k+1)}{k!} x^k$.

Consequently, $(1+x)^{-n}$ is the generating function for the sequence
$\binom{-n}{0}, \binom{-n}{1}, \binom{-n}{2}, \ldots$, where $\binom{-n}{k}$ is an extended binomial coefficient (§2.3.2).
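
Table 3 Examples of binomial-type generating functions.

generating function        expansion

$(1+x)^n$                  $\binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n = \sum_{k=0}^{n} \binom{n}{k} x^k$

$(1+rx)^n$                 $\binom{n}{0} + \binom{n}{1}rx + \binom{n}{2}r^2x^2 + \cdots + \binom{n}{n}r^n x^n = \sum_{k=0}^{n} \binom{n}{k} r^k x^k$

$(1+x^m)^n$                $\binom{n}{0} + \binom{n}{1}x^m + \binom{n}{2}x^{2m} + \cdots + \binom{n}{n}x^{nm} = \sum_{k=0}^{n} \binom{n}{k} x^{km}$

$(1+x)^{-n}$               $\binom{-n}{0} + \binom{-n}{1}x + \binom{-n}{2}x^2 + \cdots = \sum_{k=0}^{\infty} (-1)^k \binom{n+k-1}{k} x^k$

$(1+rx)^{-n}$              $\binom{-n}{0} + \binom{-n}{1}rx + \binom{-n}{2}r^2x^2 + \cdots = \sum_{k=0}^{\infty} (-1)^k \binom{n+k-1}{k} r^k x^k$

$(1-x)^{-n}$               $\binom{-n}{0} + \binom{-n}{1}(-x) + \binom{-n}{2}(-x)^2 + \cdots = \sum_{k=0}^{\infty} \binom{n+k-1}{k} x^k$

$(1-rx)^{-n}$              $\binom{-n}{0} + \binom{-n}{1}(-rx) + \binom{-n}{2}(-rx)^2 + \cdots = \sum_{k=0}^{\infty} \binom{n+k-1}{k} r^k x^k$

$\frac{x^n}{(1-x)^{n+1}}$  $\binom{n}{n}x^n + \binom{n+1}{n}x^{n+1} + \binom{n+2}{n}x^{n+2} + \cdots = \sum_{k=n}^{\infty} \binom{k}{n} x^k$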


6. Using Example 5, the expansion of $f(x) = (1-3x)^{-8}$ is

$(1-3x)^{-8} = (1+y)^{-8} = \sum_{k=0}^{\infty} \binom{-8}{k} y^k = \sum_{k=0}^{\infty} \binom{-8}{k} (-3x)^k$.

So the coefficient of $x^4$ in $f(x)$ is $\binom{-8}{4}(-3)^4 = (-1)^4 \binom{8+4-1}{4}(81) = \binom{11}{4}(81) = 26{,}730$.

7. Table 3 gives additional examples of generating functions related to binomial ex- 
pansions. In this table, m and n are positive integers, and r is any real number. 

8. For any real number r, the Maclaurin series expansion for $(1+x)^r$ is

$(1+x)^r = \binom{r}{0} + \binom{r}{1}x + \binom{r}{2}x^2 + \cdots$,

where $\binom{r}{k} = \frac{r(r-1)(r-2)\cdots(r-k+1)}{k!}$ if $k > 0$ and $\binom{r}{0} = 1$.

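9. Using Example 8, the expansion of $f(x) = \sqrt{1+x}$ is

$\sqrt{1+x} = (1+x)^{1/2} = \binom{1/2}{0} + \binom{1/2}{1}x + \binom{1/2}{2}x^2 + \cdots$

$= 1 + \frac{1}{2}x + \frac{\frac12 \cdot \left(-\frac12\right)}{2!}x^2 + \frac{\frac12 \cdot \left(-\frac12\right) \cdot \left(-\frac32\right)}{3!}x^3 + \frac{\frac12 \cdot \left(-\frac12\right) \cdot \left(-\frac32\right) \cdot \left(-\frac52\right)}{4!}x^4 + \cdots$

$= 1 + \frac{1}{2}x - \frac{1}{8}x^2 + \frac{1}{16}x^3 - \frac{5}{128}x^4 + \cdots$.

Thus $\sqrt{1+x}$ is the generating function for the sequence $1, \frac12, -\frac18, \frac{1}{16}, -\frac{5}{128}, \ldots$.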
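
The extended binomial coefficient of Example 8 is straightforward to evaluate exactly. The sketch below is a minimal illustration, assuming r is an integer or a `Fraction`; the function name `ext_binom` is hypothetical. It reproduces the coefficients of Example 9 and the coefficient found in Example 6.

```python
from fractions import Fraction


def ext_binom(r, k):
    """Extended binomial coefficient r(r-1)...(r-k+1)/k!, equal to 1 when k = 0."""
    result = Fraction(1)
    for i in range(k):
        # Multiply in the next factor (r - i) and divide by (i + 1).
        result = result * (Fraction(r) - i) / (i + 1)
    return result


# Example 9: coefficients of sqrt(1 + x)
print([str(ext_binom(Fraction(1, 2), k)) for k in range(5)])
# ['1', '1/2', '-1/8', '1/16', '-5/128']

# Example 6: coefficient of x^4 in (1 - 3x)^(-8)
print(ext_binom(-8, 4) * (-3) ** 4)  # 26730
```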

10. Vandermonde's convolution identity (§2.3.4) can be obtained from the generating
functions $f(x) = (1+x)^m$ and $g(x) = (1+x)^n$. First, $(1+x)^m (1+x)^n = (1+x)^{m+n}$.
Equating coefficients of $x^r$ on both sides of this equation and using Fact 7 produces

$\sum_{k=0}^{r} \binom{m}{k} \binom{n}{r-k} = \binom{m+n}{r}$.
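
Vandermonde's identity can be spot-checked numerically. This is only a sanity check over a small range, not a proof; the function name is an illustrative assumption, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def vandermonde_holds(m, n, r):
    """Check sum_{k=0}^{r} C(m, k) * C(n, r - k) == C(m + n, r)."""
    left = sum(comb(m, k) * comb(n, r - k) for k in range(r + 1))
    return left == comb(m + n, r)


print(all(vandermonde_holds(m, n, r)
          for m in range(8) for n in range(8) for r in range(m + n + 1)))  # True
```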


11. Twenty identical computer terminals are to be distributed into five distinct rooms
so each room receives at least two terminals. The number of such distributions is the
coefficient of $x^{20}$ in the expansion of
$f(x) = (x^2 + x^3 + x^4 + \cdots)^5 = x^{10}(1 + x + x^2 + \cdots)^5 = \frac{x^{10}}{(1-x)^5}$.
Thus the coefficient of $x^{20}$ in $f(x)$ is the coefficient of $x^{10}$ in $(1-x)^{-5}$, which
from Table 3 is $\binom{5+10-1}{10} = \binom{14}{10} = 1001$.

12. Suppose in Example 11 that each room can accommodate at most seven terminals.
Now the generating function is
$g(x) = (x^2 + x^3 + x^4 + x^5 + x^6 + x^7)^5 = x^{10}(1 + x + x^2 + x^3 + x^4 + x^5)^5 = x^{10}\left(\frac{1-x^6}{1-x}\right)^5$.
Consequently, the number of allowable distributions is the coefficient of $x^{10}$ in

$\left(\frac{1-x^6}{1-x}\right)^5 = (1-x^6)^5 (1-x)^{-5} = \left[1 - \binom{5}{1}x^6 + \binom{5}{2}x^{12} - \cdots - x^{30}\right] \left[\binom{-5}{0} + \binom{-5}{1}(-x) + \binom{-5}{2}(-x)^2 + \cdots\right]$.

This coefficient is $\binom{-5}{10}(-1)^{10} - \binom{5}{1}\binom{-5}{4}(-1)^4 = \binom{14}{10} - \binom{5}{1}\binom{8}{4} = 651$.
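
The counts in Examples 11 and 12 can be confirmed by expanding the generating functions directly. The sketch below assumes polynomials are stored as coefficient lists truncated at a maximum degree; the helper names `poly_mul` and `poly_pow` are hypothetical.

```python
def poly_mul(a, b, max_deg):
    """Product of two coefficient lists, keeping only degrees 0..max_deg."""
    c = [0] * (max_deg + 1)
    for i, ai in enumerate(a):
        if ai == 0:
            continue
        for j, bj in enumerate(b):
            if i + j > max_deg:
                break
            c[i + j] += ai * bj
    return c


def poly_pow(a, e, max_deg):
    """e-th power of a coefficient list, truncated at degree max_deg."""
    result = [1] + [0] * max_deg
    for _ in range(e):
        result = poly_mul(result, a, max_deg)
    return result


# Example 11: x^2 + x^3 + ... (truncated at x^20), raised to the 5th power
at_least_two = [0, 0] + [1] * 19
print(poly_pow(at_least_two, 5, 20)[20])  # 1001

# Example 12: x^2 + x^3 + ... + x^7, raised to the 5th power
two_to_seven = [0, 0, 1, 1, 1, 1, 1, 1]
print(poly_pow(two_to_seven, 5, 20)[20])  # 651
```

Truncating at degree 20 is safe because higher powers of x cannot contribute to the coefficient of $x^{20}$.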


13. Unordered selections with replacement: k objects are selected from n distinct
objects, with repetition allowed. For each of the n distinct objects, the power series
$1 + x + x^2 + \cdots$ represents the possible choices (namely none, one, two, ...) for that
object. The generating function for all n objects is then

$f(x) = (1 + x + x^2 + \cdots)^n = \left(\frac{1}{1-x}\right)^n = (1-x)^{-n} = \sum_{k=0}^{\infty} \binom{n+k-1}{k} x^k$.

The number of selections with replacement is the coefficient of $x^k$ in $f(x)$, namely
$\binom{n+k-1}{k}$.


14. Suppose there are p types of objects, with $n_i$ indistinguishable objects of type i.
The number of ways to pick a total of k objects (where the number of selected objects
of type i is at most $n_i$) is the coefficient of $x^k$ in the generating function

$\prod_{i=1}^{p} (1 + x + x^2 + \cdots + x^{n_i})$.


15. Partitions: Generating functions can be found for $p(n)$, the number of partitions of 
the positive integer $n$ (§2.5.1). The number of 1s that appear as summands in a partition 
of $n$ is 0 or 1 or 2 or ..., recorded as the terms in the power series $1 + x + x^2 + x^3 + \cdots$. 
The power series $1 + x^2 + x^4 + x^6 + \cdots$ records the number of 2s that can appear in a 
partition of $n$, and so forth. For example, $p(12)$ is the coefficient of $x^{12}$ in 

$$(1+x+x^2+\cdots)(1+x^2+x^4+\cdots)\cdots(1+x^{12}+x^{24}+\cdots) = \prod_{i=1}^{12}\frac{1}{1-x^i}$$

or in $(1+x+x^2+\cdots+x^{12})(1+x^2+x^4+\cdots+x^{12})\cdots(1+x^{12})$. In general, the function 
$P(x) = \prod_{i=1}^{\infty}\frac{1}{1-x^i}$ is the generating function for the sequence $p(0), p(1), p(2), \ldots$, where 
$p(0)$ is defined as 1. 
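The truncated product in Example 15 translates into a short computation: multiplying a coefficient list by $1/(1-x^i)$ amounts to adding, to each coefficient, the coefficient $i$ places earlier. The sketch below assumes the helper name `partition_counts`, which is not from the Handbook.

```python
def partition_counts(limit):
    """Return [p(0), ..., p(limit)] from the product of 1/(1 - x**i), i = 1..limit."""
    coeffs = [1] + [0] * limit  # the constant power series 1
    for part in range(1, limit + 1):
        # Multiply in place by 1 + x**part + x**(2*part) + ...
        for total in range(part, limit + 1):
            coeffs[total] += coeffs[total - part]
    return coeffs


p = partition_counts(12)
print(p[12])
```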

16. The function $P_d(x) = (1+x)(1+x^2)(1+x^3)\cdots = \prod_{i=1}^{\infty}(1+x^i)$ generates $Q(n)$, the 
number of partitions of $n$ into distinct summands (see §2.5.1). The function 
$P_o(x) = \frac{1}{1-x}\cdot\frac{1}{1-x^3}\cdot\frac{1}{1-x^5}\cdots = \prod_{j=0}^{\infty}(1-x^{2j+1})^{-1}$ 
is the generating function for $\mathcal{O}(n)$, the number 
of partitions of $n$ with all summands odd (see §2.5.1). Then 

$$P_d(x) = (1+x)(1+x^2)(1+x^3)(1+x^4)\cdots = \frac{1-x^2}{1-x}\cdot\frac{1-x^4}{1-x^2}\cdot\frac{1-x^6}{1-x^3}\cdot\frac{1-x^8}{1-x^4}\cdots = \frac{1}{1-x}\cdot\frac{1}{1-x^3}\cdots = P_o(x),$$

so $Q(n) = \mathcal{O}(n)$ for every nonnegative integer $n$. 
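The identity in Example 16 can be spot-checked numerically by expanding both products up to a fixed degree. This is only a sketch under illustrative names (`distinct_partitions`, `odd_partitions`); it confirms the equality for small $n$ and is not a proof.

```python
def distinct_partitions(limit):
    """Coefficients of (1 + x)(1 + x**2)...(1 + x**limit), truncated at x**limit."""
    coeffs = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Iterate downward so that each part is used at most once.
        for total in range(limit, part - 1, -1):
            coeffs[total] += coeffs[total - part]
    return coeffs


def odd_partitions(limit):
    """Coefficients of the product of 1/(1 - x**j) over odd j, truncated at x**limit."""
    coeffs = [1] + [0] * limit
    for part in range(1, limit + 1, 2):
        # Iterate upward so that each odd part may repeat.
        for total in range(part, limit + 1):
            coeffs[total] += coeffs[total - part]
    return coeffs


q_values = distinct_partitions(30)
o_values = odd_partitions(30)
print(q_values == o_values, q_values[:11])
```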


17. Summation formulas: Generating functions can be used to produce the formula 
$1^2 + 2^2 + \cdots + n^2 = \frac{1}{6}n(n+1)(2n+1)$. (See §3.5.4 for an extensive tabulation of summation 
formulas.) Applying Fact 7 to the expansion $(1-x)^{-1} = 1 + x + x^2 + x^3 + \cdots$ produces 

$$x\frac{d}{dx}\left[x\frac{d}{dx}(1-x)^{-1}\right] = \frac{x(1+x)}{(1-x)^3} = x + 2^2x^2 + 3^2x^3 + \cdots.$$

So $\frac{x(1+x)}{(1-x)^3}$ is the generating function for the sequence $0^2, 1^2, 2^2, 3^2, \ldots$ and, by Fact 7, 
$\frac{x(1+x)}{(1-x)^4}$ generates the sequence $0^2,\ 0^2+1^2,\ 0^2+1^2+2^2,\ 0^2+1^2+2^2+3^2, \ldots$. Consequently, 
$\sum_{i=0}^{n} i^2$ is the coefficient of $x^n$ in 

$$(x+x^2)(1-x)^{-4} = (x+x^2)\left[\binom{-4}{0} + \binom{-4}{1}(-x) + \binom{-4}{2}(-x)^2 + \cdots\right].$$

The answer is then 

$$\binom{-4}{n-1}(-1)^{n-1} + \binom{-4}{n-2}(-1)^{n-2} = \binom{n+2}{n-1} + \binom{n+1}{n-2} = \tfrac{1}{6}n(n+1)(2n+1).$$


Table 4 Related exponential generating functions. 

generating function      sequence 

$xf(x)$                  $0, a_0, 2a_1, 3a_2, \ldots, (k+1)a_k, \ldots$ 
$x^nf(x)$                $\underbrace{0, 0, \ldots, 0}_{n}, P(n,n)a_0, P(n+1,n)a_1, P(n+2,n)a_2, \ldots, P(n+k,n)a_k, \ldots$ 
$f'(x)$                  $a_1, a_2, a_3, \ldots, a_k, \ldots$ 
$\int_0^x f(t)\,dt$      $0, a_0, a_1, a_2, \ldots$ 
$rf(x) + sg(x)$          $ra_0 + sb_0,\ ra_1 + sb_1,\ ra_2 + sb_2, \ldots$ 
$f(x)g(x)$               $\binom{0}{0}a_0b_0,\ \binom{1}{0}a_0b_1 + \binom{1}{1}a_1b_0,\ \binom{2}{0}a_0b_2 + \binom{2}{1}a_1b_1 + \binom{2}{2}a_2b_0, \ldots$ 
                         (binomial convolution of $\{a_k\}$ and $\{b_k\}$) 


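18. Catalan numbers: The Catalan numbers (§3.1.3) $C_0, C_1, C_2, \ldots$ satisfy the recurrence 
relation $C_n = C_0C_{n-1} + C_1C_{n-2} + \cdots + C_{n-1}C_0$, $n \ge 1$, with $C_0 = 1$. (See §3.3.1.) 
Hence their generating function $f(x) = \sum_{k=0}^{\infty} C_kx^k$ satisfies $xf^2(x) = f(x) - 1$, yielding 
$f(x) = \frac{1}{2x}\left(1 - \sqrt{1-4x}\right) = \frac{1}{2x}\left(1 - (1-4x)^{1/2}\right)$. (The negative square root is chosen 
since the numbers $C_i$ cannot be negative.) Applying Example 8 to $(1-4x)^{1/2}$ yields 

$$f(x) = \frac{1}{2x}\left[1 - \sum_{k=0}^{\infty}\binom{1/2}{k}(-4)^kx^k\right] = \frac{1}{2x}\left[1 - \left(1 - \sum_{k=1}^{\infty}\frac{2}{k}\binom{2k-2}{k-1}x^k\right)\right] = \sum_{k=0}^{\infty}\frac{1}{k+1}\binom{2k}{k}x^k.$$

Thus $C_n = \frac{1}{n+1}\binom{2n}{n}$. 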
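As a quick consistency check on Example 18, the convolution recurrence and the closed form can be compared term by term. The function names below are illustrative assumptions; only `math.comb` from the Python standard library is used.

```python
from math import comb


def catalan_by_recurrence(count):
    """First `count` Catalan numbers from C_n = C_0 C_{n-1} + ... + C_{n-1} C_0."""
    values = [1]  # C_0 = 1
    for n in range(1, count):
        values.append(sum(values[i] * values[n - 1 - i] for i in range(n)))
    return values


def catalan_closed_form(n):
    """C_n = binom(2n, n) / (n + 1); the division is always exact."""
    return comb(2 * n, n) // (n + 1)


from_recurrence = catalan_by_recurrence(10)
from_formula = [catalan_closed_form(n) for n in range(10)]
print(from_recurrence)
```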


3.2.2 EXPONENTIAL GENERATING FUNCTIONS 

Encoding the terms of a sequence as coefficients of $\frac{x^k}{k!}$ is often helpful in obtaining 
information about a sequence, such as in counting permutations of objects (where the 
order of listing objects is important). The functions that result are called exponential 
generating functions. 

Definitions: 

The exponential generating function for the sequence $a_0, a_1, a_2, \ldots$ of real numbers 
is the formal power series $f(x) = a_0 + a_1x + a_2\frac{x^2}{2!} + \cdots = \sum_{i=0}^{\infty} a_i\frac{x^i}{i!}$ or any equivalent 
closed form expression. 

The binomial convolution of the sequence $a_0, a_1, a_2, \ldots$ and the sequence $b_0, b_1, b_2, \ldots$ 
is the sequence $c_0, c_1, c_2, \ldots$ in which 
$c_t = \binom{t}{0}a_0b_t + \binom{t}{1}a_1b_{t-1} + \binom{t}{2}a_2b_{t-2} + \cdots + \binom{t}{t}a_tb_0 = \sum_{k=0}^{t}\binom{t}{k}a_kb_{t-k}$. 

Facts: 

1. Each sequence $\{a_n\}$ defines a unique exponential generating function $f(x)$, and 
conversely. 

2. Related exponential generating functions: Suppose $f(x) = \sum_{k=0}^{\infty} a_k\frac{x^k}{k!}$ and 
$g(x) = \sum_{k=0}^{\infty} b_k\frac{x^k}{k!}$ are exponential generating functions for the sequences $a_0, a_1, a_2, \ldots$ and 
$b_0, b_1, b_2, \ldots$, respectively. Table 4 gives some related exponential generating functions. 
[$P(n,k) = \binom{n}{k}k!$ is the number of $k$-permutations of a set with $n$ distinct objects. (See 
§2.3.1.)] 


Table 5 Exponential generating functions for particular sequences. 

sequence                                                                          closed form 

$1, 1, 1, 1, 1, \ldots$                                                           $e^x$ 
$1, -1, 1, -1, 1, \ldots$                                                         $e^{-x}$ 
$1, 0, 1, 0, 1, \ldots$                                                           $\frac{1}{2}(e^x + e^{-x})$ 
$0, 1, 0, 1, 0, \ldots$                                                           $\frac{1}{2}(e^x - e^{-x})$ 
$0, 1, 2, 3, 4, \ldots$                                                           $xe^x$ 
$P(n,0), P(n,1), \ldots, P(n,n), 0, 0, \ldots$                                    $(1+x)^n$ 
$\left[{0 \atop n}\right], \left[{1 \atop n}\right], \left[{2 \atop n}\right], \ldots$        $\frac{1}{n!}\left[\ln\frac{1}{1-x}\right]^n$ 
$\left\{{0 \atop n}\right\}, \left\{{1 \atop n}\right\}, \left\{{2 \atop n}\right\}, \ldots$  $\frac{1}{n!}\left[e^x - 1\right]^n$ 
$B_0, B_1, B_2, B_3, B_4, \ldots$                                                 $e^{e^x - 1}$ 
$D_0, D_1, D_2, D_3, D_4, \ldots$                                                 $\frac{e^{-x}}{1-x}$ 


Examples: 

1. The binomial theorem (§2.3.4) gives 

$$(1+x)^n = \binom{n}{0} + \binom{n}{1}x + \binom{n}{2}x^2 + \cdots + \binom{n}{n}x^n$$

$$= P(n,0) + P(n,1)x + P(n,2)\frac{x^2}{2!} + P(n,3)\frac{x^3}{3!} + \cdots + P(n,n)\frac{x^n}{n!}.$$

Hence $(1+x)^n$ is the exponential generating function for the sequence $P(n,0), P(n,1)$, 
$P(n,2), P(n,3), \ldots, P(n,n), 0, 0, 0, \ldots$. 

2. The Maclaurin series expansion for $e^x$ is $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$, so the function 
$e^x$ is the exponential generating function for the sequence $1, 1, 1, 1, \ldots$. The function 
$e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \cdots$ is the exponential generating function for the sequence 
$1, -1, 1, -1, \ldots$. Consequently, 

$$\tfrac{1}{2}(e^x + e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \cdots$$

is the exponential generating function for $1, 0, 1, 0, 1, 0, \ldots$, while 

$$\tfrac{1}{2}(e^x - e^{-x}) = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots$$

is the exponential generating function for $0, 1, 0, 1, 0, 1, \ldots$. 

3. The function $f(x) = \frac{1}{1-x} = \sum_{i=0}^{\infty} x^i = \sum_{i=0}^{\infty} i!\,\frac{x^i}{i!}$ is the exponential generating 
function for the sequence $0!, 1!, 2!, 3!, \ldots$. 

4. Table 5 gives closed form expressions for the exponential generating functions of 
particular sequences. In this table, $\left[{n \atop k}\right]$ is a Stirling cycle number, $\left\{{n \atop k}\right\}$ is a Stirling subset 
number, $B_n$ is the $n$th Bell number (§2.5.2), and $D_n$ is the number of derangements 
of $n$ objects (§2.4.2). 

5. The number of ways to permute 5 of the 8 letters in TERMINAL is found using 
the exponential generating function $f(x) = (1+x)^8$. Here each of the 8 letters in 
TERMINAL is accounted for by the factor $(1+x)$, where $1\ (= x^0)$ indicates the letter 
does not occur in the permutation and $x\ (= x^1)$ indicates that it does. The coefficient 
of $\frac{x^5}{5!}$ in $f(x)$ is $\binom{8}{5}5! = P(8,5) = 6{,}720$. 

6. The number of ways to permute 5 of the letters in TRANSPORTATION is found as 
the coefficient of $\frac{x^5}{5!}$ in the exponential generating function 
$f(x) = \left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}\right)\left(1 + x + \frac{x^2}{2!}\right)^4(1+x)^3$. 
Here the factor $1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}$ accounts for the letter T, which can 
be used 0, 1, 2, or 3 times. The factor $1 + x + \frac{x^2}{2!}$ occurs four times, once for each of R, 
A, N, and O. The letters S, P, and I each produce the factor $(1+x)$. The coefficient of $x^5$ in 
$f(x)$ is found to be $\frac{487}{3}$, so the answer is $\left(\frac{487}{3}\right)5! = 19{,}480$. 
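The products in Examples 5 and 6 can be expanded with exact rational arithmetic. The sketch below builds one truncated factor $1 + x + \cdots + \frac{x^m}{m!}$ per distinct letter, where $m$ is the number of times that letter is available; `count_arrangements` is an illustrative name, not one used in the Handbook.

```python
from collections import Counter
from fractions import Fraction
from math import factorial


def count_arrangements(word, length):
    """Number of ordered selections of `length` letters from the multiset of letters in `word`."""
    product = [Fraction(1)] + [Fraction(0)] * length
    for multiplicity in Counter(word).values():
        # Factor 1 + x + x^2/2! + ... + x^m/m!, truncated at degree `length`.
        factor = [Fraction(1, factorial(j)) if j <= multiplicity else Fraction(0)
                  for j in range(length + 1)]
        product = [sum(product[i] * factor[d - i] for i in range(d + 1))
                   for d in range(length + 1)]
    # The coefficient of x^length / length! is length! times the coefficient of x^length.
    return product[length] * factorial(length)


terminal = count_arrangements("TERMINAL", 5)
transportation = count_arrangements("TRANSPORTATION", 5)
print(terminal, transportation)
```

Using `Fraction` avoids floating-point rounding in intermediate coefficients such as $\frac{487}{3}$.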

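7. The number of ternary sequences (made up of 0s, 1s, and 2s) of length 10 with at 
least one 0 and an odd number of 1s can be found using the exponential generating 
function 

$$f(x) = \left(x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)\left(x + \frac{x^3}{3!} + \frac{x^5}{5!} + \cdots\right)\left(1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right)$$

$$= (e^x - 1)\,\tfrac{1}{2}(e^x - e^{-x})\,e^x = \tfrac{1}{2}\left(e^{3x} - e^{2x} - e^x + 1\right)$$

$$= \frac{1}{2}\left(\sum_{i=0}^{\infty}\frac{(3x)^i}{i!} - \sum_{i=0}^{\infty}\frac{(2x)^i}{i!} - \sum_{i=0}^{\infty}\frac{x^i}{i!} + 1\right).$$

The answer is the coefficient of $\frac{x^{10}}{10!}$ in $f(x)$, which is $\frac{1}{2}(3^{10} - 2^{10} - 1^{10}) = 29{,}012$. 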

8. Suppose in Example 7 that no symbol may occur exactly two times. The exponential 
generating function is then 
$f(x) = \left(1 + x + \frac{x^3}{3!} + \cdots\right)^3 = \left(e^x - \frac{x^2}{2}\right)^3 = e^{3x} - \frac{3}{2}x^2e^{2x} + \frac{3}{4}x^4e^x - \frac{1}{8}x^6$. 
The number of ternary sequences is the coefficient of $\frac{x^{10}}{10!}$ in $f(x)$, namely 
$3^{10} - \frac{3}{2}(10)(9)2^8 + \frac{3}{4}(10)(9)(8)(7)1^6 = 28{,}269$. 
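Because there are only $3^{10} = 59{,}049$ ternary sequences of length 10, the counts in Examples 7 and 8 can be confirmed by exhaustive enumeration. In this sketch the rule for Example 8 is read from its generating function, which imposes only the condition that no symbol occurs exactly twice; the predicate names are illustrative assumptions.

```python
from itertools import product


def count_ternary(length, keep):
    """Count ternary sequences of the given length that satisfy the predicate `keep`."""
    return sum(1 for seq in product("012", repeat=length) if keep(seq))


def example7_rule(seq):
    # At least one 0 and an odd number of 1s.
    return seq.count("0") >= 1 and seq.count("1") % 2 == 1


def example8_rule(seq):
    # No symbol occurs exactly two times.
    return all(seq.count(symbol) != 2 for symbol in "012")


count7 = count_ternary(10, example7_rule)
count8 = count_ternary(10, example8_rule)
print(count7, count8)
```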

9. Exponential generating functions can be used to count the number of onto functions 
$\varphi\colon A \to B$ where $|A| = m$ and $|B| = n$. Each such function is specified by the sequence 
of $m$ values $\varphi(a_1), \varphi(a_2), \ldots, \varphi(a_m)$, where each element $b \in B$ occurs at least once in 
this sequence. Element $b$ contributes a factor $\left(x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots\right) = (e^x - 1)$ to the 
exponential generating function $f(x) = (e^x - 1)^n$. The number of onto functions is the 
coefficient of $\frac{x^m}{m!}$ in $f(x)$, or $n!$ times the coefficient of $\frac{x^m}{m!}$ in $\frac{(e^x-1)^n}{n!}$. From Table 5, the 
answer is then $n!\left\{{m \atop n}\right\}$. 


3.3 RECURRENCE RELATIONS 

In a number of counting problems, it may be difficult to find the solution directly. 
However, it is frequently possible to express the solution to a problem of a given size 
in terms of solutions to problems of smaller size. This interdependence of solutions 
produces a recurrence relation. Although there is no practical systematic way to solve all 
recurrence relations, this section contains methods for solving certain types of recurrence 
relations, thereby providing an explicit formula for the original counting problem. The 
topic of recurrence relations provides the discrete counterpart to concepts in the study 
of ordinary differential equations. 


3.3.1 BASIC CONCEPTS 
Definitions: 

A recurrence relation for the sequence $a_0, a_1, a_2, \ldots$ is an equation relating the 
term $a_n$ to certain of the preceding terms $a_i$, $i < n$, for each $n \ge n_0$. 

The recurrence relation is linear if it expresses $a_n$ as a linear function of a fixed number 
of preceding terms. Otherwise the relation is nonlinear. 

The recurrence relation is $k$th-order if $a_n$ can be expressed in terms of $a_{n-1}, a_{n-2}, \ldots, a_{n-k}$. 

The recurrence relation is homogeneous if the zero sequence $a_0 = a_1 = \cdots = 0$ satisfies 
the relation. Otherwise the relation is nonhomogeneous. 

A $k$th-order linear homogeneous recurrence relation with constant coefficients is an 
equation of the form $C_na_n + C_{n-1}a_{n-1} + \cdots + C_{n-k}a_{n-k} = 0$, $n \ge k$, where the $C_i$ are 
real constants with $C_n \ne 0$, $C_{n-k} \ne 0$. Initial conditions for this recurrence relation 
specify particular values for $k$ of the $a_i$ (typically $a_0, a_1, \ldots, a_{k-1}$). 


Facts: 

1. A $k$th-order linear homogeneous recurrence relation with constant coefficients can 
also be written $C_{n+k}a_{n+k} + C_{n+k-1}a_{n+k-1} + \cdots + C_na_n = 0$, $n \ge 0$. 

2. There are in general an infinite number of solution sequences $\{a_n\}$ to a $k$th-order 
linear homogeneous recurrence relation (with constant coefficients). 

3. A $k$th-order linear homogeneous recurrence relation with constant coefficients together 
with $k$ initial conditions on consecutive terms $a_0, a_1, \ldots, a_{k-1}$ uniquely determines 
the sequence $\{a_n\}$. This is not necessarily the case for nonlinear relations (see 
Example 2) or when nonconsecutive initial conditions are specified (see Example 3). 

4. The same recurrence relation can be written in different forms by adjusting the 
subscripts. For example, the recurrence relation $a_n = 3a_{n-1}$, $n \ge 1$, can be written as 
$a_{n+1} = 3a_n$, $n \ge 0$. 

Examples: 

1. The relation $a_n - a_{n-1}^2 + 2a_{n-2} = 0$, $n \ge 2$, is a nonlinear homogeneous recurrence 
relation with constant coefficients. If the initial conditions $a_0 = 0$, $a_1 = 1$ are imposed, this 
defines a unique sequence $\{a_n\}$ whose first few terms are $0, 1, 1, -1, -1, 3, 11, 115, \ldots$. 

2. The first-order (constant coefficient) recurrence relation $a_{n+1}^2 - a_n = 3$, $a_0 = 1$, is 
nonhomogeneous and nonlinear. Even though one initial condition is specified, this does 
not uniquely specify a solution sequence. Namely, the two sequences $1, -2, 1, 2, \ldots$ and 
$1, -2, -1, \sqrt{2}, \ldots$ satisfy the recurrence relation and the given initial condition. 

3. The second-order relation $a_{n+2} - a_n = 0$, $n \ge 0$, with nonconsecutive initial conditions 
$a_1 = a_3 = 0$ does not uniquely specify a solution sequence. Both $a_n = (-1)^n + 1$ 
and $a_n = 2(-1)^n + 2$ satisfy the recurrence and the given initial conditions. 

4. Compound interest: If an initial investment of $P$ dollars is made at a rate of $r$ percent 
compounded annually, then the amount $a_n$ after $n$ years is given by the recurrence 
relation $a_n = a_{n-1}\left(1 + \frac{r}{100}\right)$, where $a_0 = P$. [The amount at the end of the $n$th year is 
equal to the amount at the end of the $(n-1)$st year, $a_{n-1}$, plus the interest on $a_{n-1}$, 
$\frac{r}{100}a_{n-1}$.] 

5. Fibonacci sequence: The Fibonacci numbers satisfy the second-order linear homogeneous 
recurrence relation $a_n - a_{n-1} - a_{n-2} = 0$. 

6. Bit strings: Let $a_n$ be the number of bit strings of length $n$. Then $a_0 = 1$ (the 
empty string) and $a_n = 2a_{n-1}$ if $n > 0$. [Every bit string of length $n-1$ gives rise to 
two bit strings of length $n$, by placing a 0 or a 1 at the end of the string of length $n-1$.] 

7. Bit strings with no consecutive 0s: See §3.3.2 Example 23. 

8. Permutations: Let $a_n$ denote the number of permutations of $\{1, 2, \ldots, n\}$. Then $a_n$ 
satisfies the first-order linear homogeneous recurrence relation (with nonconstant coefficients) 
$a_{n+1} = (n+1)a_n$, $n \ge 1$, $a_1 = 1$. This follows since any $n$-permutation $\pi$ 
can be transformed into an $(n+1)$-permutation by inserting the element $n+1$ into 
any of the $n+1$ available positions: either at the beginning or end of $\pi$, or between 
two adjacent elements of $\pi$. To solve for $a_n$, repeatedly apply the recurrence relation 
and its initial condition: $a_n = na_{n-1} = n(n-1)a_{n-2} = n(n-1)(n-2)a_{n-3} = \cdots = 
n(n-1)(n-2)\cdots 2a_1 = n!$. 

9. Catalan numbers: The Catalan numbers (§3.1.3, §3.2.1) satisfy the nonlinear homogeneous 
recurrence relation $C_n - C_0C_{n-1} - C_1C_{n-2} - \cdots - C_{n-1}C_0 = 0$, $n \ge 1$, with 
initial condition $C_0 = 1$. Given the product of $n+1$ variables $x_1x_2\cdots x_{n+1}$, let $C_n$ 
be the number of ways in which the multiplications can be carried out. For example, 
there are five ways to form the product $x_1x_2x_3x_4$: $((x_1x_2)x_3)x_4$, $(x_1(x_2x_3))x_4$, 
$(x_1x_2)(x_3x_4)$, $x_1((x_2x_3)x_4)$, and $x_1(x_2(x_3x_4))$. No matter how the multiplications are 
performed, there will be an outermost product of the form $(x_1x_2\cdots x_i)(x_{i+1}\cdots x_{n+1})$. 
The number of ways in which the product $x_1x_2\cdots x_i$ can be formed is $C_{i-1}$ and the 
number of ways in which the product $x_{i+1}\cdots x_{n+1}$ can be formed is $C_{n-i}$. Thus, 
$(x_1x_2\cdots x_i)(x_{i+1}\cdots x_{n+1})$ can be obtained in $C_{i-1}C_{n-i}$ ways. Summing these over the 
values $i = 1, 2, \ldots, n$ yields the recurrence relation. 

10. Tower of Hanoi: See Example 1 of §2.2.4. 

11. Onto functions: The number of onto functions $\varphi\colon A \to B$ can be found by developing 
a nonhomogeneous linear recurrence relation based on the size of $B$. Let $|A| = m$ 
and let $a_n$ be the number of onto functions from $A$ to a set with $n$ elements. Then 
$a_n = n^m - \binom{n}{1}a_1 - \binom{n}{2}a_2 - \cdots - \binom{n}{n-1}a_{n-1}$, $n \ge 2$, $a_1 = 1$. This follows since the total 
number of functions from $A$ to $B$ is $n^m$ and the number of functions that map $A$ onto 
a proper subset of $B$ with exactly $j$ elements is $\binom{n}{j}a_j$. 

For example, if $m = 7$ and $n = 4$, applying this recursion gives $a_2 = 2^7 - 2(1) = 126$, 
$a_3 = 3^7 - 3(1) - 3(126) = 1{,}806$, $a_4 = 4^7 - 4(1) - 6(126) - 4(1{,}806) = 8{,}400$. Thus there 
are 8,400 onto functions in this case. 
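The recursion in Example 11 is easy to mechanize. The following sketch (with the assumed helper name `onto_counts`) tabulates $a_1, \ldots, a_n$ for a fixed $m \ge 1$ and reproduces the worked values for $m = 7$.

```python
from math import comb


def onto_counts(m, n_max):
    """Map n -> number of onto functions from an m-element set to an n-element set."""
    counts = {1: 1}  # a_1 = 1: every function into a one-element set is onto
    for n in range(2, n_max + 1):
        # Subtract the functions whose image is a proper subset with exactly j elements.
        counts[n] = n ** m - sum(comb(n, j) * counts[j] for j in range(1, n))
    return counts


table = onto_counts(7, 4)
print(table)
```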


3.3.2 HOMOGENEOUS RECURRENCE RELATIONS 

It is assumed throughout this subsection that the recurrence relations are linear with 
constant coefficients. 

Definitions: 

A geometric progression is a sequence $a_0, a_1, a_2, \ldots$ for which 
$\frac{a_1}{a_0} = \frac{a_2}{a_1} = \cdots = \frac{a_{n+1}}{a_n} = \cdots = r$, the common ratio. 

The characteristic equation for the $k$th-order recurrence relation $C_na_n + C_{n-1}a_{n-1} + 
\cdots + C_{n-k}a_{n-k} = 0$, $n \ge k$, is the equation $C_nr^k + C_{n-1}r^{k-1} + \cdots + C_{n-k} = 0$. The 
characteristic roots are the roots of this equation. 

The sequences $\{a_n^{(1)}\}, \{a_n^{(2)}\}, \ldots, \{a_n^{(k)}\}$ are linearly dependent if there exist constants 
$t_1, t_2, \ldots, t_k$, not all zero, such that $\sum_{i=1}^{k} t_ia_n^{(i)} = 0$ for all $n \ge 0$. Otherwise, they are 
linearly independent. 

Facts: 

1. General method for solving a linear homogeneous recurrence relation with constant 
coefficients: First find the general solution. Then use the initial conditions to find the 
particular solution. 

2. If the $k$ characteristic roots $r_1, r_2, \ldots, r_k$ are distinct, then $r_1^n, r_2^n, \ldots, r_k^n$ are linearly 
independent solutions of the homogeneous recurrence relation. The general solution is 
$a_n = c_1r_1^n + c_2r_2^n + \cdots + c_kr_k^n$, where $c_1, c_2, \ldots, c_k$ are arbitrary constants. 

3. If a characteristic root $r$ has multiplicity $m$, then $r^n, nr^n, \ldots, n^{m-1}r^n$ are linearly 
independent solutions of the homogeneous recurrence relation. The linear combination 
$c_1r^n + c_2nr^n + \cdots + c_mn^{m-1}r^n$ is also a solution, where $c_1, c_2, \ldots, c_m$ are arbitrary 
constants. 

4. Facts 2 and 3 can be used together. If there are $k$ characteristic roots $r_1, r_2, \ldots, r_k$, 
with respective multiplicities $m_1, m_2, \ldots, m_k$ (where some of the $m_i$ can equal 1), then 
the general solution is a sum of sums, each of the form appearing in Fact 3. 

5. DeMoivre's theorem: For any positive integer $n$, $(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta$. 
This result is used to find solutions of recurrence relations when the characteristic roots 
are complex numbers. (See Example 10.) 

6. Solving first-order recurrence relations: The solution of the homogeneous recurrence 
relation $a_{n+1} = da_n$, $n \ge 0$, with initial condition $a_0 = A$, is $a_n = Ad^n$, $n \ge 0$. 

7. Solving second-order recurrence relations: Let $r_1, r_2$ be the characteristic roots associated 
with the second-order homogeneous relation $C_na_n + C_{n-1}a_{n-1} + C_{n-2}a_{n-2} = 0$. 
There are three possibilities: 

• $r_1, r_2$ are distinct real numbers: $r_1^n$ and $r_2^n$ are linearly independent solutions of 
the recurrence relation. The general solution has the form 

$$a_n = c_1r_1^n + c_2r_2^n,$$

where the constants $c_1, c_2$ are found from the values of $a_n$ for two distinct values 
of $n$ (often $n = 0, 1$). 

• $r_1, r_2$ form a complex conjugate pair $a \pm bi$: The general solution is 

$$a_n = c_1(a+bi)^n + c_2(a-bi)^n = \left(\sqrt{a^2+b^2}\right)^n(k_1\cos n\theta + k_2\sin n\theta),$$

with $\theta = \arctan(b/a)$. Here $\left(\sqrt{a^2+b^2}\right)^n\cos n\theta$ and $\left(\sqrt{a^2+b^2}\right)^n\sin n\theta$ are linearly 
independent solutions. 

• $r_1, r_2$ are real and equal: $r_1^n$ and $nr_1^n$ are linearly independent solutions of the 
recurrence relation. The general solution is 

$$a_n = c_1r_1^n + c_2nr_1^n.$$


Examples: 

1. The geometric progression $7, 21, 63, 189, \ldots$, with common ratio 3, satisfies the first-order 
homogeneous recurrence relation $a_{n+1} - 3a_n = 0$ for all $n \ge 0$. 

2. The first-order homogeneous recurrence relation $a_{n+1} = 3a_n$, $n \ge 0$, does not determine 
a unique geometric progression. Any geometric sequence with ratio 3 is a 
solution; for example the geometric progression in Example 1 (with $a_0 = 7$), as well as 
the geometric progression $5, 15, 45, 135, \ldots$ (with $a_0 = 5$). 

3. The first-order recurrence relation $a_{n+1} = 3a_n$, $n \ge 0$, $a_0 = 7$ is easily solved using 
Fact 6. The solution is $a_n = 7(3^n)$ for all $n \ge 0$. 

4. Compound interest: If interest is compounded quarterly, how long does it take for an 
investment of \$500 to double when the annual interest rate is 8%? If $a_n$ denotes the value 
of the investment after $n$ quarters have passed, then $a_{n+1} = a_n + 0.02a_n = (1.02)a_n$, 
$n \ge 0$, $a_0 = 500$. [Here the quarterly rate is $0.08/4 = 0.02 = 2\%$.] By Fact 6, the 
solution is $a_n = 500(1.02)^n$, $n \ge 0$. The investment doubles when $1000 = 500(1.02)^n$, so 
$n = \frac{\log 2}{\log 1.02} \approx 35.003$. Consequently, after 36 quarters (or 9 years) the initial investment 
of \$500 (more than) doubles. 
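The doubling time in Example 4 follows from solving $(1.02)^n = 2$ for $n$. The sketch below evaluates that logarithm; the function name and its default of four compounding periods per year are illustrative assumptions.

```python
import math


def periods_to_double(annual_rate_percent, periods_per_year=4):
    """Solve (1 + periodic rate)**n = 2 for n, where n is measured in compounding periods."""
    growth = 1 + annual_rate_percent / 100 / periods_per_year
    return math.log(2) / math.log(growth)


exact_quarters = periods_to_double(8)
whole_quarters = math.ceil(exact_quarters)  # first quarter at which the balance has doubled
print(round(exact_quarters, 3), whole_quarters)
```

The starting balance of \$500 does not enter the computation, since it cancels from both sides of $1000 = 500(1.02)^n$.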

5. Population growth: The number of bacteria in a culture (approximately) triples 
in size every hour. If there are (approximately) 100,000 bacteria in a culture after six 
hours, how many were there at the start? Define $p_n$ to be the number of bacteria in 
the culture after $n$ hours have elapsed. Then $p_{n+1} = 3p_n$ for $n \ge 0$. From Fact 6, 
$p_n = p_0(3^n)$. So $100{,}000 = p_0(3^6)$ and $p_0 \approx 137$. 

6. Fibonacci sequence: The Fibonacci sequence $0, 1, 1, 2, 3, 5, 8, 13, \ldots$ arises in varied 
applications (§3.1.2). Its terms satisfy the second-order homogeneous recurrence relation 
$F_n = F_{n-1} + F_{n-2}$, $n \ge 2$, with initial conditions $F_0 = 0$, $F_1 = 1$. 

An explicit formula can be obtained for $F_n$ using Fact 7. The characteristic equation 
is $r^2 - r - 1 = 0$, with distinct real roots $\frac{1 \pm \sqrt{5}}{2}$. Thus the general solution is 

$$F_n = c_1\left(\frac{1+\sqrt{5}}{2}\right)^n + c_2\left(\frac{1-\sqrt{5}}{2}\right)^n.$$

Using the initial conditions $F_0 = 0$, $F_1 = 1$ gives $c_1 = -c_2 = \frac{1}{\sqrt{5}}$ and the explicit 
formula 

$$F_n = \frac{1}{\sqrt{5}}\left[\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n\right], \quad n \ge 0.$$

7. Lucas sequence: Related to the sequence of Fibonacci numbers is the sequence of 
Lucas numbers $2, 1, 3, 4, 7, 11, 18, \ldots$ (see §3.1.2). The terms of this sequence satisfy the 
same second-order homogeneous recurrence relation $L_n = L_{n-1} + L_{n-2}$, $n \ge 2$, but with 
the different initial conditions $L_0 = 2$, $L_1 = 1$. The formula for $L_n$ is 

$$L_n = \left(\frac{1+\sqrt{5}}{2}\right)^n + \left(\frac{1-\sqrt{5}}{2}\right)^n, \quad n \ge 0.$$
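The explicit formulas for $F_n$ and $L_n$ can be compared numerically with the recurrence they solve. This sketch uses floating-point arithmetic and rounding, which is adequate for small $n$ only; all names are illustrative.

```python
import math

SQRT5 = math.sqrt(5)
PHI = (1 + SQRT5) / 2  # root (1 + sqrt 5)/2 of r^2 - r - 1 = 0
PSI = (1 - SQRT5) / 2  # root (1 - sqrt 5)/2


def fibonacci_closed(n):
    """F_n from the explicit formula, rounded to the nearest integer."""
    return round((PHI ** n - PSI ** n) / SQRT5)


def lucas_closed(n):
    """L_n from the explicit formula, rounded to the nearest integer."""
    return round(PHI ** n + PSI ** n)


def by_recurrence(first, second, count):
    """First `count` terms of x_n = x_{n-1} + x_{n-2} with the given initial terms."""
    terms = [first, second]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


fib_terms = [fibonacci_closed(n) for n in range(30)]
lucas_terms = [lucas_closed(n) for n in range(30)]
print(fib_terms[:8], lucas_terms[:7])
```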

8. Random walk: A particle undergoes a random walk in one dimension, along the 
$x$-axis. Barriers are placed at positions $x = 0$ and $x = T$. At any instant, the particle 
moves with probability $p$ one unit to the right; with probability $q = 1 - p$ it moves one 
unit to the left. Let $a_n$ denote the probability that the particle, starting at position 
$x = n$, reaches the barrier $x = T$ before it reaches the barrier $x = 0$. It can be 
shown that $a_n$ satisfies the second-order recurrence relation $a_n = pa_{n+1} + qa_{n-1}$ or 
$pa_{n+1} - a_n + qa_{n-1} = 0$. In this case the two initial conditions are $a_0 = 0$ and $a_T = 1$. 
The characteristic equation $pr^2 - r + q = (pr - q)(r - 1) = 0$ has roots $1, \frac{q}{p}$. When $p \ne q$, 
the roots are distinct and the first case of Fact 7 can be used to determine $a_n$; when 
$p = q$, the third case of Fact 7 must be used. (Explicit solutions are given in §7.5.2, 
Fact 10.) 

9. The second-order relation $a_n + 4a_{n-1} - 21a_{n-2} = 0$, $n \ge 2$, has the characteristic 
equation $r^2 + 4r - 21 = 0$, with distinct real roots 3 and $-7$. The general solution to 
the recurrence relation is 

$$a_n = c_1(3)^n + c_2(-7)^n, \quad n \ge 0,$$

where $c_1, c_2$ are arbitrary constants. 

If the initial conditions specify $a_0 = 1$ and $a_1 = 1$, then solving the equations 
$1 = a_0 = c_1 + c_2$, $1 = a_1 = 3c_1 - 7c_2$ gives $c_1 = \frac{4}{5}$, $c_2 = \frac{1}{5}$. In this case, the unique 
solution is 

$$a_n = \tfrac{4}{5}3^n + \tfrac{1}{5}(-7)^n, \quad n \ge 0.$$

10. The second-order relation $a_n - 6a_{n-1} + 58a_{n-2} = 0$, $n \ge 2$, has the characteristic 
equation $r^2 - 6r + 58 = 0$, with complex conjugate roots $r = 3 \pm 7i$. The general solution 
is 

$$a_n = c_1(3+7i)^n + c_2(3-7i)^n, \quad n \ge 0.$$

Using Fact 5, $(3+7i)^n = \left[\sqrt{3^2+7^2}(\cos\theta + i\sin\theta)\right]^n = (\sqrt{58})^n(\cos n\theta + i\sin n\theta)$, where 
$\theta = \arctan\frac{7}{3}$. Likewise $(3-7i)^n = (\sqrt{58})^n(\cos n\theta - i\sin n\theta)$. This gives the general 
solution 

$$a_n = (\sqrt{58})^n[(c_1+c_2)\cos n\theta + (c_1-c_2)i\sin n\theta] = (\sqrt{58})^n[k_1\cos n\theta + k_2\sin n\theta].$$

If the initial conditions $a_0 = 1$ and $a_1 = 1$ are specified, then $1 = a_0 = k_1$, $1 = a_1 = 
\sqrt{58}[\cos\theta + k_2\sin\theta]$, yielding $k_1 = 1$, $k_2 = -\frac{2}{7}$. Thus 

$$a_n = (\sqrt{58})^n\left[\cos n\theta - \tfrac{2}{7}\sin n\theta\right], \quad n \ge 0.$$

11. The second-order relation $a_{n+2} - 6a_{n+1} + 9a_n = 0$, $n \ge 0$, has the characteristic 
equation $r^2 - 6r + 9 = (r-3)^2 = 0$, with the repeated roots 3, 3. The general solution 
to this recurrence is 

$$a_n = c_1(3^n) + c_2n(3^n), \quad n \ge 0.$$

If the initial conditions are $a_0 = 2$ and $a_1 = 4$, then $2 = a_0 = c_1$, $4 = 2(3) + c_2(1)(3)$, 
giving $c_1 = 2$, $c_2 = -\frac{2}{3}$. Thus 

$$a_n = 2(3^n) - \tfrac{2}{3}n(3^n) = 2(3^n - n3^{n-1}), \quad n \ge 0.$$
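The three cases of Fact 7 worked in Examples 9 to 11 can each be checked by iterating the recurrence and comparing with the closed form. In the sketch below the constants $\frac{4}{5}, \frac{1}{5}$ (Example 9), $k_2 = -\frac{2}{7}$ (Example 10), and $-\frac{2}{3}$ (Example 11) are the values obtained by solving the initial-condition equations; `iterate` is an illustrative helper name.

```python
import math
from fractions import Fraction


def iterate(coef1, coef2, a0, a1, count):
    """First `count` terms of a_n = coef1 * a_{n-1} + coef2 * a_{n-2}."""
    terms = [a0, a1]
    while len(terms) < count:
        terms.append(coef1 * terms[-1] + coef2 * terms[-2])
    return terms[:count]


# Example 9 (distinct real roots 3 and -7): a_n = -4 a_{n-1} + 21 a_{n-2}.
ex9 = [Fraction(4, 5) * 3 ** n + Fraction(1, 5) * (-7) ** n for n in range(8)]

# Example 10 (complex roots 3 +/- 7i): a_n = 6 a_{n-1} - 58 a_{n-2}.
theta = math.atan2(7, 3)
ex10 = [math.sqrt(58) ** n * (math.cos(n * theta) - 2 / 7 * math.sin(n * theta))
        for n in range(8)]

# Example 11 (repeated root 3): a_n = 6 a_{n-1} - 9 a_{n-2}.
ex11 = [2 * 3 ** n - Fraction(2, 3) * n * 3 ** n for n in range(8)]

print(ex9[:4], [round(v) for v in ex10[:4]], ex11[:4])
```

Exact fractions are used where the roots are rational; the complex-root case is compared in floating point with a tolerance.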

12. For $n \ge 1$, let $a_n$ count the number of binary strings of length $n$ that contain no 
consecutive 0s. Here $a_1 = 2$ (for the two strings 0 and 1) and $a_2 = 3$ (for the strings 01, 
10, 11). For $n \ge 3$, a string counted in $a_n$ ends in either 1 or 0. If the $n$th bit is 1, then 
the preceding $n-1$ bits provide a string counted in $a_{n-1}$; if the $n$th bit is 0 then the 
last two bits are 10, and the preceding $n-2$ bits give a string counted in $a_{n-2}$. Thus 
$a_n = a_{n-1} + a_{n-2}$, $n \ge 3$, with $a_1 = 2$ and $a_2 = 3$. The solution to this relation is 
simply $a_n = F_{n+2}$, the Fibonacci sequence shifted two places. An explicit formula for 
$a_n$ is obtained using the result in Example 6. 
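The claim $a_n = F_{n+2}$ can be tested for small $n$ by listing all binary strings and discarding those that contain 00. This brute-force sketch uses illustrative function names and is practical only for short strings.

```python
from itertools import product


def count_no_consecutive_zeros(n):
    """Count binary strings of length n with no two adjacent 0s, by enumeration."""
    return sum(1 for bits in product("01", repeat=n) if "00" not in "".join(bits))


def fibonacci(n):
    """F_n with F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


counts = [count_no_consecutive_zeros(n) for n in range(1, 13)]
print(counts)
```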

13. The third-order recurrence relation $a_{n+3} - a_{n+2} - 4a_{n+1} + 4a_n = 0$, $n \ge 0$, has the 
characteristic equation $r^3 - r^2 - 4r + 4 = (r-2)(r+2)(r-1) = 0$, with characteristic 
roots 2, $-2$, and 1. The general solution is given by 

$$a_n = c_12^n + c_2(-2)^n + c_31^n = c_12^n + c_2(-2)^n + c_3, \quad n \ge 0.$$

14. The general solution of the third-order recurrence relation $a_{n+3} - 3a_{n+2} + 3a_{n+1} - 
a_n = 0$, $n \ge 0$, is 

$$a_n = c_11^n + c_2n1^n + c_3n^21^n = c_1 + c_2n + c_3n^2, \quad n \ge 0.$$

Here the characteristic equation is $r^3 - 3r^2 + 3r - 1 = (r-1)^3 = 0$, so the characteristic roots are 1, 1, 1. 

15. The fourth-order relation $a_{n+4} + 2a_{n+2} + a_n = 0$, $n \ge 0$, has the characteristic 
equation $r^4 + 2r^2 + 1 = (r^2+1)^2 = 0$. Since the characteristic roots are $\pm i, \pm i$, the 
general solution is 

$$a_n = c_1i^n + c_2(-i)^n + c_3ni^n + c_4n(-i)^n$$

$$= k_1\cos\frac{n\pi}{2} + k_2\sin\frac{n\pi}{2} + k_3n\cos\frac{n\pi}{2} + k_4n\sin\frac{n\pi}{2}, \quad n \ge 0.$$


3.3.3 NONHOMOGENEOUS RECURRENCE RELATIONS 


It is assumed throughout this subsection that the recurrence relations are linear with 
constant coefficients. 

Definition: 

The $k$th-order nonhomogeneous recurrence relation has the form $C_na_n + C_{n-1}a_{n-1} + 
\cdots + C_{n-k}a_{n-k} = f(n)$, $n \ge k$, where $C_n \ne 0$, $C_{n-k} \ne 0$, and $f(n) \ne 0$ for at least one 
value of $n$. 

Facts: 

1. General solution: The general solution of the nonhomogeneous $k$th-order recurrence 
relation has the form 

$$a_n = a_n^{(h)} + a_n^{(p)},$$

where $a_n^{(h)}$ is the general solution of the homogeneous relation $C_na_n + C_{n-1}a_{n-1} + 
\cdots + C_{n-k}a_{n-k} = 0$, $n \ge k$, and $a_n^{(p)}$ is a particular solution for the given relation 
$C_na_n + C_{n-1}a_{n-1} + \cdots + C_{n-k}a_{n-k} = f(n)$, $n \ge k$. 

2. Given a nonhomogeneous first-order relation $C_na_n + C_{n-1}a_{n-1} = kr^n$, $n \ge 1$, 
where $r$ and $k$ are nonzero constants, 

• If $r^n$ is not a solution of the associated homogeneous relation, then $a_n^{(p)} = Ar^n$ 
for $A$ a constant. 

• If $r^n$ is a solution of the associated homogeneous relation, then $a_n^{(p)} = Bnr^n$ for 
$B$ a constant. 

3. Given the nonhomogeneous second-order relation $C_na_n + C_{n-1}a_{n-1} + C_{n-2}a_{n-2} = 
kr^n$, $n \ge 2$, where $r$ and $k$ are nonzero constants, 

• If $r^n$ is not a solution of the associated homogeneous relation, then $a_n^{(p)} = Ar^n$ 
for $A$ a constant. 

• If $a_n^{(h)} = c_1r^n + c_2r_1^n$, for $r \ne r_1$, then $a_n^{(p)} = Bnr^n$ for $B$ a constant. 

• If $a_n^{(h)} = c_1r^n + c_2nr^n$, then $a_n^{(p)} = Cn^2r^n$ for $C$ a constant. 

4. Given the $k$th-order nonhomogeneous recurrence relation $C_na_n + C_{n-1}a_{n-1} + \cdots + 
C_{n-k}a_{n-k} = f(n)$. If $f(n)$ is a constant multiple of one of the forms in the first column 
of Table 1, then the associated trial solution $t(n)$ is the corresponding entry in the 
second column of the table. [Here $A, B, A_0, A_1, \ldots, A_t, r, \alpha$ are real constants.] 

• If no summand of $t(n)$ solves the associated homogeneous relation, then $a_n^{(p)} = 
t(n)$ is a particular solution. 

• If a summand of $t(n)$ solves the associated homogeneous relation, then multiply 
$t(n)$ by the smallest (positive integer) power of $n$, say $n^s$, so that no summand 
of the adjusted trial solution $n^st(n)$ solves the associated homogeneous 
relation. Then $a_n^{(p)} = n^st(n)$ is a particular solution. 

• If $f(n)$ is a sum of constant multiples of the forms in the first column of Table 1, 
then (adjusted) trial solutions are formed for each summand using the first two 
parts of Fact 4. Adding the resulting trial solutions then provides a particular 
solution of the nonhomogeneous relation. 


Table 1 Trial particular solutions for $C_na_n + \cdots + C_{n-k}a_{n-k} = h(n)$. 

$h(n)$                           $t(n)$ 

$c$, a constant                  $A$ 
$n^t$ ($t$ a positive integer)   $A_tn^t + A_{t-1}n^{t-1} + \cdots + A_1n + A_0$ 
$r^n$                            $Ar^n$ 
$\sin\alpha n$                   $A\sin\alpha n + B\cos\alpha n$ 
$\cos\alpha n$                   $A\sin\alpha n + B\cos\alpha n$ 
$n^tr^n$                         $r^n(A_tn^t + A_{t-1}n^{t-1} + \cdots + A_1n + A_0)$ 
$r^n\sin\alpha n$                $r^n(A\sin\alpha n + B\cos\alpha n)$ 
$r^n\cos\alpha n$                $r^n(A\sin\alpha n + B\cos\alpha n)$ 


Examples: 


1. Consider the nonhomogeneous relation $a_n + 4a_{n-1} - 21a_{n-2} = 5(4^n)$, $n \ge 2$. The 
solution is $a_n = a_n^{(h)} + a_n^{(p)}$, where $a_n^{(h)}$ is the solution of $a_n + 4a_{n-1} - 21a_{n-2} = 0$, $n \ge 2$. 
So 

$$a_n^{(h)} = c_1(3)^n + c_2(-7)^n, \quad n \ge 0.$$

From the third entry in Table 1, $a_n^{(p)} = A(4^n)$ for some constant $A$. Substituting this 
into the given nonhomogeneous relation yields $A(4^n) + 4A(4^{n-1}) - 21A(4^{n-2}) = 5(4^n)$. 
Dividing through by $4^{n-2}$ gives $16A + 16A - 21A = 80$, or $A = 80/11$. Consequently, 
$a_n = c_1(3)^n + c_2(-7)^n + \frac{80}{11}4^n$, $n \ge 0$. 

If the initial conditions are $a_0 = 1$ and $a_1 = 2$, then $c_1$ and $c_2$ are found using $1 = 
c_1 + c_2 + 80/11$, $2 = 3c_1 - 7c_2 + 320/11$, yielding 

$$a_n = -\tfrac{71}{10}(3^n) + \tfrac{91}{110}(-7)^n + \tfrac{80}{11}(4^n), \quad n \ge 0.$$


2. Suppose the given recurrence relation is $a_n + 4a_{n-1} - 21a_{n-2} = 8(3^n)$, $n \ge 2$. Then 
it is still true that 

$$a_n^{(h)} = c_1(3^n) + c_2(-7)^n, \quad n \ge 0,$$

where $c_1$ and $c_2$ are arbitrary constants. By the second part of Fact 3, a particular 
solution is $a_n^{(p)} = An3^n$. Substituting gives $An3^n + 4A(n-1)3^{n-1} - 21A(n-2)3^{n-2} = 
8(3^n)$. Dividing by $3^{n-2}$ produces $9An + 12A(n-1) - 21A(n-2) = 72$, so $A = 12/5$. 
Thus 

$$a_n = c_1(3^n) + c_2(-7)^n + \tfrac{12}{5}n3^n, \quad n \ge 0.$$
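Both nonhomogeneous examples can be verified with exact arithmetic. The sketch below iterates $a_n = -4a_{n-1} + 21a_{n-2} + f(n)$ for Example 1 and substitutes the particular solution of Example 2 back into its relation; the constants $-\frac{71}{10}$ and $\frac{91}{110}$ are the values obtained by solving the two initial-condition equations of Example 1, and the function names are illustrative.

```python
from fractions import Fraction


def iterate_forced(a0, a1, forcing, count):
    """First `count` terms of a_n = -4 a_{n-1} + 21 a_{n-2} + forcing(n), n >= 2."""
    terms = [a0, a1]
    for n in range(2, count):
        terms.append(-4 * terms[-1] + 21 * terms[-2] + forcing(n))
    return terms


def example1_solution(n):
    """Closed form for a_0 = 1, a_1 = 2 with right-hand side 5 * 4^n."""
    return (-Fraction(71, 10) * 3 ** n
            + Fraction(91, 110) * (-7) ** n
            + Fraction(80, 11) * 4 ** n)


def example2_particular(n):
    """Particular solution (12/5) n 3^n for right-hand side 8 * 3^n."""
    return Fraction(12, 5) * n * 3 ** n


closed_form_terms = [example1_solution(n) for n in range(8)]
iterated_terms = iterate_forced(1, 2, lambda n: 5 * 4 ** n, 8)
# Left-hand side minus right-hand side of the Example 2 relation; should be all zeros.
residuals = [example2_particular(n) + 4 * example2_particular(n - 1)
             - 21 * example2_particular(n - 2) - 8 * 3 ** n
             for n in range(2, 10)]
print(iterated_terms[:4], residuals)
```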


3. Tower of Hanoi: (See Example 1 of §2.2.4.) If $a_n$ is the minimum number of moves 
needed to transfer the $n$ disks, then $a_n$ satisfies the first-order nonhomogeneous relation 

$$a_n = 2a_{n-1} + 1, \quad n \ge 1,$$

where $a_0 = 0$. Here $a_n^{(h)} = c(2^n)$ for an arbitrary constant $c$, and $a_n^{(p)} = A$, using entry 1 
of Table 1. So $A = 2A + 1$ or $A = -1$. Hence $a_n = c(2^n) - 1$ and $0 = a_0 = c(2^0) - 1$ 
implies $c = 1$, giving 

$$a_n = 2^n - 1, \quad n \ge 0.$$


4. How many regions are formed if $n$ lines are drawn in the plane, in general position 
(no two parallel and no three intersecting at a point)? If $a_n$ denotes the number of 
regions thus formed, then $a_1 = 2$, $a_2 = 4$, and $a_3 = 7$ are easily determined. A general 
formula can be found by developing a recurrence relation for $a_n$. Namely, if line $n+1$ 
is added to the diagram with $a_n$ regions formed by $n$ lines, this new line intersects all 
the other $n$ lines. These intersection points partition line $n+1$ into $n+1$ segments, 
each of which splits an existing region in two. As a result, $a_{n+1} = a_n + (n+1)$, $n \ge 1$, 
a first-order nonhomogeneous recurrence relation. Solving this relation with the initial 
condition $a_1 = 2$ produces $a_n = \frac{1}{2}(n^2 + n + 2)$. 


3.3.4 METHOD OF GENERATING FUNCTIONS 

Generating functions (see §3.2.1) can be used to solve individual recurrence relations as 
well as simultaneous systems of recurrence relations. This technique is analogous to the 
use of Laplace transforms in solving systems of differential equations. 

Facts: 

1. To solve the $k$th-order recurrence relation $C_{n+k}a_{n+k} + \cdots + C_na_n = f(n)$, $n \ge 0$, 
carry out the following steps: 

• multiply both sides of the recurrence equation by $x^{n+k}$ and sum the result; 

• take this new equation, rewrite it in terms of the generating function $f(x) = 
\sum_{n=0}^{\infty} a_nx^n$, and solve for $f(x)$; 

• expand the expression found for $f(x)$ in terms of powers of $x$ in order that the 
coefficient $a_n$ can be identified. 

2. To solve a system of $k$th-order recurrence relations, carry out the following steps: 

• multiply both sides of each recurrence equation by $x^{n+k}$ and sum the results; 

• rewrite the system of equations in terms of the generating functions $f(x), g(x), \ldots$ 
for $a_n, b_n, \ldots$, and solve for these generating functions; 

• expand the expressions found for each generating function in terms of powers of $x$ 
in order that the coefficients $a_n, b_n, \ldots$ can be identified. 

Examples: 

1. The nonhomogeneous first-order relation $a_{n+1} - 2a_n = 1$, $n \ge 0$, $a_0 = 0$, arises in
the Tower of Hanoi problem (Example 3 of §3.3.3). Begin by applying the first step of
Fact 1:

$$a_{n+1} x^{n+1} - 2 a_n x^{n+1} = x^{n+1},$$

$$\sum_{n=0}^{\infty} a_{n+1} x^{n+1} - 2 \sum_{n=0}^{\infty} a_n x^{n+1} = \sum_{n=0}^{\infty} x^{n+1}.$$

Then apply the second step of Fact 1:

$$\sum_{n=0}^{\infty} a_{n+1} x^{n+1} - 2x \sum_{n=0}^{\infty} a_n x^n = x \sum_{n=0}^{\infty} x^n,$$

$$(f(x) - a_0) - 2x f(x) = \frac{x}{1-x},$$

$$(f(x) - 0) - 2x f(x) = \frac{x}{1-x}.$$

Solving for $f(x)$ gives

$$f(x) = \frac{x}{(1-x)(1-2x)} = \frac{1}{1-2x} - \frac{1}{1-x}
       = \sum_{n=0}^{\infty} (2x)^n - \sum_{n=0}^{\infty} x^n = \sum_{n=0}^{\infty} (2^n - 1) x^n.$$

Since $a_n$ is the coefficient of $x^n$ in $f(x)$, $a_n = 2^n - 1$, $n \ge 0$.

2. To solve the nonhomogeneous second-order relation $a_{n+2} - 2a_{n+1} + a_n = 2^n$, $n \ge 0$,
$a_0 = 1$, $a_1 = 2$, apply the first step of Fact 1:

$$a_{n+2} x^{n+2} - 2 a_{n+1} x^{n+2} + a_n x^{n+2} = 2^n x^{n+2},$$

$$\sum_{n=0}^{\infty} a_{n+2} x^{n+2} - 2 \sum_{n=0}^{\infty} a_{n+1} x^{n+2}
  + \sum_{n=0}^{\infty} a_n x^{n+2} = \sum_{n=0}^{\infty} 2^n x^{n+2}.$$

The second step of Fact 1 produces

$$\sum_{n=0}^{\infty} a_{n+2} x^{n+2} - 2x \sum_{n=0}^{\infty} a_{n+1} x^{n+1}
  + x^2 \sum_{n=0}^{\infty} a_n x^n = x^2 \sum_{n=0}^{\infty} (2x)^n,$$

$$[f(x) - a_0 - a_1 x] - 2x [f(x) - a_0] + x^2 f(x) = \frac{x^2}{1-2x},$$

$$[f(x) - 1 - 2x] - 2x [f(x) - 1] + x^2 f(x) = \frac{x^2}{1-2x}.$$

Solving for $f(x)$ gives

$$f(x) = \frac{1}{1-2x} = \sum_{n=0}^{\infty} (2x)^n = \sum_{n=0}^{\infty} 2^n x^n.$$

Thus $a_n = 2^n$, $n \ge 0$, is the solution of the given recurrence relation.


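3. Fact 2 can be used to solve the system of recurrence relations

$$a_{n+1} = 2a_n - b_n + 2$$
$$b_{n+1} = -a_n + 2b_n - 1$$

for $n \ge 0$, with $a_0 = 0$ and $b_0 = 1$. Multiplying by $x^{n+1}$ and summing yields

$$\sum_{n=0}^{\infty} a_{n+1} x^{n+1} = 2x \sum_{n=0}^{\infty} a_n x^n - x \sum_{n=0}^{\infty} b_n x^n
  + 2x \sum_{n=0}^{\infty} x^n,$$

$$\sum_{n=0}^{\infty} b_{n+1} x^{n+1} = -x \sum_{n=0}^{\infty} a_n x^n + 2x \sum_{n=0}^{\infty} b_n x^n
  - x \sum_{n=0}^{\infty} x^n.$$

These two equations can be rewritten in terms of the generating functions
$f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n$ as

$$f(x) - a_0 = 2x f(x) - x g(x) + \frac{2x}{1-x},$$

$$g(x) - b_0 = -x f(x) + 2x g(x) - \frac{x}{1-x}.$$

Solving this system (with $a_0 = 0$, $b_0 = 1$) produces

$$f(x) = \frac{x(1-2x)}{(1-x)^2 (1-3x)} = \frac{-3/4}{1-x} + \frac{1/2}{(1-x)^2} + \frac{1/4}{1-3x}$$

$$= -\tfrac{3}{4} \sum_{n=0}^{\infty} x^n + \tfrac{1}{2} \sum_{n=0}^{\infty} \binom{-2}{n} (-x)^n
  + \tfrac{1}{4} \sum_{n=0}^{\infty} (3x)^n$$

$$= -\tfrac{3}{4} \sum_{n=0}^{\infty} x^n + \tfrac{1}{2} \sum_{n=0}^{\infty} \binom{n+1}{n} x^n
  + \tfrac{1}{4} \sum_{n=0}^{\infty} 3^n x^n$$

and

$$g(x) = \frac{1 - 4x + 2x^2}{(1-x)^2 (1-3x)} = \frac{3/4}{1-x} + \frac{1/2}{(1-x)^2} + \frac{-1/4}{1-3x}$$

$$= \tfrac{3}{4} \sum_{n=0}^{\infty} x^n + \tfrac{1}{2} \sum_{n=0}^{\infty} \binom{n+1}{n} x^n
  - \tfrac{1}{4} \sum_{n=0}^{\infty} 3^n x^n.$$

It then follows that

$$a_n = -\tfrac{3}{4} + \tfrac{1}{2}(n+1) + \tfrac{1}{4} 3^n, \quad n \ge 0,$$
$$b_n = \tfrac{3}{4} + \tfrac{1}{2}(n+1) - \tfrac{1}{4} 3^n, \quad n \ge 0.$$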
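
The result of Example 3 can be cross-checked in three ways: by iterating the system, by
expanding the rational generating functions as power series, and by evaluating the closed
forms. The sketch below does all three with exact rational arithmetic. The helper names
(`series_of_rational`, `iterate_system`) are illustrative assumptions, and the generating
functions are taken to be $f(x) = x(1-2x)/((1-x)^2(1-3x))$ and
$g(x) = (1-4x+2x^2)/((1-x)^2(1-3x))$, as read from Example 3.

```python
from fractions import Fraction


def series_of_rational(num, den, terms):
    """Power-series coefficients of num(x)/den(x), given coefficient lists with den[0] != 0."""
    coeffs = []
    for n in range(terms):
        s = Fraction(num[n]) if n < len(num) else Fraction(0)
        # den(x) * series(x) = num(x), so solve for the nth coefficient
        for k in range(1, min(n, len(den) - 1) + 1):
            s -= den[k] * coeffs[n - k]
        coeffs.append(s / den[0])
    return coeffs


def iterate_system(terms):
    """Iterate a_{n+1} = 2a_n - b_n + 2, b_{n+1} = -a_n + 2b_n - 1 from a_0 = 0, b_0 = 1."""
    a, b = [0], [1]
    for _ in range(terms - 1):
        a_next = 2 * a[-1] - b[-1] + 2
        b_next = -a[-1] + 2 * b[-1] - 1
        a.append(a_next)
        b.append(b_next)
    return a, b


DEN = [1, -5, 7, -3]  # (1 - x)^2 (1 - 3x) = 1 - 5x + 7x^2 - 3x^3
f_coeffs = series_of_rational([0, 1, -2], DEN, 10)  # x(1 - 2x) / DEN
g_coeffs = series_of_rational([1, -4, 2], DEN, 10)  # (1 - 4x + 2x^2) / DEN
a_seq, b_seq = iterate_system(10)
a_closed = [Fraction(-3, 4) + Fraction(n + 1, 2) + Fraction(3 ** n, 4) for n in range(10)]
b_closed = [Fraction(3, 4) + Fraction(n + 1, 2) - Fraction(3 ** n, 4) for n in range(10)]
```

All three computations should produce the same sequences, beginning 0, 1, 3, 8 for $a_n$
and 1, 1, 0, -4 for $b_n$.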

3.3.5 DIVIDE-AND-CONQUER RELATIONS


Certain algorithms proceed by breaking up a given problem into subproblems of nearly 
equal size; solutions to these subproblems are then combined to produce a solution 
to the original problem. Analysis of such “divide-and-conquer” algorithms results in 
special types of recurrence relations that can be solved exactly and asymptotically. 

Definitions: 

The time-complexity function $f(n)$ for an algorithm gives the (maximum) number
of operations required to solve any instance of size $n$. The function $f(n)$ is monotone
increasing if $m < n \Rightarrow f(m) \le f(n)$, where $m$ and $n$ are positive integers.

A recursive divide-and-conquer algorithm splits a given problem of size $n = b^k$
into $a$ subproblems of size $n/b$ each. It requires (at most) $h(n)$ operations to create the
subproblems and subsequently combine their solutions.

Let $S = S_b$ be the set of integers $\{1, b, b^2, \ldots\}$ and let $\mathcal{Z}^+$ be the set of positive integers.

If $f(n)$ and $g(n)$ are functions on $\mathcal{Z}^+$, then $g$ dominates $f$ on $S$, written $f \in O(g)$
on $S$, if there are positive constants $A \in \mathcal{R}$, $k \in \mathcal{Z}^+$ such that $|f(n)| \le A|g(n)|$ holds
for all $n \in S$ with $n \ge k$.

Facts: 

1. The time-complexity function $f(n)$ of a recursive divide-and-conquer algorithm is
defined for $n \in S$ and satisfies the recurrence relation

$$f(1) = c,$$
$$f(n) = a f(n/b) + h(n), \quad \text{for } n = b^k,\ k \ge 1,$$

where $a, b, c \in \mathcal{Z}^+$ and $b \ge 2$.

2. Solving $f(n) = a f(n/b) + c$, $f(1) = c$:

• If $a = 1$: $f(n) = c(\log_b n + 1)$ for $n \in S$. Thus $f \in O(\log_b n)$ on $S$. If, in addition,
  $f(n)$ is monotone increasing, then $f \in O(\log_b n)$ on $\mathcal{Z}^+$.

• If $a \ge 2$: $f(n) = c(a n^{\log_b a} - 1)/(a - 1)$ for $n \in S$. Thus $f \in O(n^{\log_b a})$ on $S$. If,
  in addition, $f(n)$ is monotone increasing, then $f \in O(n^{\log_b a})$ on $\mathcal{Z}^+$.

3. Let $f(n)$ be any function satisfying the inequality relations

$$f(1) \le c,$$
$$f(n) \le a f(n/b) + c, \quad \text{for } n = b^k,\ k \ge 1,$$

where $a, b, c \in \mathcal{Z}^+$ and $b \ge 2$.

• If $a = 1$: $f \in O(\log_b n)$ on $S$. If, in addition, $f(n)$ is monotone increasing, then
  $f \in O(\log_b n)$ on $\mathcal{Z}^+$.

• If $a \ge 2$: $f \in O(n^{\log_b a})$ on $S$. If, in addition, $f(n)$ is monotone increasing, then
  $f \in O(n^{\log_b a})$ on $\mathcal{Z}^+$.

4. Solving for a monotone increasing $f(n)$ where $f(n) = a f(n/b) + r n^d$ ($n = b^k$, $k \ge 1$),
$f(1) = c$, where $a, b, c, d \in \mathcal{Z}^+$, $b \ge 2$, and $r$ is a positive real number:

• If $a < b^d$: $f \in O(n^d)$ on $\mathcal{Z}^+$.

• If $a = b^d$: $f \in O(n^d \log_b n)$ on $\mathcal{Z}^+$.

• If $a > b^d$: $f \in O(n^{\log_b a})$ on $\mathcal{Z}^+$.

The same asymptotic results hold if inequalities $\le$ replace equalities in the given
recurrence relation.

Examples:

1. If $f(n)$ satisfies the recurrence relation $f(n) = f(\frac{n}{2}) + 3$, $n \in S_2$, $f(1) = 3$, then by
Fact 2 $f(n) = 3(\log_2 n + 1)$. Thus $f \in O(\log_2 n)$ on $S_2$.

2. If $f(n)$ satisfies the recurrence relation $f(n) = 4f(\frac{n}{3}) + 7$, $n \in S_3$, $f(1) = 7$, then by
Fact 2 $f(n) = 7(4 n^{\log_3 4} - 1)/3$. Thus $f \in O(n^{\log_3 4})$ on $S_3$.

3. Binary search: The binary search algorithm (§17.2.3) is a recursive procedure to
search for a specified value in an ordered list of $n$ items. Its complexity function satisfies
$f(n) = f(\frac{n}{2}) + 2$, $n \in S_2$, $f(1) = 2$. Since the complexity function $f(n)$ is monotone
increasing in the list size $n$, Fact 2 shows that $f \in O(\log_2 n)$.

4. Merge sort: The merge sort algorithm (§17.4) is a recursive procedure for sorting
the $n$ elements of a list. It repeatedly divides a given list into two nearly equal sublists,
sorts those sublists, and combines the sorted sublists. Its complexity function satisfies
$f(n) = 2f(\frac{n}{2}) + (n - 1)$, $n \in S_2$, $f(1) = 0$. Since $f(n)$ is monotone increasing and
satisfies the inequality relation $f(n) \le 2f(\frac{n}{2}) + n$, Fact 4 (with $a = b^d = 2$) gives
$f \in O(n \log_2 n)$.

5. Matrix multiplication: The Strassen algorithm is a recursive procedure for multiplying
two $n \times n$ matrices (see §6.3.3). One version of this algorithm requires seven
multiplications of $\frac{n}{2} \times \frac{n}{2}$ matrices and 15 additions of $\frac{n}{2} \times \frac{n}{2}$ matrices. Consequently, its
complexity function satisfies $f(n) = 7f(\frac{n}{2}) + 15n^2/4$, $n \in S_2$, $f(1) = 1$. From the third
part of Fact 4, $f \in O(n^{\log_2 7})$ on $\mathcal{Z}^+$. This algorithm requires approximately $O(n^{2.81})$
operations to multiply $n \times n$ matrices, compared to $O(n^3)$ for the standard method.
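
The recurrences in these examples can be evaluated directly and compared with the stated
closed forms and growth classes. This is a minimal sketch; `evaluate` and `growth_class`
are hypothetical helper names, and `growth_class` encodes only the three-way comparison
of $a$ with $b^d$ from Fact 4.

```python
import math


def evaluate(n, a, b, h, c):
    """f(n) = a*f(n/b) + h(n) with f(1) = c, for n a power of b."""
    if n == 1:
        return c
    return a * evaluate(n // b, a, b, h, c) + h(n)


def growth_class(a, b, d):
    """Fact 4: compare a with b^d for f(n) = a*f(n/b) + r*n^d."""
    if a < b ** d:
        return f"O(n^{d})"
    if a == b ** d:
        return f"O(n^{d} log n)"
    return f"O(n^{math.log(a, b):.2f})"


binary_search = evaluate(1024, 1, 2, lambda n: 2, 2)    # Example 3: 2(log2 n + 1)
example_2 = evaluate(81, 4, 3, lambda n: 7, 7)          # Example 2: 7(4 n^(log3 4) - 1)/3
merge_sort = evaluate(1024, 2, 2, lambda n: n - 1, 0)   # Example 4
```

For instance, `growth_class(2, 2, 1)` reports the $n \log n$ class of merge sort, and
`growth_class(7, 2, 2)` reports the exponent $\log_2 7 \approx 2.81$ of the Strassen
recurrence.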


3.4 FINITE DIFFERENCES 

The difference and antidifference operators are the discrete analogues of ordinary differ- 
entiation and antidifferentiation. Difference methods can be used for curve-fitting and 
for solving recurrence relations. 


3.4.1 THE DIFFERENCE OPERATOR 

The difference operator plays a role in combinatorial modeling analogous to that of the 
derivative operator in continuous analysis. 

Definitions: 

Let $f: \mathcal{N} \to \mathcal{R}$.

The difference operator $\Delta f(x) = f(x+1) - f(x)$ is the discrete analogue of the
differentiation operator.

The $k$th difference of $f$ is the operator $\Delta^k f(x) = \Delta^{k-1} f(x+1) - \Delta^{k-1} f(x)$, for $k \ge 1$,
with $\Delta^0 f = f$.

The shift operator $E$ is defined by $Ef(x) = f(x+1)$.

The harmonic sum $H_n = \sum_{i=1}^{n} \frac{1}{i}$ is the discrete analogue of the natural logarithm
(§3.1.7).

Note: Most of the results stated in this subsection are also valid for functions on 
non-discrete domains. The functional notation that is used for most of this subsection, 
instead of the more usual subscript notation for sequences, makes the results easier to 
read and helps underscore the parallels between discrete and ordinary calculus. 
Facts:

1. Linearity: $\Delta(\alpha f + \beta g) = \alpha \Delta f + \beta \Delta g$, for all constants $\alpha$ and $\beta$.

2. Product rule: $\Delta(f(x)g(x)) = (Ef(x)) \Delta g(x) + (\Delta f(x)) g(x)$. This is analogous to
the derivative formula for the product of functions.

3. $\Delta^m x^n = 0$, for $m > n$, and $\Delta^n x^n = n!$.

4. $\Delta^n f(x) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} f(x + n - k)$.

5. $f(x + n) = \sum_{k=0}^{n} \binom{n}{k} \Delta^k f(x)$.

6. Leibniz's theorem: $\Delta^n (f(x)g(x)) = \sum_{k=0}^{n} \binom{n}{k} \Delta^k f(x) \, \Delta^{n-k} g(x + k)$.

7. Quotient rule: $\Delta\left(\frac{f(x)}{g(x)}\right) = \frac{g(x) \Delta f(x) - f(x) \Delta g(x)}{g(x) g(x+1)}$.

8. The shift operator $E$ satisfies $\Delta f = Ef - f$, written equivalently as $E = 1 + \Delta$.

9. $E^n f(x) = f(x + n)$.

10. The equation $\Delta C(x) = 0$ implies that $C$ is periodic with period 1. Moreover, if
the domain is restricted to the integers (e.g., if $C(n)$ is a sequence), then $C$ is constant.
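
Several of these facts are easy to test numerically. The sketch below implements the
difference operator as a higher-order function and checks Facts 3, 4, and 5 on the cube
function. All function names are illustrative.

```python
from math import comb


def delta(f):
    """Return the function x -> f(x + 1) - f(x)."""
    return lambda x: f(x + 1) - f(x)


def delta_k(f, k):
    """Apply the difference operator k times."""
    for _ in range(k):
        f = delta(f)
    return f


def kth_difference_by_fact_4(f, n, x):
    """Fact 4: sum over k of (-1)^k C(n, k) f(x + n - k)."""
    return sum((-1) ** k * comb(n, k) * f(x + n - k) for k in range(n + 1))


def shifted_value_by_fact_5(f, n, x):
    """Fact 5: f(x + n) as the sum over k of C(n, k) Delta^k f(x)."""
    return sum(comb(n, k) * delta_k(f, k)(x) for k in range(n + 1))


def cube(x):
    return x ** 3
```

Here `delta_k(cube, 3)` is the constant function 6 (that is, $3!$), and `delta_k(cube, 4)`
is identically zero, as Fact 3 predicts.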

Examples: 

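1. If $f(x) = x^3$ then $\Delta f(x) = (x+1)^3 - x^3 = 3x^2 + 3x + 1$.

2. The following table gives formulas for the differences of some important functions.
In this table, the notation $x^{\underline{n}}$ refers to the $n$th falling power of $x$ (§3.4.2).

    $f(x)$                         $\Delta f(x)$
    $\binom{x}{n}$                 $\binom{x}{n-1}$
    $(x+a)^{\underline{n}}$        $n(x+a)^{\underline{n-1}}$
    $x^n$                          $\binom{n}{1} x^{n-1} + \binom{n}{2} x^{n-2} + \cdots + 1$
    $a^x$                          $(a-1) a^x$
    $H_x$                          $x^{\underline{-1}} = \frac{1}{x+1}$
    $\sin x$                       $2 \sin(\frac{1}{2}) \cos(x + \frac{1}{2})$
    $\cos x$                       $-2 \sin(\frac{1}{2}) \sin(x + \frac{1}{2})$

3. $\Delta^2 f(x) = f(x+2) - 2f(x+1) + f(x)$, from Fact 4.

4. $f(x+3) = f(x) + 3\Delta f(x) + 3\Delta^2 f(x) + \Delta^3 f(x)$, from Fact 5.

5. The shift operator can be used to find the exponential generating function (§3.2.2)
for the sequence $\{a_k\}$, where $a_k$ is a polynomial in variable $k$ of degree $n$.

$$\sum_{k=0}^{\infty} \frac{a_k x^k}{k!} = \sum_{k=0}^{\infty} \frac{E^k(a_0) x^k}{k!}
  = \left( \sum_{k=0}^{\infty} \frac{x^k E^k}{k!} \right) a_0
  = e^{xE} a_0 = e^{x(1+\Delta)} a_0 = e^x e^{x\Delta} a_0$$

$$= e^x \left( a_0 + \frac{x \Delta a_0}{1!} + \frac{x^2 \Delta^2 a_0}{2!} + \cdots
  + \frac{x^n \Delta^n a_0}{n!} \right).$$

For example, if $a_k = k^2 + 1$ then $\sum_{k=0}^{\infty} \frac{a_k x^k}{k!} = e^x (1 + x + x^2)$.
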
3.4.2 CALCULUS OF DIFFERENCES: FALLING AND RISING POWERS


Falling powers provide a natural analogue between the calculus of finite sums and dif- 
ferences and the calculus of integrals and derivatives. Stirling numbers provide a means 
of expressing ordinary powers in terms of falling powers and vice versa. 

Definitions: 

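The $n$th falling power of $x$, written $x^{\underline{n}}$, is the discrete analogue of exponentiation
and is defined by

$$x^{\underline{n}} = x(x-1)(x-2) \cdots (x-n+1),$$
$$x^{\underline{-n}} = \frac{1}{(x+1)(x+2) \cdots (x+n)},$$
$$x^{\underline{0}} = 1.$$

The $n$th rising power of $x$, written $x^{\overline{n}}$, is defined by

$$x^{\overline{n}} = x(x+1)(x+2) \cdots (x+n-1),$$
$$x^{\overline{-n}} = \frac{1}{(x-n)(x-n+1) \cdots (x-1)},$$
$$x^{\overline{0}} = 1.$$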


Facts: 

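1. Conversion between falling and rising powers:

$$x^{\underline{n}} = (-1)^n (-x)^{\overline{n}} = (x - n + 1)^{\overline{n}} = \frac{1}{(x+1)^{\overline{-n}}},$$
$$x^{\overline{n}} = (-1)^n (-x)^{\underline{n}} = (x + n - 1)^{\underline{n}} = \frac{1}{(x-1)^{\underline{-n}}},$$
$$x^{\underline{-n}} = \frac{1}{(x+1)^{\overline{n}}}, \qquad x^{\overline{-n}} = \frac{1}{(x-1)^{\underline{n}}}.$$

2. Laws of exponents:

$$x^{\underline{m+n}} = x^{\underline{m}} (x - m)^{\underline{n}},$$
$$x^{\overline{m+n}} = x^{\overline{m}} (x + m)^{\overline{n}}.$$

3. Binomial theorem: $(x + y)^{\underline{n}} = \binom{n}{0} x^{\underline{n}} + \binom{n}{1} x^{\underline{n-1}} y^{\underline{1}} + \cdots + \binom{n}{n} y^{\underline{n}}$.

4. The action of the difference operator on falling powers is analogous to the action of
the derivative on ordinary powers: $\Delta x^{\underline{n}} = n x^{\underline{n-1}}$.

5. There is no chain rule for differences, but the binomial theorem implies the rule

$$\Delta (x + a)^{\underline{n}} = n (x + a)^{\underline{n-1}}.$$

6. Newton's theorem: If $f(x)$ is a polynomial of degree $n$, then

$$f(x) = \sum_{k=0}^{n} \frac{\Delta^k f(0)}{k!} x^{\underline{k}}.$$

This is an analogue of Maclaurin's theorem.

7. If $f(x) = x^n$ then $\Delta^k f(0) = {n \brace k} \cdot k!$.

8. Falling powers can be expressed in terms of ordinary powers using Stirling cycle
numbers (§2.5.2):

$$x^{\underline{n}} = \sum_{k=1}^{n} {n \brack k} (-1)^{n-k} x^k.$$

9. Rising powers can be expressed in terms of ordinary powers using Stirling cycle
numbers (§2.5.2):

$$x^{\overline{n}} = \sum_{k=1}^{n} {n \brack k} x^k.$$

10. Ordinary powers can be expressed in terms of falling or rising powers using Stirling
subset numbers (§2.5.2):

$$x^n = \sum_{k=1}^{n} {n \brace k} x^{\underline{k}} = \sum_{k=1}^{n} {n \brace k} (-1)^{n-k} x^{\overline{k}}.$$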


Examples: 

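1. Fact 8 and Table 4 of §2.5.2 give

$$x^{\underline{3}} = x^3 - 3x^2 + 2x^1,$$
$$x^{\underline{4}} = x^4 - 6x^3 + 11x^2 - 6x^1.$$

2. Fact 10 and Table 5 of §2.5.2 give

$$x^2 = x^{\underline{2}} + x^{\underline{1}},$$
$$x^3 = x^{\underline{3}} + 3x^{\underline{2}} + x^{\underline{1}},$$
$$x^4 = x^{\underline{4}} + 6x^{\underline{3}} + 7x^{\underline{2}} + x^{\underline{1}}.$$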
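
The conversions in Facts 8 and 10 can be reproduced from the standard recurrences for the
Stirling numbers. The following sketch computes the coefficient lists used in the two
examples above. The function names are illustrative, and the recurrences are the usual
ones for cycle and subset numbers rather than anything taken from the tables of §2.5.2.

```python
def stirling_cycle(n, k):
    """Stirling cycle number: permutations of n objects with exactly k cycles."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return (n - 1) * stirling_cycle(n - 1, k) + stirling_cycle(n - 1, k - 1)


def stirling_subset(n, k):
    """Stirling subset number: partitions of an n-set into exactly k blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling_subset(n - 1, k) + stirling_subset(n - 1, k - 1)


def falling_power(x, n):
    """x(x - 1)...(x - n + 1) for n >= 0."""
    result = 1
    for i in range(n):
        result *= x - i
    return result


def falling_as_ordinary(n):
    """Coefficients c[k] with x^(falling n) = sum of c[k] x^k (Fact 8)."""
    return [(-1) ** (n - k) * stirling_cycle(n, k) for k in range(n + 1)]


def ordinary_as_falling(n):
    """Coefficients c[k] with x^n = sum of c[k] x^(falling k) (Fact 10)."""
    return [stirling_subset(n, k) for k in range(n + 1)]
```

For $n = 4$ the two lists are `[0, -6, 11, -6, 1]` and `[0, 1, 7, 6, 1]`, matching the
last line of each example.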


3.4.3 DIFFERENCE SEQUENCES AND DIFFERENCE TABLES 

New sequences can be obtained from a given sequence by repeatedly applying the dif- 
ference operator. 

Definitions: 

The difference sequence for the sequence $A = \{ a_j \mid j = 0, 1, \ldots \}$ is the sequence
$\Delta A = \{ a_{j+1} - a_j \mid j = 0, 1, \ldots \}$.

The $k$th difference sequence for $f: \mathcal{N} \to \mathcal{R}$ is given by $\Delta^k f(0), \Delta^k f(1), \Delta^k f(2), \ldots$.

The difference table for $f: \mathcal{N} \to \mathcal{R}$ is the table $T_f$ whose $k$th row is the $k$th difference
sequence for $f$. That is, $T_f[k, l] = \Delta^k f(l) = \Delta^{k-1} f(l+1) - \Delta^{k-1} f(l)$.

Facts: 

1. The leftmost column of a difference table completely determines the entire table, 
via Newton’s theorem (Fact 6, §3.4.2). 

2. The difference table of an nth degree polynomial consists of n + 1 nonzero rows 
followed by all zero rows. 
Examples:

1. If $A = 0, 1, 4, 9, 16, 25, \ldots$ is the sequence of squares of integers, then its difference
sequence is $\Delta A = 1, 3, 5, 7, 9, \ldots$. Observe that $\Delta(x^2) = 2x + 1$.

2. The difference table for $x^{\underline{3}}$ is given by

    $x$                                                      0    1    2    3    4    5
    $\Delta^0 x^{\underline{3}} = x^{\underline{3}}$         0    0    0    6   24   60  ...
    $\Delta^1 x^{\underline{3}} = 3x^{\underline{2}}$        0    0    6   18   36  ...
    $\Delta^2 x^{\underline{3}} = 6x^{\underline{1}}$        0    6   12   18  ...
    $\Delta^3 x^{\underline{3}} = 6$                         6    6    6  ...
    $\Delta^4 x^{\underline{3}} = 0$                         0    0  ...

3. The difference table for $x^3$ is given by

    $x$                                   0    1    2    3    4    5
    $\Delta^0 x^3 = x^3$                  0    1    8   27   64  125  ...
    $\Delta^1 x^3 = 3x^2 + 3x + 1$        1    7   19   37   61  ...
    $\Delta^2 x^3 = 6x + 6$               6   12   18   24  ...
    $\Delta^3 x^3 = 6$                    6    6    6  ...
    $\Delta^4 x^3 = 0$                    0    0  ...

4. The difference table for $3^x$ is given by

    $x$                                   0    1    2    3    4    5
    $\Delta^0 3^x = 3^x$                  1    3    9   27   81  243  ...
    $\Delta^1 3^x = 2 \cdot 3^x$          2    6   18   54  162  ...
    $\Delta^2 3^x = 4 \cdot 3^x$          4   12   36  108  ...
    $\Delta^3 3^x = 8 \cdot 3^x$          8   24   72  ...
    $\Delta^4 3^x = 16 \cdot 3^x$        16   48  ...
    $\Delta^k 3^x = 2^k 3^x$

5. Application to curve-fitting: Find the polynomial $p(x)$ of smallest degree that
passes through the points: (0, 5), (1, 5), (2, 3), (3, 5), (4, 17), (5, 45). The difference table
for the sequence 5, 5, 3, 5, 17, 45 is

     5    5    3    5   17   45  ...
     0   -2    2   12   28  ...
    -2    4   10   16  ...
     6    6    6  ...
     0    0  ...

Newton's theorem shows that the polynomial of smallest degree is
$p(x) = 5 - x^{\underline{2}} + x^{\underline{3}} = x^3 - 4x^2 + 3x + 5$.
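
The curve-fitting procedure of Example 5 can be sketched as follows: build the difference
table, read off its leftmost column, and evaluate Newton's expansion in falling powers.
The function names are illustrative, and exact rational arithmetic is used so that
the comparison with the data points is exact.

```python
from fractions import Fraction
from math import factorial


def difference_table(values):
    """Rows of successive differences; row k holds the kth difference sequence."""
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return rows


def newton_polynomial(values):
    """Return p with p(x) = sum over k of (Delta^k f(0) / k!) * x^(falling k)."""
    leading = [row[0] for row in difference_table(values)]

    def p(x):
        total, falling = Fraction(0), Fraction(1)
        for k, d in enumerate(leading):
            total += Fraction(d, factorial(k)) * falling
            falling *= x - k  # extend x^(falling k) to x^(falling k+1)
        return total

    return p


points = [5, 5, 3, 5, 17, 45]
table = difference_table(points)
p = newton_polynomial(points)
```

The leftmost column of `table` is 5, 0, -2, 6, 0, 0, so `p` agrees with
$x^3 - 4x^2 + 3x + 5$ everywhere, not only at the six data points.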

3.4.4 DIFFERENCE EQUATIONS


Difference equations are analogous to differential equations and many of the techniques 
are as fully developed. Difference equations provide a way to solve recurrence relations. 

Definitions: 

A difference equation is an equation involving the difference operator and/or higher- 
order differences of an unknown function. 

An antidifference of the function $f$ is any function $g$ such that $\Delta g = f$. The notation
$\Delta^{-1} f$ denotes any such function.

Facts: 

1. Any recurrence relation (§3.3) can be expressed as a difference equation, and vice 
versa, by using Facts 4 and 5 of §3.4.1. 

2. The solution to a recurrence relation can sometimes be easily obtained by converting 
it to a difference equation and applying difference methods. 


Examples: 

1. To find an antidifference of $10 \cdot 3^x$, use Table 1 (§3.4.1): $\Delta^{-1}(10 \cdot 3^x) = 5\Delta^{-1}(2 \cdot 3^x) =
5 \cdot 3^x + C$. (Also see Table 1 of §3.5.3.)

2. To find an antidifference of $3x$, first express $x$ as $x^{\underline{1}}$ and then use Table 1 (§3.4.1):
$\Delta^{-1} 3x = 3\Delta^{-1} x^{\underline{1}} = \frac{3}{2} x^{\underline{2}} + C = \frac{3}{2} x(x-1) + C$.

3. To find an antidifference of $x^2$, express $x^2$ as $x^{\underline{2}} + x^{\underline{1}}$ and then use Table 1 (§3.4.1):
$\Delta^{-1} x^2 = \Delta^{-1}(x^{\underline{2}} + x^{\underline{1}}) = \Delta^{-1} x^{\underline{2}} + \Delta^{-1} x^{\underline{1}} = \frac{1}{3} x^{\underline{3}} + \frac{1}{2} x^{\underline{2}} + C
= \frac{1}{3} x(x-1)(x-2) + \frac{1}{2} x(x-1) + C$.

4. The following are examples of difference equations:

$$\Delta^3 f(x) + x^4 \Delta^2 f(x) - f(x) = 0,$$

$$\Delta^3 f(x) + f(x) = x^2.$$

5. To solve the recurrence relation $a_{n+1} = a_n + 5^n$, $n \ge 0$, $a_0 = 2$, first note that
$\Delta a_n = 5^n$. Thus $a_n = \Delta^{-1} 5^n = \frac{1}{4}(5^n) + C$. The initial condition $a_0 = 2$ now implies
that $a_n = \frac{1}{4}(5^n + 7)$.

6. To solve the equation $a_{n+1} = (n a_n + n)/(n+1)$, $n \ge 1$, the recurrence relation is
first rewritten as $(n+1) a_{n+1} - n a_n = n$, which is equivalent to $\Delta(n a_n) = n$. Thus
$n a_n = \Delta^{-1} n = \frac{1}{2} n^{\underline{2}} + C$, which implies that $a_n = \frac{1}{2}(n-1) + C(\frac{1}{n})$.

7. To solve $a_n = 2a_{n-1} - a_{n-2} + 2^{n-2} + n - 2$, $n \ge 2$, with $a_0 = 4$, $a_1 = 5$, the
recurrence relation is rewritten as $a_{n+2} - 2a_{n+1} + a_n = 2^n + n$, $n \ge 0$. Now, by applying
Fact 4 of §3.4.1, the left-hand side may be replaced by $\Delta^2 a_n$. If the antidifference
operator is applied twice to the resulting difference equation and the initial conditions
are substituted, the solution obtained is

$$a_n = 2^n + \tfrac{1}{6} n^{\underline{3}} + c_1 n + c_2 = 2^n + \tfrac{1}{6} n(n-1)(n-2) + 3.$$
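
The closed forms obtained in Examples 5 and 7 can be checked against direct iteration of
the recurrences. This is a minimal sketch with illustrative function names.

```python
from fractions import Fraction


def iterate_example_5(terms):
    """a_{n+1} = a_n + 5^n with a_0 = 2."""
    a = [2]
    for n in range(terms - 1):
        a.append(a[n] + 5 ** n)
    return a


def closed_example_5(n):
    """a_n = (5^n + 7) / 4."""
    return Fraction(5 ** n + 7, 4)


def iterate_example_7(terms):
    """a_n = 2a_{n-1} - a_{n-2} + 2^(n-2) + n - 2 with a_0 = 4, a_1 = 5."""
    a = [4, 5]
    for n in range(2, terms):
        a.append(2 * a[n - 1] - a[n - 2] + 2 ** (n - 2) + n - 2)
    return a


def closed_example_7(n):
    """a_n = 2^n + n(n-1)(n-2)/6 + 3."""
    return 2 ** n + Fraction(n * (n - 1) * (n - 2), 6) + 3
```

Both closed forms agree with the iterated sequences, which begin 2, 3, 8, 33 and
4, 5, 7, 12 respectively.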


© 2000 by CRC Press LLC 


3.5 FINITE SUMS AND SUMMATION 

Finite sums arise frequently in combinatorial mathematics and in the analysis of running 
times of algorithms. There are a few basic rules for transforming sums into possibly 
more tractable equivalent forms, and there is a calculus for evaluating these standard 
forms. 


3.5.1 SIGMA NOTATION 

A complex form of symbolic representation of discrete sums using the uppercase Greek
letter $\Sigma$ (sigma) was introduced by Joseph Fourier in 1820 and has evolved into several
variations.

Definitions: 

The sigma expression $\sum_{i=a}^{b} f(i)$ has the value $f(a) + f(a+1) + \cdots + f(b-1) + f(b)$
if $a \le b$ ($a, b \in \mathcal{Z}$), and 0 otherwise. In this expression, $i$ is the index of summation
or summation variable, which ranges from the lower limit $a$ to the upper limit $b$.
The interval $[a, b]$ is the interval of summation, and $f(i)$ is a term or summand of
the summation.

A sigma expression $S_n = \sum_{i=0}^{n} f(i)$ is in standardized form if the lower limit is zero
and the upper limit is an integer-valued expression.

A sigma expression $\sum_{k \in K} g(k)$ over the set $K$ has as its value the sum of all the
values $g(k)$, where $k \in K$.

A closed form for a sigma expression with an indefinite number of terms is an algebraic
expression with a fixed number of terms, whose value equals the sum.

A partial sum of the (standardized) sigma expression $S_n = \sum_{i=0}^{n} f(i)$ is the sigma
expression $S_k = \sum_{i=0}^{k} f(i)$, where $0 \le k \le n$.

An iterated sum or multiple sum is an expression with two or more sigmas, as exemplified
by the double sum $\sum_{i=c}^{d} \sum_{j=a}^{b} f(i, j)$. Evaluation proceeds from the innermost
sigma outward.

A lower or upper limit for an inner sum of an iterated sum is dependent if it depends 
on an outer variable. Otherwise, that limit is independent. 

Examples: 

1. The sum $f(1) + f(2) + f(3) + f(4) + f(5)$ may be represented as $\sum_{i=1}^{5} f(i)$.

2. Sometimes the summand is written as an expression, such as $\sum_{n=1}^{k} (n^2 + n)$, which
means the same as $\sum_{n=1}^{k} f(n)$, where $f(n) = n^2 + n$. Brackets or parentheses can be
used to distinguish what is in the summand of such an "anonymous function" from
whatever is written to the immediate right of the sigma expression. They may be
omitted when such a summand is very simple.

3. Sometimes the property defining the indexing set is written underneath the $\Sigma$, as in
the expressions $\sum_{1 \le k \le n} a_k$ or $\sum_{k \in K} b_k$.

4. The right side of the equation $\sum_{j=0}^{n} x^j = \frac{x^{n+1} - 1}{x - 1}$ is a closed form for the sigma
expression on the left side.

5. The operational meaning of the multiple sum with independent limits $\sum_{i=1}^{3} \sum_{j=2}^{4} \frac{i}{j}$
is first to expand the inner sum, obtaining the single sum $\sum_{i=1}^{3} [\frac{i}{2} + \frac{i}{3} + \frac{i}{4}]$. Expansion
of the outer sum then yields $[\frac{1}{2} + \frac{1}{3} + \frac{1}{4}] + [\frac{2}{2} + \frac{2}{3} + \frac{2}{4}] + [\frac{3}{2} + \frac{3}{3} + \frac{3}{4}] = \frac{13}{2}$.

6. The multiple sum with dependent limits $\sum_{i=1}^{3} \sum_{j=i}^{4} \frac{i}{j}$ is evaluated by first expanding
the inner sum, obtaining $[\frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}] + [\frac{2}{2} + \frac{2}{3} + \frac{2}{4}] + [\frac{3}{3} + \frac{3}{4}] = 6$.
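
Iterated sums map directly onto nested loops, with a dependent limit becoming an inner
range that refers to the outer variable. The sketch below evaluates the two double sums of
Examples 5 and 6 exactly, assuming the summand is $i/j$ with $i$ running from 1 to 3 and
$j$ running from 2 to 4 (independent limits) or from $i$ to 4 (dependent limits), as the
bracketed expansions indicate.

```python
from fractions import Fraction

# Independent limits: the inner range does not mention i.
independent = sum(Fraction(i, j) for i in range(1, 4) for j in range(2, 5))

# Dependent limits: the inner lower limit is the outer variable i.
dependent = sum(Fraction(i, j) for i in range(1, 4) for j in range(i, 5))
```

Under these assumptions the first sum evaluates to 13/2 and the second to 6.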


3.5.2 ELEMENTARY TRANSFORMATION RULES FOR SUMS 

Sums can be transformed using a few simple rules. A well-chosen sequence of transfor- 
mations often simplifies evaluation. 


Facts: 


1. Distributivity rule: $\sum_{k \in K} c a_k = c \sum_{k \in K} a_k$, for $c$ a constant.

2. Associativity rule: $\sum_{k \in K} (a_k + b_k) = \sum_{k \in K} a_k + \sum_{k \in K} b_k$.

3. Rearrangement rule: $\sum_{k \in K} a_k = \sum_{k \in K} a_{p(k)}$, where $p$ is a permutation of the integers
in $K$.

4. Telescoping for sequences: For any sequence $\{ a_j \mid j = 0, 1, \ldots \}$,
$\sum_{i=m}^{n} (a_{i+1} - a_i) = a_{n+1} - a_m$.

5. Telescoping for functions: For any function $f: \mathcal{N} \to \mathcal{R}$, $\sum_{i=m}^{n} \Delta f(i) = f(n+1) - f(m)$.

6. Perturbation method: Given a standardized sum $S_n = \sum_{i=0}^{n} f(i)$, form the equation

$$\sum_{i=0}^{n} f(i) + f(n+1) = f(0) + \sum_{i=1}^{n+1} f(i) = f(0) + \sum_{i=0}^{n} f(i+1).$$

Algebraic manipulation often leads to a closed form for $S_n$.

7. Interchanging independent indices of a double sum: When the lower and upper
limits of the inner variable of a double sum are independent of the outer variable, the
order of summation can be changed, simply by swapping the inner sigma, limits and
all, with the outer sigma. That is,

$$\sum_{i=c}^{d} \sum_{j=a}^{b} f(i, j) = \sum_{j=a}^{b} \sum_{i=c}^{d} f(i, j).$$

8. Interchanging dependent indices of a double sum: When either the lower or upper
limit of the inner variable $j$ of a double sum of an expression $f(i, j)$ is dependent on the
outer variable $i$, the order of summation can still be changed by swapping the inner sum
with the outer sum. However, the limits of the new inner variable $i$ must be written
as functions of the new outer variable $j$ so that the entire set of pairs $(i, j)$ over which
$f(i, j)$ is summed is the same as before. One particular case of interest is the interchange

$$\sum_{i=1}^{n} \sum_{j=i}^{n} f(i, j) = \sum_{j=1}^{n} \sum_{i=1}^{j} f(i, j).$$


Examples: 

1. The following summation can be evaluated using Fact 4 (telescoping for sequences):

$$\sum_{i=1}^{n} \frac{1}{i(i+1)} = -\sum_{i=1}^{n} \left( \frac{1}{i+1} - \frac{1}{i} \right) = 1 - \frac{1}{n+1}.$$

2. Evaluate $S_n = \sum_{i=0}^{n} x^i$, using the perturbation method.

$$\sum_{i=0}^{n} x^i + x^{n+1} = x^0 + \sum_{i=1}^{n+1} x^i = 1 + x \sum_{i=1}^{n+1} x^{i-1},$$

$$S_n + x^{n+1} = 1 + x \sum_{i=0}^{n} x^i = 1 + x S_n,$$

giving $S_n = \frac{x^{n+1} - 1}{x - 1}$.

3. Evaluate $S_n = \sum_{i=0}^{n} i 2^i$, using the perturbation method.

$$\sum_{i=0}^{n} i 2^i + (n+1) 2^{n+1} = 0 \cdot 2^0 + \sum_{i=1}^{n+1} i 2^i = \sum_{i=0}^{n} (i+1) 2^{i+1},$$

$$S_n + (n+1) 2^{n+1} = 2 \sum_{i=0}^{n} i 2^i + 2 \sum_{i=0}^{n} 2^i = 2 S_n + 2(2^{n+1} - 1),$$

giving $S_n = (n+1) 2^{n+1} - 2(2^{n+1} - 1) = (n-1) 2^{n+1} + 2$.

4. Interchange independent indices of a double sum:

$$\sum_{i=1}^{3} \sum_{j=2}^{4} \frac{i}{j} = \sum_{j=2}^{4} \sum_{i=1}^{3} \frac{i}{j}
  = \sum_{j=2}^{4} \left[ \frac{1}{j} + \frac{2}{j} + \frac{3}{j} \right] = \sum_{j=2}^{4} \frac{6}{j}
  = 6 \sum_{j=2}^{4} \frac{1}{j} = 6 \left[ \frac{1}{2} + \frac{1}{3} + \frac{1}{4} \right] = \frac{13}{2}.$$

5. Interchange dependent indices of a double sum:

$$\sum_{i=1}^{3} \sum_{j=i}^{3} \frac{i}{j} = \sum_{j=1}^{3} \sum_{i=1}^{j} \frac{i}{j}
  = \sum_{j=1}^{3} \frac{1}{j} \sum_{i=1}^{j} i = \frac{1}{1} \cdot 1 + \frac{1}{2} \cdot 3 + \frac{1}{3} \cdot 6 = \frac{9}{2}.$$

3.5.3 ANTIDIFFERENCES AND SUMMATION FORMULAS 

Some standard combinatorial functions analogous to polynomials and exponential func- 
tions facilitate the development of a calculus of finite differences, analogous to the 
differential calculus of continuous mathematics. The fundamental theorem of discrete 
calculus is useful in deriving a number of summation formulas. 

Definitions: 

An antidifference of the function $f$ is any function $g$ such that $\Delta g = f$, where $\Delta$ is
the difference operator (§3.4.1). The notation $\Delta^{-1} f$ denotes any such function.

The indefinite sum of the function $f$ is the infinite family of all antidifferences of $f$.
The notation $\sum f(x) \delta x + c$ is sometimes used for the indefinite sum to emphasize the
analogy with integration.


Facts: 


1. Fundamental theorem of discrete calculus:

$$\sum_{k=a}^{b} f(k) = \Delta^{-1} f(k) \Big|_a^{b+1} = \Delta^{-1} f(b+1) - \Delta^{-1} f(a).$$

Note: The upper evaluation point is one more than the upper limit of the sum.

2. Linearity: $\Delta^{-1}(\alpha f + \beta g) = \alpha \Delta^{-1} f + \beta \Delta^{-1} g$, for any constants $\alpha$ and $\beta$.

3. Summation by parts:

$$\sum_{i=a}^{b} f(i) \Delta g(i) = f(b+1) g(b+1) - f(a) g(a) - \sum_{i=a}^{b} g(i+1) \Delta f(i).$$

This result, which generalizes Fact 5 of §3.5.2, is a direct analogue of integration by
parts in continuous analysis.

4. Abel's transformation:

$$\sum_{k=1}^{n} f(k) g(k) = f(n+1) \sum_{k=1}^{n} g(k) - \sum_{k=1}^{n} \left( \Delta f(k) \sum_{r=1}^{k} g(r) \right).$$

5. The following table gives the antidifferences of selected functions. In this table, $H_x$
indicates the harmonic sum (§3.4.1), $x^{\underline{n}}$ is the $n$th falling power of $x$ (§3.4.2), and ${n \brace k}$
is a Stirling subset number (§2.5.2).



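6. The following table gives finite sums of selected functions.

    summation                               formula
    $\sum_{k=1}^{n} k^{\underline{m}}$      $\frac{(n+1)^{\underline{m+1}}}{m+1}$, $m \ne -1$
    $\sum_{k=1}^{n} k^m$                    $\sum_{j=1}^{m} {m \brace j} \frac{(n+1)^{\underline{j+1}}}{j+1}$
    $\sum_{k=0}^{n} a^k$                    $\frac{a^{n+1} - 1}{a - 1}$, $a \ne 1$
    $\sum_{k=1}^{n} k a^k$                  $\frac{(a-1)(n+1) a^{n+1} - a^{n+2} + a}{(a-1)^2}$, $a \ne 1$
    $\sum_{k=1}^{n} \sin k$                 $\frac{\sin(\frac{n+1}{2}) \sin(\frac{n}{2})}{\sin(\frac{1}{2})}$
    $\sum_{k=1}^{n} \cos k$                 $\frac{\cos(\frac{n+1}{2}) \sin(\frac{n}{2})}{\sin(\frac{1}{2})}$
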

Examples: 

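1. $\sum_{k=1}^{n} k^3 = \sum_{k=1}^{n} (k^{\underline{1}} + 3k^{\underline{2}} + k^{\underline{3}})
= \left( \frac{k^{\underline{2}}}{2} + k^{\underline{3}} + \frac{k^{\underline{4}}}{4} \right) \Big|_1^{n+1} = \frac{n^2 (n+1)^2}{4}$.

2. To evaluate $\sum_{k=1}^{n} k(k+2)(k+3)$, first rewrite its summand:

$$\sum_{k=1}^{n} k(k+2)(k+3) = \Delta^{-1} [(k+1-1)(k+2)(k+3)] \Big|_1^{n+1}$$

$$= \left[ \Delta^{-1} (k+3)^{\underline{3}} - \Delta^{-1} (k+3)^{\underline{2}} \right] \Big|_1^{n+1}
  = \left[ \frac{(k+3)^{\underline{4}}}{4} - \frac{(k+3)^{\underline{3}}}{3} \right] \Big|_1^{n+1}$$

$$= \frac{(n+4)^{\underline{4}}}{4} - \frac{(n+4)^{\underline{3}}}{3} + 2$$

$$= \frac{(n+4)(n+3)(n+2)(3n-1) + 24}{12}.$$

3. $\sum_{k=1}^{n} k 3^k = \Delta^{-1} (k 3^k) \Big|_1^{n+1} = 3^k \left[ \frac{k}{2} - \frac{3}{4} \right] \Big|_1^{n+1} = \frac{(2n-1) 3^{n+1} + 3}{4}$.

4. Summation by parts can be used to calculate $\sum_{j=0}^{n} j x^j$, using $f(j) = j$ and $\Delta g(j) =
x^j$. Thus $g(j) = x^j/(x-1)$, and Fact 3 yields

$$\sum_{j=0}^{n} j x^j = \frac{(n+1) x^{n+1}}{x-1} - 0 - \sum_{j=0}^{n} \frac{x^{j+1}}{x-1}
  = \frac{(n+1) x^{n+1}}{x-1} - \frac{x}{x-1} \sum_{j=0}^{n} x^j$$

$$= \frac{(n+1) x^{n+1}}{x-1} - \frac{x}{x-1} \cdot \frac{x^{n+1} - 1}{x-1}
  = \frac{(n+1)(x-1) x^{n+1} - x^{n+2} + x}{(x-1)^2}.$$

5. Summation by parts also yields an antidifference of $x 3^x$:

$$\Delta^{-1}(x 3^x) = \Delta^{-1}\left( x \Delta(\tfrac{1}{2} \cdot 3^x) \right) = \tfrac{1}{2} x 3^x - \Delta^{-1}\left( \tfrac{1}{2} \cdot 3^{x+1} \cdot 1 \right)
  = 3^x \left( \frac{x}{2} - \frac{3}{4} \right).$$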

3.5.4 STANDARD SUMS 

Many useful summation formulas are derivable by combinations of elementary manipu- 
lation and finite calculus. Such sums can be expressed in various ways, using different 
combinatorial coefficients. (See §3.1.8.) 

Definition: 

The power sum $S^k(n) = \sum_{j=1}^{n} j^k = 1^k + 2^k + 3^k + \cdots + n^k$ is the sum of the $k$th
powers of the first $n$ positive integers.

Facts:

1. $S^k(n)$ is a polynomial in $n$ of degree $k+1$ with leading coefficient $\frac{1}{k+1}$. The continuous
analogue of this fact is the familiar $\int_a^b x^k \, dx = \frac{1}{k+1}(b^{k+1} - a^{k+1})$.

2. The power sum $S^k(n)$ can be expressed using the Bernoulli polynomials (§3.1.4) as

$$S^k(n) = \frac{1}{k+1} [B_{k+1}(n+1) - B_{k+1}(0)].$$

3. When $S^k(n)$ is expressed in terms of binomial coefficients with the second entry
fixed at $k+1$, the coefficients are the Eulerian numbers (§3.1.5).

$$S^k(n) = \sum_{i=0}^{k-1} E(k, i) \binom{n+i+1}{k+1}.$$

4. When $S^k(n)$ is expressed in terms of binomial coefficients with the first entry fixed
at $n+1$, the coefficients are products of factorials and Stirling subset numbers (§2.5.2).

$$S^k(n) = \sum_{i=1}^{k} i! {k \brace i} \binom{n+1}{i+1}.$$

5. Formulas for the power sums described in Facts 1, 3, and 4 are given in Tables 1-3,
respectively, for small values of $k$.


Examples: 

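1. To find the third power sum $S^3(n) = \sum_{j=1}^{n} j^3$ via Fact 2, use the Bernoulli polynomial
$B_4(x) = x^4 - 2x^3 + x^2 - \frac{1}{30}$ from Table 5 of §3.1.4. Thus

$$S^3(n) = \tfrac{1}{4} [B_4(x)] \Big|_0^{n+1} = \frac{(n+1)^4 - 2(n+1)^3 + (n+1)^2}{4} = \frac{n^2 (n+1)^2}{4}.$$

Table 1 Sums of powers of integers.

    summation                  formula
    $\sum_{j=1}^{n} j$         $\frac{1}{2} n(n+1)$
    $\sum_{j=1}^{n} j^2$       $\frac{1}{6} n(n+1)(2n+1)$
    $\sum_{j=1}^{n} j^3$       $\frac{1}{4} n^2 (n+1)^2$
    $\sum_{j=1}^{n} j^4$       $\frac{1}{30} n(n+1)(2n+1)(3n^2 + 3n - 1)$
    $\sum_{j=1}^{n} j^5$       $\frac{1}{12} n^2 (n+1)^2 (2n^2 + 2n - 1)$
    $\sum_{j=1}^{n} j^6$       $\frac{1}{42} n(n+1)(2n+1)(3n^4 + 6n^3 - 3n + 1)$
    $\sum_{j=1}^{n} j^7$       $\frac{1}{24} n^2 (n+1)^2 (3n^4 + 6n^3 - n^2 - 4n + 2)$
    $\sum_{j=1}^{n} j^8$       $\frac{1}{90} n(n+1)(2n+1)(5n^6 + 15n^5 + 5n^4 - 15n^3 - n^2 + 9n - 3)$
    $\sum_{j=1}^{n} j^9$       $\frac{1}{20} n^2 (n+1)^2 (2n^6 + 6n^5 + n^4 - 8n^3 + n^2 + 6n - 3)$

Table 2 Sums of powers and Eulerian numbers.

    summation                  formula
    $\sum_{j=1}^{n} j$         $\binom{n+1}{2}$
    $\sum_{j=1}^{n} j^2$       $\binom{n+1}{3} + \binom{n+2}{3}$
    $\sum_{j=1}^{n} j^3$       $\binom{n+1}{4} + 4\binom{n+2}{4} + \binom{n+3}{4}$
    $\sum_{j=1}^{n} j^4$       $\binom{n+1}{5} + 11\binom{n+2}{5} + 11\binom{n+3}{5} + \binom{n+4}{5}$
    $\sum_{j=1}^{n} j^5$       $\binom{n+1}{6} + 26\binom{n+2}{6} + 66\binom{n+3}{6} + 26\binom{n+4}{6} + \binom{n+5}{6}$

Table 3 Sums of powers and Stirling subset numbers.

    summation                  formula
    $\sum_{j=1}^{n} j$         $\binom{n+1}{2}$
    $\sum_{j=1}^{n} j^2$       $\binom{n+1}{2} + 2\binom{n+1}{3}$
    $\sum_{j=1}^{n} j^3$       $\binom{n+1}{2} + 6\binom{n+1}{3} + 6\binom{n+1}{4}$
    $\sum_{j=1}^{n} j^4$       $\binom{n+1}{2} + 14\binom{n+1}{3} + 36\binom{n+1}{4} + 24\binom{n+1}{5}$
    $\sum_{j=1}^{n} j^5$       $\binom{n+1}{2} + 30\binom{n+1}{3} + 150\binom{n+1}{4} + 240\binom{n+1}{5} + 120\binom{n+1}{6}$

2. Power sums can be found using antidifferences and Stirling numbers of both types.
For example, to find $S^3(n) = \sum_{x=1}^{n} x^3$ first compute

$$\Delta^{-1} x^3 = \Delta^{-1} \left( {3 \brace 1} x^{\underline{1}} + {3 \brace 2} x^{\underline{2}} + {3 \brace 3} x^{\underline{3}} \right)
  = \tfrac{1}{2} x^{\underline{2}} + x^{\underline{3}} + \tfrac{1}{4} x^{\underline{4}}.$$

Each term $x^{\underline{n}}$ is then expressed in terms of ordinary powers of $x$:

$$x^{\underline{2}} = {2 \brack 2} x^2 - {2 \brack 1} x^1 = x^2 - x,$$
$$x^{\underline{3}} = {3 \brack 3} x^3 - {3 \brack 2} x^2 + {3 \brack 1} x^1 = x^3 - 3x^2 + 2x,$$
$$x^{\underline{4}} = {4 \brack 4} x^4 - {4 \brack 3} x^3 + {4 \brack 2} x^2 - {4 \brack 1} x^1 = x^4 - 6x^3 + 11x^2 - 6x,$$

so $\Delta^{-1} x^3 = \frac{1}{2}(x^2 - x) + (x^3 - 3x^2 + 2x) + \frac{1}{4}(x^4 - 6x^3 + 11x^2 - 6x) = \frac{1}{4}(x^4 - 2x^3 + x^2)$.

Evaluating this antidifference between the limits $x = 1$ and $x = n + 1$ gives $S^3(n) =
\frac{1}{4} n^2 (n+1)^2$. See §3.5.3, Fact 1.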
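
Fact 4 gives a uniform way to compute any power sum from Stirling subset numbers and
binomial coefficients. The following sketch implements it and compares the result with
direct summation. The function names are illustrative, and `stirling_subset` uses the
standard recurrence for subset numbers rather than the tables of §2.5.2.

```python
from math import comb, factorial


def stirling_subset(n, k):
    """Stirling subset number: partitions of an n-set into exactly k blocks."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling_subset(n - 1, k) + stirling_subset(n - 1, k - 1)


def power_sum(k, n):
    """Sum of j^k for j = 1..n, as the sum over i of i! S(k, i) C(n + 1, i + 1)."""
    return sum(
        factorial(i) * stirling_subset(k, i) * comb(n + 1, i + 1)
        for i in range(1, k + 1)
    )


# Coefficients i! * S(5, i) for i = 1..5, as in the last row of Table 3.
row_5 = [factorial(i) * stirling_subset(5, i) for i in range(1, 6)]
```

The list `row_5` reproduces the coefficients 1, 30, 150, 240, 120 in the last row of
Table 3.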


3.6 ASYMPTOTICS OF SEQUENCES 


An exact formula for the terms of a sequence may be unwieldy. For example, it is difficult
to estimate the magnitude of the central binomial coefficient $\binom{2n}{n} = \frac{(2n)!}{(n!)^2}$ from the
definition of the factorial function alone. On the other hand, Stirling's approximation
formula (§3.6.2) leads to the asymptotic estimate $\binom{2n}{n} \sim \frac{4^n}{\sqrt{\pi n}}$. In applying asymptotic
analysis, various "rules of thumb" help bypass tedious derivations. In practice, these rules
almost always lead to correct results that can be proved by more rigorous methods. In
the following discussions of asymptotic properties, the parameter tending to infinity is
denoted by $n$. Both the subscripted notation $a_n$ and the functional notation $f(n)$ are
used to denote a sequence. The notation $f(n) \sim g(n)$ ($f$ is asymptotic to $g$) means that
$f(n) \ne 0$ for sufficiently large $n$ and $\lim_{n \to \infty} \frac{g(n)}{f(n)} = 1$.
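
The quality of such an estimate is easy to inspect numerically. The sketch below compares
the central binomial coefficient with the standard Stirling-based estimate
$4^n/\sqrt{\pi n}$. The function name is illustrative, and $n$ is kept below about 500
so that $4^n$ still fits in a floating-point number.

```python
from math import comb, pi, sqrt


def central_binomial_ratio(n):
    """Ratio of C(2n, n) to the estimate 4^n / sqrt(pi * n)."""
    return comb(2 * n, n) / (4 ** n / sqrt(pi * n))


ratios = [central_binomial_ratio(n) for n in (10, 100, 400)]
```

The ratios increase toward 1 from below, which is what the relation $\sim$ asserts.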


3.6.1 APPROXIMATE SOLUTIONS TO RECURRENCES 

Although recurrences are a natural source of sequences, they often yield only crude 
asymptotic information. As a general rule, it helps to derive a summation or a generating 
function from the recurrence before obtaining asymptotic estimates. 

Facts: 

1. Rule of thumb: Suppose that a recurrence for a sequence $a_n$ can be transformed
into a recurrence for a related sequence $b_n$, so that the transformed sequence is
approximately homogeneous and linear with constant coefficients (§3.3). Suppose also that $\rho$
is the largest positive root of the characteristic equation for the homogeneous constant
coefficient recurrence. Then it is probably true that $\frac{b_{n+1}}{b_n} \sim \rho$; i.e., $b_n$ grows roughly
like $\rho^n$.

2. Nonlinear recurrences are not covered by Fact 1.

3. Recurrences without fixed degree such as divide-and-conquer recurrences (§3.3.5),
in which the difference between the largest and smallest subscripts is unbounded, are
not covered by Fact 1. See [GrKn90, Ch. 2] for appropriate techniques.

Examples:

1. Consider the recurrence $D_{n+1} = n(D_n + D_{n-1})$ for $n \ge 1$, and define $d_n = \frac{D_n}{n!}$. Then
$d_{n+1} = \frac{n}{n+1} d_n + \frac{1}{n+1} d_{n-1}$, which is quite close to the constant coefficient recurrence
$d_{n+1} = d_n$. Since the characteristic root for this latter approximate recurrence is $\rho = 1$,
Fact 1 suggests that $\frac{d_{n+1}}{d_n} \sim 1$, which implies that $d_n$ is close to constant. Thus, we
expect the original variable $D_n$ to grow like $n!$. Indeed, if the initial conditions are
$D_0 = D_1 = 1$, then $D_n = n!$. With initial conditions $D_0 = 1$, $D_1 = 0$, then $D_n$ is the
number of derangements of $n$ objects (§2.4.2), in which case $D_n$ is the closest integer
to $\frac{n!}{e}$ for $n \ge 1$.

2. The accuracy of Example 1 is unusual. By way of contrast, the number $I_n$ of
involutions of an $n$-set (§2.8.1) satisfies the recurrence $I_{n+1} = I_n + n I_{n-1}$ for $n \ge 1$ with
$I_0 = I_1 = 1$. By defining $i_n = I_n/(n!)^{1/2}$, then

$$i_{n+1} = \frac{i_n}{(n+1)^{1/2}} + \frac{i_{n-1}}{(1 + 1/n)^{1/2}},$$

which is nearly the same as the constant coefficient recurrence $i_{n+1} = i_{n-1}$. The
characteristic equation $\rho^2 = 1$ has roots $\pm 1$, so Fact 1 suggests that $i_n$ is nearly constant and
hence that $I_n$ grows like $\sqrt{n!}$. The approximation in this case is not so good, because
$I_n/\sqrt{n!} \sim e^{\sqrt{n}}/(8\pi e n)^{1/4}$, which is not a constant.
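
Both examples can be explored numerically by iterating the recurrences with exact integer
arithmetic and then comparing against the stated estimates. This is a minimal sketch with
illustrative function names; the comparison for involutions uses floating-point
arithmetic, so it only indicates how close the ratio is to 1.

```python
from math import e, exp, factorial, pi, sqrt


def derangements(n_max):
    """D_0..D_{n_max} from D_{n+1} = n(D_n + D_{n-1}), D_0 = 1, D_1 = 0."""
    d = [1, 0]
    for n in range(1, n_max):
        d.append(n * (d[n] + d[n - 1]))
    return d


def involutions(n_max):
    """I_0..I_{n_max} from I_{n+1} = I_n + n I_{n-1}, I_0 = I_1 = 1."""
    inv = [1, 1]
    for n in range(1, n_max):
        inv.append(inv[n] + n * inv[n - 1])
    return inv


def involution_ratio(n):
    """I_n / sqrt(n!) divided by the estimate e^sqrt(n) / (8 pi e n)^(1/4)."""
    exact = involutions(n)[n] / sqrt(factorial(n))
    estimate = exp(sqrt(n)) / (8 * pi * e * n) ** 0.25
    return exact / estimate
```

For derangements, rounding $n!/e$ recovers $D_n$ exactly for small $n$. For involutions
the ratio approaches 1 only slowly, which illustrates the remark that the accuracy of
Example 1 is unusual.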


3.6.2 ANALYTIC METHODS FOR DERIVING ASYMPTOTIC ESTIMATES 

Concepts and methods from continuous mathematics can be useful in analyzing the 
asymptotic behavior of sequences. 

Definitions: 

The radius of convergence of the series $\sum a_n x^n$ is the number $r$ such that the series
converges for all $|x| < r$ and diverges for all $|x| > r$, where $0 \le r \le \infty$.

The gamma function is the function $\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t} \, dt$.

Facts: 

1. Stirling's approximation: $n! \sim \sqrt{2\pi n}\,(n/e)^n$.

2. $\Gamma(x+1) = x\Gamma(x)$, $\Gamma(n+1) = n!$, and $\Gamma(\tfrac12) = \sqrt{\pi}$.

3. The radius of convergence of $\sum a_n x^n$ is given by $\frac{1}{r} = \limsup_{n\to\infty} |a_n|^{1/n}$.

4. From Fact 3, it follows that $|a_n|$ tends to behave like $r^{-n}$. Most analytic methods
are refinements of this idea.

5. The behavior of $f(z)$ near singularities on its circle of convergence determines the
dominant asymptotic behavior of the coefficients of $f$. Estimates are often based on
Cauchy's integral formula: $a_n = \frac{1}{2\pi i}\oint f(z)\, z^{-n-1}\, dz$.

6. Rule of thumb: Consider the set of values of $x$ for which $f(x) = \sum a_n x^n$ is either
infinite or undefined, or involves computing a nonintegral power of 0. The absolute
value of the least such $x$ is normally the radius of convergence of $f(x)$. If there is no
such $x$, then $r = \infty$.

7. Rule of thumb: Suppose that $0 < r < \infty$ is the radius of convergence of $f(x)$, that
$g(x)$ has a larger radius of convergence, and that
\[ f(x) - g(x) \sim A\left(-\ln\left(1 - \tfrac{x}{r}\right)\right)^{b} \left(1 - \tfrac{x}{r}\right)^{c} \quad \text{as } x \to r^- \]
for some constants $A$, $b$, and $c$, where it is not the case that both $b = 0$ and $c$ is a
nonnegative integer. (Often $g(x) = 0$.) Then it is probably true that
\[ a_n \sim \begin{cases} A\binom{n-c-1}{n}(\ln n)^{b}\, r^{-n}, & \text{if } c \ne 0, \\ A\,b\,(\ln n)^{b-1}\, r^{-n}/n, & \text{if } c = 0. \end{cases} \]

8. Rule of thumb: Let $a(x) = \frac{d \ln f(x)}{d \ln x} = \frac{x f'(x)}{f(x)}$ and $b(x) = \frac{d\, a(x)}{d \ln x} = x a'(x)$. Suppose that $a(r_n) = n$ has
a solution with $0 < r_n < r$ and that $b(r_n) \in o(n^2)$. Then it is probably true that
\[ a_n \sim \frac{f(r_n)\, r_n^{-n}}{\sqrt{2\pi b(r_n)}}. \]
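
Facts 1 and 2 are easy to check numerically. The short Python sketch below uses only the standard `math` module; the helper name `stirling` is an illustrative assumption. It prints the ratio of $n!$ to Stirling's approximation and evaluates the gamma function at the arguments named in Fact 2.

```python
from math import e, factorial, gamma, pi, sqrt

def stirling(n):
    """Stirling's approximation sqrt(2 pi n) (n/e)^n to n!."""
    return sqrt(2 * pi * n) * (n / e) ** n

# The ratio n! / stirling(n) tends to 1 as n grows.
for n in (1, 5, 10, 50, 100):
    print(n, factorial(n) / stirling(n))

# Fact 2: Gamma(n + 1) = n! and Gamma(1/2) = sqrt(pi).
print(gamma(6), factorial(5))
print(gamma(0.5) ** 2, pi)
```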

Examples: 

1. The number $D_n$ of derangements has the exponential generating function $f(x) = \sum D_n \frac{x^n}{n!} = \frac{e^{-x}}{1-x}$.
Since evaluation for $x = 1$ involves division by 0, it follows that $r = 1$.
Since $\frac{e^{-x}}{1-x} \sim \frac{e^{-1}}{1-x}$ as $x \to 1^-$, take $g(x) = 0$, $A = e^{-1}$, $b = 0$, and $c = -1$. Fact 7
suggests that $D_n \sim n!/e$, which is correct.

2. The number $b_n$ of left-right binary $n$-leaved trees has the generating function
$f(x) = \frac12\left(1 - \sqrt{1-4x}\right)$. (See §9.3.3, Facts 1 and 7.) In this case $r = \frac14$ since $f(\frac14)$ requires
computing a fractional power of 0. Take $g(x) = \frac12$, $A = -\frac12$, $b = 0$, and $c = \frac12$ to suspect
from Fact 7 that
\[ b_n \sim -\frac12 \binom{n-\frac32}{n} 4^n = \frac{-\Gamma(n-\frac12)\, 4^n}{2\,\Gamma(n+1)\,\Gamma(-\frac12)} \sim \frac{4^{n-1}}{\sqrt{\pi n^3}}, \]
which is valid. (Facts 1 and 2 have also been used.) This estimate converges rather
rapidly: by the time $n = 40$, the estimate is less than 1% below $b_{40}$.

3. Since $\sum x^n/n! = e^x$, $n!$ can be estimated by taking $a(x) = b(x) = x$ and $r_n = n$ in
Fact 8. This gives $\frac{1}{n!} \sim \frac{e^n n^{-n}}{\sqrt{2\pi n}}$, which is Stirling's asymptotic formula.

4. The number $B_n$ of partitions of an $n$-set (§2.5.2) satisfies $\sum B_n x^n/n! = \exp(e^x - 1)$.
In this case, $r = \infty$. Since $a(x) = xe^x$ and $b(x) = x(x+1)e^x$, it follows that $r_n$ is the
solution to $r_n \exp(r_n) = n$ and that $b(r_n) = (r_n + 1)n \sim n r_n \in o(n^2)$. Fact 8 suggests
\[ B_n \sim \frac{n!\, \exp(e^{r_n} - 1)}{r_n^{n} \sqrt{2\pi n r_n}} = \frac{n!\, \exp(n/r_n - 1)}{r_n^{n} \sqrt{2\pi n r_n}}. \]
This estimate is correct, though the estimate converges quite slowly, as shown in this
table:

n          10            20             100             200
estimate   1.49 x 10^5   6.33 x 10^13   5.44 x 10^115   7.01 x 10^275
B_n        1.16 x 10^5   5.17 x 10^13   4.76 x 10^115   6.25 x 10^275
ratio      1.29          1.22           1.14            1.12


Improved asymptotic estimates exist. 

5. Analytic methods can sometimes be used to obtain asymptotics when only a functional
equation is available. For example, if $a_n$ is the number of $n$-leaved rooted trees in
which each non-leaf node has exactly two children (with left and right not distinguished),
the generating function for $a_n$ satisfies $f(x) = x + \left(f(x)^2 + f(x^2)\right)/2$, from which it can
be deduced that $a_n \sim C n^{-3/2} r^{-n}$, where $r = 0.4026975\ldots$ and $C = 0.31877\ldots$ can
easily be computed to any desired degree of accuracy. See [BeWi91, p. 394] for more
information.
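
The estimates in Examples 2 and 4 can be reproduced with a few lines of Python. This is a sketch under stated assumptions: $b_n$ is taken to be the Catalan number $\binom{2n-2}{n-1}/n$, exact Bell numbers come from the Bell triangle, and $r_n$ is found by Newton's method started at $\ln n$. None of these helper names appear in the text.

```python
from math import comb, exp, lgamma, log, pi, sqrt

def catalan_estimate_ratio(n):
    """Ratio of the estimate 4^(n-1)/sqrt(pi n^3) to b_n = binom(2n-2, n-1)/n."""
    exact = comb(2 * n - 2, n - 1) // n
    return 4 ** (n - 1) / sqrt(pi * n ** 3) / exact

def bell_numbers(n_max):
    """Return [B_0, ..., B_n_max] computed with the Bell triangle."""
    row, bells = [1], [1]
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

def bell_estimate_ratio(n, bell_n):
    """Ratio of the Fact 8 estimate for B_n to the exact value bell_n."""
    r = log(n)                      # starting guess for r_n exp(r_n) = n
    for _ in range(60):             # Newton iteration
        r -= (r * exp(r) - n) / (exp(r) * (r + 1))
    # Work with logarithms so that n = 200 does not overflow a float.
    log_estimate = (lgamma(n + 1) + n / r - 1
                    - n * log(r) - 0.5 * log(2 * pi * n * r))
    return exp(log_estimate - log(bell_n))

bells = bell_numbers(200)
for n in (10, 20, 100, 200):
    print(n, round(bell_estimate_ratio(n, bells[n]), 2))
print(catalan_estimate_ratio(40))
```

The first loop prints the "ratio" row of the table in Example 4; the last line shows how close the Example 2 estimate is to $b_{40}$.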


3.6.3 ASYMPTOTIC ESTIMATES OF MULTIPLY-INDEXED SEQUENCES

Asymptotic estimates for multiply-indexed sequences are considerably more difficult to 
obtain. To begin with, the meaning of a formula such as
\[ \binom{n}{k} \sim \frac{2^n \exp\left(-(n-2k)^2/(2n)\right)}{\sqrt{\pi n/2}} \]
must be carefully stated, because both $n$ and $k$ are tending to $\infty$, and the formula is
valid only when this happens in such a way that $|n - 2k| \in o(n^{3/4})$.
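
The role of the side condition can be seen numerically. The Python sketch below (function names are illustrative assumptions) compares $\binom{n}{k}$ with the displayed estimate for $n = 1000$, where $n^{3/4} \approx 178$: the quotient is close to 1 when $|n - 2k|$ is small and drifts away once $|n - 2k|$ is comparable to $n^{3/4}$.

```python
from math import comb, exp, pi, sqrt

def central_binomial_estimate(n, k):
    """The estimate 2^n exp(-(n - 2k)^2 / (2n)) / sqrt(pi n / 2)."""
    return 2.0 ** n * exp(-((n - 2 * k) ** 2) / (2 * n)) / sqrt(pi * n / 2)

def ratio(n, k):
    """Exact binomial coefficient divided by the estimate."""
    return comb(n, k) / central_binomial_estimate(n, k)

n = 1000
for k in (500, 510, 530, 600):
    print(k, abs(n - 2 * k), ratio(n, k))
```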

Facts: 

1. Very little is known about how to obtain asymptotic estimates from multiply-indexed 
recurrences. 

2. Most estimates of multiple summations are based on summing over one index at a 
time. 

3. A few analytic results are available in the research literature. (See [Od95].) 


3.7 MECHANICAL SUMMATION PROCEDURES 

This section describes mechanical procedures that have been developed to evaluate 
sums of terms involving binomial coefficients and related factors. These procedures can 
not only be used to find explicit formulas for many sums, but can also be used to show 
that no simple closed formulas exist for certain sums. The invention of these mechanical 
procedures has been a surprising development in combinatorics. The material presented 
here is mostly adapted from [PeWiZe96], a comprehensive source for material on this 
topic. 


3.7.1 HYPERGEOMETRIC SERIES 


Definitions: 


A geometric series is a series of the form $\sum_{k=0}^{\infty} a_k$ where the ratio between two
consecutive terms is a constant, i.e., where the ratio $a_{k+1}/a_k$ is a constant for all
$k = 0, 1, 2, \ldots$.

A hypergeometric series is a series of the form $\sum_{k=0}^{\infty} t_k$ where $t_0 = 1$ and the ratio
of two consecutive terms is a rational function of the summation index $k$, i.e., the ratio
$\frac{t_{k+1}}{t_k} = \frac{P(k)}{Q(k)}$ where $P(k)$ and $Q(k)$ are polynomials in the integer $k$. The terms of a
hypergeometric series are called hypergeometric terms.


When the numerator $P(k)$ and denominator $Q(k)$ of this ratio are completely factored
to give
\[ \frac{P(k)}{Q(k)} = \frac{(k+a_1)(k+a_2)\cdots(k+a_p)}{(k+b_1)(k+b_2)\cdots(k+b_q)(k+1)}\, x \]
where $x$ is a constant, this hypergeometric series is denoted by
\[ {}_pF_q\!\left[\begin{matrix} a_1 & a_2 & \cdots & a_p \\ b_1 & b_2 & \cdots & b_q \end{matrix};\, x\right]. \]

Note: If there is no factor $k+1$ in the denominator $Q(k)$ when it is factored, by
convention the factor $k+1$ is added to both the numerator $P(k)$ and denominator $Q(k)$.
Also, a horizontal dash is used to indicate the absence of factors in the numerator or in
the denominator.

The hypergeometric terms $s_n$ and $t_n$ are similar, denoted $s_n \sim t_n$, if their ratio $s_n/t_n$
is a rational function of $n$. Otherwise, these terms are called dissimilar.


Facts: 


1. A geometric series is also a hypergeometric series. 

2. If $s_n$ is a hypergeometric term, then $1/s_n$ is also a hypergeometric term. (Equivalently,
if $\sum_{k=0}^{\infty} s_k$ is a hypergeometric series, then $\sum_{k=0}^{\infty} 1/s_k$ also is.)

3. In common usage, instead of stating that the series $\sum_{k=0}^{\infty} s_k$ is a hypergeometric
series, it is stated that $s_n$ is a hypergeometric term. This means exactly the same thing.

4. If $s_n$ and $t_n$ are hypergeometric terms, then $s_n \cdot t_n$ is a hypergeometric term.
(Equivalently, if $\sum_{k=0}^{\infty} s_k$ and $\sum_{k=0}^{\infty} t_k$ are hypergeometric series, then $\sum_{k=0}^{\infty} s_k t_k$ is a
hypergeometric series.)

5. If $s_n$ is a hypergeometric term and $s_n$ is not a constant, then $s_{n+1} - s_n$ is a
hypergeometric term similar to $s_n$.

6. If $s_n$ and $t_n$ are hypergeometric terms and $s_n + t_n \ne 0$ for all $n$, then $s_n + t_n$ is
hypergeometric if and only if $s_n$ and $t_n$ are similar.

7. If $t_n^{(1)}, t_n^{(2)}, \ldots, t_n^{(k)}$ are hypergeometric terms with $\sum_{i=1}^{k} t_n^{(i)} = 0$, then $t_n^{(i)} \sim t_n^{(j)}$ for
some $i$ and $j$ with $1 \le i < j \le k$.

8. A sum of a fixed number of hypergeometric terms can be expressed as a sum of 
pairwise dissimilar hypergeometric terms. 

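9. The terms of a hypergeometric series can be expressed using rising powers $a^{\overline{n}}$ (also
known as rising factorials and denoted by $(a)_n$) (see §3.4.2) as follows:
\[ {}_pF_q\!\left[\begin{matrix} a_1 & a_2 & \cdots & a_p \\ b_1 & b_2 & \cdots & b_q \end{matrix};\, x\right] = \sum_{k=0}^{\infty} \frac{a_1^{\overline{k}}\, a_2^{\overline{k}} \cdots a_p^{\overline{k}}}{b_1^{\overline{k}}\, b_2^{\overline{k}} \cdots b_q^{\overline{k}}}\, \frac{x^k}{k!}. \]

10. There are a large number of well-known hypergeometric identities (see Facts 12-
17, for example) that can be used as a starting point when a closed form for a sum of
hypergeometric terms is sought.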

11. There are many rules that transform a hypergeometric series with one parameter 
set into a different hypergeometric series with a second parameter set. Such transfor- 
mation rules can be helpful in constructing closed forms for sums of hypergeometric 
terms. 


12. ${}_1F_1\!\left[\begin{matrix} 1 \\ 1 \end{matrix};\, x\right] = e^x$.


13. ${}_1F_0\!\left[\begin{matrix} a \\ - \end{matrix};\, x\right] = \frac{1}{(1-x)^a}$.


14. Gauss's ${}_2F_1$ identity: If $b$ is zero or a negative integer or the real part of $c - a - b$
is positive, then
\[ {}_2F_1\!\left[\begin{matrix} a & b \\ c \end{matrix};\, 1\right] = \frac{\Gamma(c-a-b)\,\Gamma(c)}{\Gamma(c-a)\,\Gamma(c-b)} \]
where $\Gamma$ is the gamma function (so $\Gamma(n) = (n-1)!$ when $n$ is a positive integer).


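15. Kummer's ${}_2F_1$ identity: If $a - b + c = 1$, then
\[ {}_2F_1\!\left[\begin{matrix} a & b \\ c \end{matrix};\, -1\right] = \frac{\Gamma(\frac{b}{2}+1)\,\Gamma(b-a+1)}{\Gamma(b+1)\,\Gamma(\frac{b}{2}-a+1)}, \]
and when $b$ is a negative integer, this can be expressed as
\[ {}_2F_1\!\left[\begin{matrix} a & b \\ c \end{matrix};\, -1\right] = 2\cos\!\left(\frac{\pi b}{2}\right) \frac{\Gamma(|b|)\,\Gamma(b-a+1)}{\Gamma(\frac{|b|}{2})\,\Gamma(\frac{b}{2}-a+1)}. \]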


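16. Saalschütz's ${}_3F_2$ identity: If $d + e = a + b + c + 1$ and $c$ is a negative integer, then
\[ {}_3F_2\!\left[\begin{matrix} a & b & c \\ d & e \end{matrix};\, 1\right] = \frac{(d-a)^{\overline{|c|}}\,(d-b)^{\overline{|c|}}}{d^{\overline{|c|}}\,(d-a-b)^{\overline{|c|}}}. \]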


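17. Dixon's identity: If $1 + \frac{a}{2} - b - c > 0$, $d = a - b + 1$, and $e = a - c + 1$, then
\[ {}_3F_2\!\left[\begin{matrix} a & b & c \\ d & e \end{matrix};\, 1\right] = \frac{(\frac{a}{2})!\,(a-b)!\,(a-c)!\,(\frac{a}{2}-b-c)!}{a!\,(\frac{a}{2}-b)!\,(\frac{a}{2}-c)!\,(a-b-c)!}. \]
The more familiar form of this identity reads
\[ \sum_{k} (-1)^k \binom{a+b}{a+k}\binom{a+c}{c+k}\binom{b+c}{b+k} = \frac{(a+b+c)!}{a!\,b!\,c!}. \]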


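18. Clausen's ${}_4F_3$ identity: If $d$ is a negative integer or zero and $a + b + c - d = \frac12$,
$e = a + b + \frac12$, and $a + f = d + 1 = b + g$, then
\[ {}_4F_3\!\left[\begin{matrix} a & b & c & d \\ e & f & g \end{matrix};\, 1\right] = \frac{(2a)^{\overline{|d|}}\,(a+b)^{\overline{|d|}}\,(2b)^{\overline{|d|}}}{(2a+2b)^{\overline{|d|}}\,a^{\overline{|d|}}\,b^{\overline{|d|}}}. \]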


Examples: 

1. The series $\sum_{k=0}^{\infty} 3 \cdot (-5)^k$ is a geometric series. The series $\sum_{n=0}^{\infty} n 2^n$ is not a
geometric series.

2. The series $\sum_{k=0}^{\infty} t_k$ is a hypergeometric series when $t_k$ equals $2^k$, $(k+1)^2$, or
$\frac{1}{(2k+1)!\,(k+3)!}$, but is not hypergeometric when $t_k = 2^k + 1$.


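3. The series $\sum_{k=0}^{\infty} \frac{3^k}{k!^4}$ equals ${}_0F_3\!\left[\begin{matrix} - \\ 1 & 1 & 1 \end{matrix};\, 3\right]$ since the ratio of the $(k+1)$st and $k$th
terms is $\frac{3}{(k+1)^4}$.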


4. A closed form for $S_n = \sum_{k=0}^{2n} (-1)^k \binom{2n}{k}^2$ can be found by first noting that
\[ S_n = {}_2F_1\!\left[\begin{matrix} -2n & -2n \\ 1 \end{matrix};\, -1\right] \]
since the ratio between successive terms of the sum is $\frac{-(k-2n)^2}{(k+1)^2}$.
This shows that Kummer's ${}_2F_1$ identity can be invoked with $a = -2n$, $b = -2n$, and
$c = 1$, producing the equality $S_n = \frac{2(-1)^n (2n-1)!}{n!\,(n-1)!} = (-1)^n \binom{2n}{n}$.

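5. An example of a transformation rule for hypergeometric functions is provided by
\[ {}_2F_1\!\left[\begin{matrix} a & b \\ c \end{matrix};\, x\right] = (1-x)^{c-a-b}\; {}_2F_1\!\left[\begin{matrix} c-a & c-b \\ c \end{matrix};\, x\right]. \]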
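
The closed form of Example 4 and the familiar form of Dixon's identity (Fact 17) can both be confirmed by direct summation for small parameters. The following Python sketch does this with exact integers; the function names are illustrative assumptions.

```python
from math import comb, factorial

def sign(k):
    """(-1)^k for any integer k, kept as an integer."""
    return 1 if k % 2 == 0 else -1

def alternating_square_sum(n):
    """S_n = sum_{k=0}^{2n} (-1)^k binom(2n, k)^2."""
    return sum(sign(k) * comb(2 * n, k) ** 2 for k in range(2 * n + 1))

def dixon_sum(a, b, c):
    """Left side of the familiar form of Dixon's identity.
    Terms vanish unless -min(a, b, c) <= k <= min(a, b, c)."""
    m = min(a, b, c)
    return sum(
        sign(k) * comb(a + b, a + k) * comb(a + c, c + k) * comb(b + c, b + k)
        for k in range(-m, m + 1)
    )

for n in range(5):
    print(n, alternating_square_sum(n), (-1) ** n * comb(2 * n, n))
print(dixon_sum(2, 3, 4), factorial(9) // (factorial(2) * factorial(3) * factorial(4)))
```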


3.7.2 ALGORITHMS THAT PRODUCE CLOSED FORMS FOR SUMS OF HYPERGEOMETRIC TERMS


Definitions: 


A function $F(n,k)$ is called doubly hypergeometric if both $\frac{F(n+1,k)}{F(n,k)}$ and $\frac{F(n,k+1)}{F(n,k)}$
are rational functions of $n$ and $k$.


A function $F(n,k)$ is a proper hypergeometric term if it can be expressed as
\[ F(n,k) = P(n,k)\, \frac{\prod_{i=1}^{G} (a_i n + b_i k + c_i)!}{\prod_{i=1}^{H} (u_i n + v_i k + w_i)!}\, x^k \]
where $x$ is a variable, $P(n,k)$ is a polynomial in $n$ and $k$, $G$ and $H$ are nonnegative
integers, and all the coefficients $a_i$, $b_i$, $u_i$, and $v_i$ are integers.

A function $F(n,k)$ of the form
\[ F(n,k) = P(n,k)\, \frac{\prod_{i=1}^{G} (a_i n + b_i k + c_i)!}{\prod_{i=1}^{H} (u_i n + v_i k + w_i)!}\, x^k \]
is said to be well-defined at $(n,k)$ if none of the terms $(a_i n + b_i k + c_i)$ in the product
is a negative integer. The function $F(n,k)$ is defined to have the value 0 if $F$ is well-
defined at $(n,k)$ and there is a term $(u_i n + v_i k + w_i)$ in the product that is a negative
integer or $P(n,k) = 0$.


Facts: 

1. If $F(n,k)$ is a proper hypergeometric term, then there exist positive integers $L$
and $M$ and polynomials $a_{i,j}(n)$ for $i = 0, 1, \ldots, L$ and $j = 0, 1, \ldots, M$, not all zero, such
that
\[ \sum_{i=0}^{L} \sum_{j=0}^{M} a_{i,j}(n)\, F(n-j, k-i) = 0 \]
for all pairs $(n,k)$ with $F(n,k) \ne 0$ and all the values of $F(n,k)$ in this double sum
are well-defined. Moreover, there is such a recurrence with $M$ equal to $M' = \sum_s |b_s| + \sum_t |v_t|$
and $L$ equal to $L' = \deg(P) + 1 + M'\left(-1 + \sum_s |a_s| + \sum_t |u_t|\right)$, where the $a_i$, $b_i$,
$u_i$, $v_i$ and $P$ come from an expression of $F(n,k)$ as a hypergeometric term as specified
in the definition.

2. Sister Celine's algorithm: This algorithm, developed in 1945 by Sister Mary Celine
Fasenmeyer (1906-1996), can be used to find recurrence relations for sums of the form
$f(n) = \sum_k F(n,k)$ where $F$ is a doubly hypergeometric function. The algorithm finds
a recurrence of the form $\sum_{i=0}^{L} \sum_{j=0}^{M} a_{i,j}(n)\, F(n-j, k-i) = 0$ by proceeding as follows:

• start with trial values of $L$ and $M$, such as $L = 1$, $M = 1$;

• assume that a recurrence relation of the type sought exists with these values of $L$
and $M$, with the coefficients $a_{i,j}(n)$ to be determined, if possible;

• divide each term in the sum of the recurrence by $F(n,k)$, then reduce each fraction
$F(n-j, k-i)/F(n,k)$, simplifying the ratios of factorials so only rational
functions of $n$ and $k$ are left;

• combine the terms in the sum using a common denominator, collecting the
numerator into a single polynomial in $k$;

• solve the system of linear equations for the $a_{i,j}(n)$ that results when the
coefficients of each power of $k$ in the numerator polynomial are equated to zero;

• if these steps fail, repeat the procedure with larger values of $L$ and $M$; by Fact 1,
this procedure is guaranteed to work eventually when $F$ is a proper hypergeometric term.

3. Gosper's algorithm: This algorithm, developed by R. W. Gosper, Jr., can be used to
determine, given a hypergeometric term $t_n$, whether there is a hypergeometric term $z_n$
such that $z_{n+1} - z_n = t_n$. When there is such a hypergeometric term $z_n$, the algorithm
also produces such a term.

4. Gosper's algorithm takes a hypergeometric term $t_n$ as input and performs the
following general steps (for details see [PeWiZe96]):

• let $r(n) = t_{n+1}/t_n$; this is a rational function of $n$ since $t$ is hypergeometric;

• find polynomials $a(n)$, $b(n)$, and $c(n)$ such that $r(n) = \frac{a(n)}{b(n)} \cdot \frac{c(n+1)}{c(n)}$ and
$\gcd(a(n), b(n+h)) = 1$ whenever $h$ is a nonnegative integer; this is done using the
following steps:

  ◦ let $r(n) = K \cdot \frac{f(n)}{g(n)}$ where $f(n)$ and $g(n)$ are monic relatively prime polynomials
    and $K$ is a constant, let $R(h)$ be the resultant of $f(n)$ and $g(n+h)$
    (which is the product of the values of $g(n+h)$ at the zeros of $f(n)$), and let
    $S = \{h_1, h_2, \ldots, h_N\}$ be the set of nonnegative integer zeros of $R(h)$ where
    $0 \le h_1 < h_2 < \cdots < h_N$;

  ◦ let $p_0(n) = f(n)$ and $q_0(n) = g(n)$; then for $j = 1, 2, \ldots, N$ carry out the
    following steps:

      $s_j(n) := \gcd(p_{j-1}(n), q_{j-1}(n + h_j))$
      $p_j(n) := p_{j-1}(n) / s_j(n)$
      $q_j(n) := q_{j-1}(n) / s_j(n - h_j)$;

• take $a(n) := K p_N(n)$; $b(n) := q_N(n)$; $c(n) := \prod_{i=1}^{N} \prod_{j=1}^{h_i} s_i(n - j)$;

• find a nonzero polynomial $x(n)$ such that $a(n)x(n+1) - b(n-1)x(n) = c(n)$ if
one exists; such a polynomial can be found using the method of undetermined
coefficients to find a nonzero polynomial of degree $d$ or less, where the degree $d$
depends on the polynomials $a(n)$, $b(n)$, and $c(n)$. If no such polynomial exists,
then the algorithm fails. The degree $d$ is determined by the following rules:

  ◦ when $\deg a(n) \ne \deg b(n)$, or $\deg a(n) = \deg b(n)$ but the leading coefficients
    of $a(n)$ and $b(n)$ differ, then $d = \deg c(n) - \max(\deg a(n), \deg b(n))$;

  ◦ when $\deg a(n) = \deg b(n)$ and the leading coefficients of $a(n)$ and $b(n)$ agree,
    $d = \max(\deg c(n) - \deg a(n) + 1, (B - A)/L)$ where $a(n) = Ln^k + An^{k-1} + \cdots$
    and $b(n-1) = Ln^k + Bn^{k-1} + \cdots$; if this $d$ is negative, then no such
    polynomial $x(n)$ exists;

• let $z_n = t_n \cdot b(n-1)x(n)/c(n)$; it follows that $z_{n+1} - z_n = t_n$.

5. When Gosper’s algorithm fails, this shows that a sum of hypergeometric terms 
cannot be expressed as a hypergeometric term plus a constant. 

6. Programs in both Maple and Mathematica implementing algorithms described in
this section can be found at the following sites:

http://www.cis.upenn.edu/~wilf/AeqB.html

http://www.math.temple.edu/~zeilberg
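
The linear-algebra core of Sister Celine's algorithm (Fact 2) can be sketched in a few lines of Python. This is a simplified illustration, not the symbolic algorithm itself: it fixes a numeric value of $n$, uses the sample summand $F(n,k) = k\binom{n}{k}$, normalizes the coefficient of $F(n+1,k+1)$ to 1, and obtains the remaining coefficients by requiring the trial recurrence to hold at three values of $k$. All function names are assumptions made for this sketch.

```python
from fractions import Fraction
from math import comb

def F(n, k):
    """Sample summand F(n, k) = k * binom(n, k)."""
    return k * comb(n, k)

def solve_linear(rows):
    """Solve a square system given as augmented rows [A | rhs]
    by exact Gauss-Jordan elimination over the rationals."""
    m = len(rows)
    rows = [list(map(Fraction, row)) for row in rows]
    for col in range(m):
        pivot = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def celine_coefficients(n, ks=(1, 2, 3)):
    """Return (a, b, c) with
    a F(n,k) + b F(n+1,k) + c F(n,k+1) + F(n+1,k+1) = 0 at each k in ks."""
    rows = [[F(n, k), F(n + 1, k), F(n, k + 1), -F(n + 1, k + 1)] for k in ks]
    return solve_linear(rows)

def f(n):
    """f(n) = sum over k of F(n, k)."""
    return sum(F(n, k) for k in range(n + 1))

print(celine_coefficients(5))   # a, b, c for n = 5
print([f(n) for n in range(1, 8)])
```

For each fixed $n$ the solved coefficients are values of rational functions of $n$; the full algorithm recovers those functions symbolically by equating coefficients of powers of $k$.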


Examples: 

1. The function $F(n,k) = \frac{1}{5n-3k+2}$ is a proper hypergeometric term since $F(n,k)$ can
be expressed as $F(n,k) = \frac{(5n-3k+1)!}{(5n-3k+2)!}$.

2. The function $F(n,k) = \frac{1}{n^2+k^3+5}$ is not a proper hypergeometric term.

3. Sister Celine's algorithm can be used to find a recurrence relation satisfied by the
function $f(n) = \sum_k F(n,k)$ where $F(n,k) = k\binom{n}{k}$ for $n = 0, 1, 2, \ldots$. The algorithm
proceeds by finding a recurrence relation of the form $a(n)F(n,k) + b(n)F(n+1,k) +
c(n)F(n,k+1) + d(n)F(n+1,k+1) = 0$. Since $F(n,k) = k\binom{n}{k}$, this recurrence relation
simplifies to $a(n) + b(n) \cdot \frac{n+1}{n+1-k} + c(n) \cdot \frac{n-k}{k} + d(n) \cdot \frac{n+1}{k} = 0$. Putting the left side of
this equation over a common denominator and expressing it as a polynomial in $k$, four
equations in the unknowns $a(n)$, $b(n)$, $c(n)$, and $d(n)$ are produced. These equations
have the following solutions: $a(n) = t(-1 - \frac{1}{n})$, $b(n) = 0$, $c(n) = t(-1 - \frac{1}{n})$, $d(n) = t$,
where $t$ is a constant. This produces the recurrence relation $(-1 - \frac{1}{n})F(n,k) + (-1 -
\frac{1}{n})F(n,k+1) + F(n+1,k+1) = 0$, which can be summed over all integers $k$ and
simplified to produce the recurrence relation $f(n+1) = 2 \cdot \frac{n+1}{n} f(n)$, with $f(1) = 1$.
From this it follows that $f(n) = n2^{n-1}$.


4. As shown in [PeWiZe96], Sister Celine's algorithm can be used to find an identity for
$f(n) = \sum_k F(n,k)$ where $F(n,k) = \binom{n}{k}\binom{2k}{k}(-2)^{n-k}$. A recurrence for $F(n,k)$ can be
found using her techniques (which can be carried out using either Maple or Mathematica
software, for example). An identity that can be found this way is:
\[ -8(n-1)F(n-2,k-1) - 2(2n-1)F(n-1,k-1) + 4(n-1)F(n-2,k) + 2(2n-1)F(n-1,k) + nF(n,k) = 0. \]
When this is summed over all integers $k$, the recurrence relation $nf(n) - 4(n-1)f(n-2) = 0$
is obtained. From the definition of $f$ it follows that $f(0) = 1$ and $f(1) = 0$. From the
initial conditions and the recurrence relation for $f(n)$, it follows that $f(n) = 0$ when $n$
is odd and $f(n) = \binom{n}{n/2}$ when $n$ is even. (This is known as the Reed-Dawson identity.)

5. Gosper's algorithm can be used to find a closed form for $S_n = \sum_{k=1}^{n} k \cdot k!$. Let
$t_n = n \cdot n!$. Following Gosper's algorithm gives $r(n) = \frac{t_{n+1}}{t_n} = \frac{(n+1)^2}{n}$, $a(n) = n + 1$,
$b(n) = 1$, and $c(n) = n$. The polynomial $x(n)$ must satisfy $(n+1)x(n+1) - x(n) = n$; the
polynomial $x(n) = 1$ is such a solution. It follows that $z_n = n!$ satisfies $z_{n+1} - z_n = t_n$.
Hence $s_n = z_n - z_1 = n! - 1$ and $S_n = s_{n+1} = (n+1)! - 1$.

6. Gosper's algorithm can be used to show that $S_n = \sum_{k=0}^{n} k!$ cannot be expressed
as a hypergeometric term plus a constant. Let $t_n = n!$. Following Gosper's algorithm
gives $r(n) = \frac{(n+1)!}{n!} = n + 1$, $a(n) = n + 1$, $b(n) = 1$, $c(n) = 1$. The polynomial $x(n)$ must
satisfy $(n+1)x(n+1) - x(n) = 1$ and must have a degree less than zero. It follows that
there is no closed form for $\sum_{k=0}^{n} k!$ of the type specified.
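
The degree rules of Fact 4 and the telescoping claims of Examples 5 and 6 can be checked with a short Python sketch. Polynomials are represented as coefficient lists with the lowest power first. The function names, this representation, and the convention that $(B - A)/L$ is used only when it is an integer are assumptions made for illustration.

```python
from fractions import Fraction
from math import factorial

def degree_bound(a, b, c):
    """Degree bound d for x(n) in a(n) x(n+1) - b(n-1) x(n) = c(n).
    Each polynomial is a list [c_0, c_1, ...] with nonzero last entry."""
    da, db, dc = len(a) - 1, len(b) - 1, len(c) - 1
    if da != db or a[-1] != b[-1]:
        return dc - max(da, db)
    lead = Fraction(a[-1])
    A = Fraction(a[-2]) if da >= 1 else Fraction(0)
    # b(n-1) = L n^k + (b_{k-1} - k L) n^{k-1} + ...
    B = (Fraction(b[-2]) if db >= 1 else Fraction(0)) - db * lead
    bound = dc - da + 1
    extra = (B - A) / lead
    if extra.denominator == 1:      # assumed: only integer values count
        bound = max(bound, int(extra))
    return bound

def partial_sum(n):
    """S_n = sum_{k=1}^{n} k * k!."""
    return sum(k * factorial(k) for k in range(1, n + 1))

# Example 5: a(n) = n + 1, b(n) = 1, c(n) = n gives d = 0, so x(n) is constant.
print(degree_bound([1, 1], [1], [0, 1]))
# Example 6: a(n) = n + 1, b(n) = 1, c(n) = 1 gives d = -1, so no x(n) exists.
print(degree_bound([1, 1], [1], [1]))
# z_n = n! telescopes t_n = n * n!, hence S_n = (n + 1)! - 1.
print(all(partial_sum(n) == factorial(n + 1) - 1 for n in range(1, 10)))
```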


3.7.3 CERTIFYING THE TRUTH OF COMBINATORIAL IDENTITIES 


Definitions: 

A pair of functions $(F, G)$ is called a WZ pair (after Wilf and Zeilberger) if $F(n+1,k) -
F(n,k) = G(n,k+1) - G(n,k)$. If $(F, G)$ is a WZ pair, then $F$ is called the WZ
mate of $G$ and vice versa.

A WZ certificate $R(n,k)$ is a function that can be used to verify the hypergeometric
identity $\sum_k f(n,k) = r(n)$ by creating a WZ pair $(F, G)$ with $F(n,k) = \frac{f(n,k)}{r(n)}$ when
$r(n) \ne 0$ and $F(n,k) = f(n,k)$ when $r(n) = 0$, and $G(n,k) = R(n,k)F(n,k)$. When a
hypergeometric identity is proved using a WZ certificate, this proof is called a WZ
proof.

Facts: 

1. If $(F, G)$ is a WZ pair such that for each integer $n \ge 0$, $\lim_{k\to\pm\infty} G(n,k) = 0$, then
$\sum_k F(n,k)$ is a constant for $n = 0, 1, 2, \ldots$.

2. If $(F, G)$ is a WZ pair such that for each integer $k$, the limit $f_k = \lim_{n\to\infty} F(n,k)$
exists and is finite, for every nonnegative integer $n$ it is the case that $\lim_{k\to\pm\infty} G(n,k) = 0$,
and $\lim_{L\to\infty} \sum_{n\ge0} G(n,-L) = 0$, then $\sum_{n\ge0} G(n,k) = \sum_{j\le k-1} (f_j - F(0,j))$.

3. An identity $\sum_k f(n,k) = r(n)$ can be verified using its WZ certificate $R(n,k)$ as
follows:

• if $r(n) \ne 0$, define $F(n,k)$ by $F(n,k) = \frac{f(n,k)}{r(n)}$, else define $F(n,k) = f(n,k)$;
define $G(n,k)$ by $G(n,k) = R(n,k)F(n,k)$;

• confirm that $(F, G)$ is a WZ pair, i.e., that $F(n+1,k) - F(n,k) = G(n,k+1) -
G(n,k)$, by dividing the factorials out and verifying the polynomial identity
that results;

• verify that the original identity holds for a particular value of $n$.

4. The WZ certificate of an identity $\sum_k f(n,k) = r(n)$ can be found using the following
steps:

• if $r(n) \ne 0$, define $F(n,k)$ to be $F(n,k) = \frac{f(n,k)}{r(n)}$, else define $F(n,k)$ to be
$F(n,k) = f(n,k)$;

• let $f(k) = F(n+1,k) - F(n,k)$; provide $f(k)$ as input to Gosper's algorithm;

• if Gosper's algorithm produces $G(n,k)$ as output, it is the WZ mate of $F$ and the
function $R(n,k) = \frac{G(n,k)}{F(n,k)}$ is the WZ certificate of the identity $\sum_k F(n,k) = C$
where $C$ is a constant.

If Gosper's algorithm fails, this algorithm also fails.


Examples: 

1. To prove the identity $f(n) = \sum_k \binom{n}{k}^2 = \binom{2n}{n}$, express it in the form $\sum_k F(n,k) = 1$
where $F(n,k) = \binom{n}{k}^2 / \binom{2n}{n}$. The identity can be proved by taking the function
$R(n,k) = \frac{-k^2(3n+3-2k)}{2(2n+1)(n-k+1)^2}$ as certificate. (This certificate can be obtained using Gosper's
algorithm.)


2. To prove Gauss's ${}_2F_1$ identity via a WZ proof, express it in the form $\sum_k F(n,k) = 1$
where $F(n,k) = \frac{(n+k)!\,(b+k)!\,(c-n-1)!\,(c-b-1)!}{(c+k)!\,(n-1)!\,(c-n-b-1)!\,(k+1)!\,(b-1)!}$. The identity can then be proved by
taking the function $R(n,k) = \frac{(k+1)(k+c)}{n(n+1-c)}$ as certificate. (This certificate can be
obtained using Gosper's algorithm.)
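
The verification procedure of Fact 3 can be carried out numerically for the identity $\sum_k \binom{n}{k}^2 = \binom{2n}{n}$. The Python sketch below is an illustration under stated assumptions: it uses the certificate $R(n,k) = -k^2(3n+3-2k)/\bigl(2(2n+1)(n-k+1)^2\bigr)$, and it writes $G = RF$ in the simplified form obtained from $\binom{n}{k}\,k/(n-k+1) = \binom{n}{k-1}$ so that no division by zero occurs at $k = n+1$. It checks the WZ equation on a finite grid rather than proving the polynomial identity.

```python
from fractions import Fraction
from math import comb

def F(n, k):
    """F(n, k) = binom(n, k)^2 / binom(2n, n); zero outside 0 <= k <= n."""
    if k < 0 or k > n:
        return Fraction(0)
    return Fraction(comb(n, k) ** 2, comb(2 * n, n))

def G(n, k):
    """G(n, k) = R(n, k) F(n, k) in simplified form:
    -(3n + 3 - 2k) binom(n, k-1)^2 / (2 (2n+1) binom(2n, n))."""
    if k < 1 or k > n + 1:
        return Fraction(0)
    return Fraction(-(3 * n + 3 - 2 * k) * comb(n, k - 1) ** 2,
                    2 * (2 * n + 1) * comb(2 * n, n))

def is_wz_pair(n_max):
    """Check F(n+1,k) - F(n,k) = G(n,k+1) - G(n,k) on a finite grid."""
    return all(
        F(n + 1, k) - F(n, k) == G(n, k + 1) - G(n, k)
        for n in range(n_max)
        for k in range(-2, n + 4)
    )

print(is_wz_pair(10))
print(sum(F(0, k) for k in range(-1, 2)))   # the identity at n = 0
```

Since $G(n,k)$ vanishes for $k$ outside $1 \le k \le n+1$, Fact 1 applies, and the check at $n = 0$ fixes the constant to 1.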


INTRODUCTION

This chapter covers the basics of number theory. Number theory, a subject with a 
long and rich history, has become increasingly important because of its applications to 
computer science and cryptography. The core topics of number theory, such as divisibil- 
ity, radix representations, greatest common divisors, primes, factorization, congruences, 
diophantine equations, and continued fractions are covered here. Algorithms for finding 
greatest common divisors, large primes, and factorizations of integers are described. 

There are many famous problems in number theory, including some that have 
been solved only recently such as Fermat’s Last Theorem, and others that have eluded 
resolution, such as the Goldbach conjecture. The status of such problems is described 
in this chapter. New discoveries in number theory, such as new large primes, are being 
made at an increasingly fast pace. This chapter describes the current state of knowledge 
and provides pointers to Internet sources where the latest facts can be found. 


GLOSSARY 

algebraic number: a root of a polynomial with integer coefficients. 

arithmetic function: a function defined for all positive integers. 

Bachet’s equation: a diophantine equation of the form y 2 = x 3 + k , where k is a 
given integer. 

base: the positive integer $b$, with $b > 1$, in the expansion $n = a_k b^k + a_{k-1} b^{k-1} + \cdots +
a_1 b + a_0$ where $0 \le a_i \le b - 1$ for $i = 0, 1, 2, \ldots, k$.

binary coded decimal expansion: the expansion produced by replacing each deci- 
mal digit of an integer by the four-bit binary expansion of that digit. 

binary representation of an integer: the base two expansion of this integer. 

Carmichael number: a positive integer that is a pseudoprime to all bases. 


Catalan's equation: the diophantine equation $x^m - y^n = 1$ where solutions in integers
greater than 1 are sought for $x$, $y$, $m$, and $n$.

Chinese remainder theorem: the theorem that states that given a set of congruences
$x \equiv a_i \pmod{m_i}$ for $i = 1, 2, \ldots, n$ where the integers $m_i$, $i = 1, 2, \ldots, n$, are pairwise
relatively prime, there is a unique simultaneous solution of these congruences modulo
$M = m_1 m_2 \cdots m_n$.

complete system of residues modulo m: a set of integers such that every integer 
is congruent modulo m to exactly one integer in the set. 

composite: a positive integer that has a factor other than 1 and itself. 

congruence class of a modulo m: the set of integers congruent to a modulo m. 

congruent integers modulo m: two integers with a difference divisible by $m$.

convergent: a rational fraction obtained by truncating a continued fraction. 

continued fraction: a finite or infinite expression of the form $a_0 + 1/(a_1 + 1/(a_2 + \cdots))$;
usually abbreviated $[a_0, a_1, a_2, \ldots]$.

coprime (integers): integers that have no positive common divisor other than 1; see 
relatively prime. 

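Dedekind sum: the sum $s(h,k) = \sum_{i=1}^{k-1} \left(\left(\frac{i}{k}\right)\right)\left(\left(\frac{hi}{k}\right)\right)$ where $((x)) = x - \lfloor x \rfloor - \frac12$ if $x$
is not an integer and $((x)) = 0$ if $x$ is an integer.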

diophantine approximation: the approximation of a number by numbers belonging 
to a specified set, often the set of rational numbers. 

Diophantine equation: an equation together with the restriction that the only solu- 
tions of the equation of interest are those belonging to a specified set, often the set 
of integers or the set of rational numbers. 

Dirichlet’s theorem on primes in arithmetic progressions: the theorem that states 
that there are infinitely many primes in each arithmetic progression of the form 
an + b where a and b are relatively prime positive integers. 

discrete logarithm of a to the base r modulo m: the integer $x$ such that $r^x \equiv
a \pmod m$, where $r$ is a primitive root of $m$ and $\gcd(a, m) = 1$.

divides: The integer $a$ divides the integer $b$, written $a \mid b$, if there is an integer $c$ such
that $b = ac$.

divisor: (1) an integer $d$ such that $d$ divides $a$ for a given integer $a$, or (2) the positive
integer $d$ that is divided into the integer $a$ to yield $a = dq + r$ where $0 \le r < d$.

elliptic curve: for prime $p > 3$, the set of solutions $(x, y)$ to the congruence $y^2 \equiv
x^3 + ax + b \pmod p$, where $4a^3 + 27b^2 \not\equiv 0 \pmod p$, together with a special point $O$,
called the point at infinity.

elliptic curve method ( ECM ): a factoring technique invented by Lenstra that is 
based on the theory of elliptic curves. 

Euler phi-function: the function $\phi(n)$ whose value at the positive integer $n$ is the
number of positive integers not exceeding $n$ relatively prime to $n$.

Euler's theorem: the theorem that states that if $n$ is a positive integer and $a$ is an
integer with $\gcd(a, n) = 1$, then $a^{\phi(n)} \equiv 1 \pmod n$ where $\phi(n)$ is the value of the
Euler phi-function at $n$.

exactly divides: If $p$ is a prime and $n$ is a positive integer, $p^r$ exactly divides $n$,
written $p^r \,\|\, n$, if $p^r$ divides $n$, but $p^{r+1}$ does not divide $n$.

factor (of an integer n): an integer that divides $n$.

factorization algorithm: an algorithm whose input is a positive integer and whose 
output is the prime factorization of this integer. 

Farey series (of order n): the set of fractions $\frac{h}{k}$ where $h$ and $k$ are relatively prime
nonnegative integers with $0 \le h \le k \le n$ and $k \ne 0$.

Fermat equation: the diophantine equation $x^n + y^n = z^n$ where $n$ is an integer greater
than 2 and $x$, $y$, and $z$ are nonzero integers.

Fermat number: a number of the form $2^{2^n} + 1$ where $n$ is a nonnegative integer.

Fermat prime: a prime Fermat number.

Fermat's last theorem: the theorem that states that if $n$ is a positive integer greater
than two, then the equation $x^n + y^n = z^n$ has no solutions in integers with $xyz \ne 0$.

Fermat's little theorem: the theorem that states that if $p$ is prime and $a$ is an
integer, then $a^p \equiv a \pmod p$.

Fibonacci numbers: the sequence of numbers defined by $F_0 = 0$, $F_1 = 1$, and $F_n =
F_{n-1} + F_{n-2}$ for $n = 2, 3, 4, \ldots$.

fundamental theorem of arithmetic: the theorem that states that every positive 
integer has a unique representation as the product of primes written in nondecreasing 
order. 

Gaussian integers: the set of numbers of the form $a + bi$ where $a$ and $b$ are integers
and $i$ is $\sqrt{-1}$.

greatest common divisor (gcd) of a set of integers: the largest integer that divides
all integers in the set. The greatest common divisor of the integers $a_1, a_2, \ldots, a_n$ is
denoted by $\gcd(a_1, a_2, \ldots, a_n)$.

hexadecimal representation (of an integer) : the base sixteen representation of this 
integer. 

index of a to the base r modulo m: the smallest nonnegative integer $x$, denoted
$\mathrm{ind}_r a$, such that $r^x \equiv a \pmod m$, where $r$ is a primitive root of $m$ and $\gcd(a, m) = 1$.

inverse of an integer a modulo m: an integer $\bar{a}$ such that $a\bar{a} \equiv 1 \pmod m$. Here
$\gcd(a, m) = 1$.

irrational number: a real number that is not the ratio of two integers. 

Jacobi symbol: a generalization of the Legendre symbol. (See §4.7.3.) 

Kronecker symbol: a generalization of the Legendre and Jacobi symbols. (See 
§4.7.3.) 

least common multiple (of a set of integers): the smallest positive integer that is 
divisible by all integers in the set. 

least positive residue of a modulo m: the remainder when a is divided by m. It 
is the smallest positive integer congruent to a modulo m, written a mod m. 

Legendre symbol: the symbol $\left(\frac{a}{p}\right)$ that has the value 1 if $a$ is a square modulo $p$
and $-1$ if $a$ is not a square modulo $p$. Here $p$ is a prime and $a$ is an integer not
divisible by $p$.

linear congruential method: a method for generating a sequence of pseudo-random
numbers based on a congruence of the form $x_{n+1} \equiv ax_n + c \pmod m$.

Mersenne prime: a prime of the form $2^p - 1$ where $p$ is a prime.

Möbius function: the arithmetic function $\mu(n)$ where $\mu(n) = 1$ if $n = 1$, $\mu(n) = 0$
if $n$ has a square factor larger than 1, and $\mu(n) = (-1)^s$ if $n$ is square-free and is the
product of $s$ different primes.

modulus: the integer $m$ in a congruence $a \equiv b \pmod m$.

multiple of an integer a: an integer $b$ such that $a$ divides $b$.

multiplicative function: a function $f$ such that $f(mn) = f(m)f(n)$ whenever $m$
and $n$ are relatively prime positive integers.

mutually relatively prime set of integers: integers with no common factor greater 
than 1. 

number field sieve: a factoring algorithm, currently the best one known for large
numbers with no small prime factors.

octal representation of an integer: the base eight representation of this integer. 

one's complement expansion: an $n$-bit representation of an integer $x$ with $|x| <
2^{n-1}$, where $n$ is a specified positive integer, where the leftmost bit is 0 if $x \ge 0$
and 1 if $x < 0$, and the remaining $n - 1$ bits are those of the binary expansion of $x$
if $x \ge 0$, and the complements of the bits in the expansion of $|x|$ if $x < 0$.

order of an integer a modulo m: the least positive integer $t$, denoted by $\mathrm{ord}_m a$,
such that $a^t \equiv 1 \pmod m$. Here $\gcd(a, m) = 1$.

pairwise relatively prime: integers with the property that every two of them are 
relatively prime. 

palindrome: a finite sequence that reads the same forward and backward. 

partial quotient: a term $a_i$ of a continued fraction.

Pell's equation: the diophantine equation $x^2 - dy^2 = 1$ where $d$ is a positive integer
that is not a perfect square.

perfect number: a positive integer whose sum of positive divisors, other than the 
integer itself, equals this integer. 

periodic base b expansion: a base b expansion where the terms beyond a certain 
point are repetitions of the same block of integers. 

powerful integer: an integer $n$ with the property that $p^2$ divides $n$ whenever $p$ is a
prime that divides $n$.

primality test: an algorithm that determines whether a positive integer is prime. 

prime: a positive integer greater than 1 that has exactly two factors, 1 and itself. 

prime factorization: the factorization of an integer into primes. 

prime number theorem: the theorem that states that the number of primes not
exceeding a positive real number $x$ is asymptotic to $\frac{x}{\log x}$ (where $\log x$ denotes the
natural logarithm of $x$).

prime-power factorization: the factorization of an integer into powers of distinct
primes.

primitive root of an integer n: an integer $r$ such that the least positive residues
of the powers of $r$ run through all positive integers relatively prime to $n$ and less
than $n$.

probabilistic primality test: an algorithm that determines whether an integer is 
prime with a small probability of a false positive result. 
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pseudoprime to the base b: a composite positive integer n such that 6” = 6 (mod n). 

pseudo-random number generator: a deterministic method to generate numbers 
that share many properties with numbers really chosen randomly. 

Pythagorean triple: positive integers $x$, $y$, and $z$ such that $x^2 + y^2 = z^2$.

quadratic field: the set of numbers $Q(\sqrt{d}) = \{a + b\sqrt{d} \mid a, b \in Q\}$ where $d$ is a
square-free integer.

quadratic irrational: an irrational number that is the root of a quadratic polynomial 
with integer coefficients. 

quadratic nonresidue (of m): an integer that is not a perfect square modulo $m$.

quadratic reciprocity: the law that states that given two odd primes $p$ and $q$, if at
least one of them is of the form $4n + 1$, then $p$ is a quadratic residue of $q$ if and only
if $q$ is a quadratic residue of $p$; and if both primes are of the form $4n + 3$, then $p$ is a
quadratic residue of $q$ if and only if $q$ is a quadratic nonresidue of $p$.

quadratic residue (of m): an integer that is a perfect square modulo $m$.

quadratic sieve: a factoring algorithm invented by Pomerance in 1981.

rational cuboid problem: the unsolved problem of constructing a right parallelepi- 
ped with height, width, length, face diagonals, and body diagonal all of integer 
length. 

rational number: a real number that is the ratio of two integers. The set of rational 
numbers is denoted by Q. 

reduced system of residues modulo m: pairwise incongruent integers modulo $m$
such that each integer in the set is relatively prime to $m$ and every integer relatively
prime to $m$ is congruent to an integer in the set.

relatively prime (integers): two integers with no common divisor greater than 1; see 
coprime. 

remainder (of the integer a when divided by the positive integer d): the integer $r$ in
the equation $a = dq + r$ with $0 \le r < d$, written $r = a \bmod d$.

root (of a function f modulo m): an integer $r$ such that $f(r) \equiv 0 \pmod{m}$.

sieve of Eratosthenes: a procedure for finding all primes less than a specified integer. 

smooth number : an integer all of whose prime divisors are small. 

square root (of a modulo m): an integer $r$ whose square is congruent to $a$ modulo $m$.

square-free integer: an integer not divisible by any perfect squares other than 1. 

ten most wanted numbers: the large integers on a list, maintained by a group 
of researchers, whose currently unknown factorizations are actively sought. These 
integers are somewhat beyond the realm of numbers that can be factored using 
known techniques. 

terminating base-b expansion: a base-b expansion with only a finite number of
nonzero coefficients.

totient function: the Euler phi-function. 

transcendental number: a complex number that cannot be expressed as the root of 
an algebraic equation with integer coefficients. 

trial division: a factorization technique that proceeds by dividing an integer by
successive primes.

twin primes: a pair of primes that differ by two.

two’s complement expansion: an $n$-bit representation of an integer $x$, with
$-2^{n-1} \le x \le 2^{n-1} - 1$, for a specified positive integer $n$, where the leftmost bit is 0 if $x \ge 0$
and 1 if $x < 0$, and the remaining $n - 1$ bits are those from the binary expansion
of $x$ if $x \ge 0$ and are those of the $(n-1)$-bit binary expansion of $2^{n-1} - |x|$ if $x < 0$.

ultimately periodic: a sequence (typically a base-b expansion or continued fraction)
that eventually repeats, that is, there exist $k$ and $N$ such that $a_{n+k} = a_n$ for
all $n \ge N$.

unit of a quadratic field: a number $\epsilon$ such that $\epsilon \mid 1$ in the quadratic field.

Waring’s problem: the problem of determining the smallest number $g(k)$ such that
every integer is the sum of $g(k)$ $k$th powers of integers.


4.1 BASIC CONCEPTS 

The basic concepts of number theory include the classification of numbers into different 
sets of special importance, the notion of divisibility, and the representation of integers. 
For more information about these basic concepts, see introductory number theory texts, 
such as [Ro99]. 


4.1.1 NUMBERS 

Definitions: 

The integers are the elements of the set Z = {. . . , —3, —2, —1, 0, 1, 2, 3, . . .}. 

The natural numbers are the integers in the set $N = \{0, 1, 2, 3, \ldots\}$.

The rational numbers are real numbers that can be written as $a/b$ where $a$ and $b$ are
integers with $b \ne 0$. Numbers that are not rational are called irrational. The set of
rational numbers is denoted by $Q$.

The algebraic numbers are real numbers that are solutions of equations of the form
$a_n x^n + \cdots + a_1 x + a_0 = 0$ where $a_i$ is an integer, for $i = 0, 1, \ldots, n$. Real numbers that
are not algebraic are called transcendental.

Facts: 

1. Table 1 summarizes information and notation about some important types of num- 
bers. 

2. A real number is rational if and only if its decimal expansion terminates or is peri- 
odic. (See §4.1.3). 

3. The number $N^{1/m}$ is irrational where $N$ and $m$ are positive integers, unless $N$ is
the $m$th power of an integer $n$.

4. The number $\log_b a$ is irrational, where $a$ and $b$ are positive integers greater than 1,
if there is a prime that divides exactly one of $a$ and $b$.

Table 1 Types of numbers.

name | definition | examples
-----|-----|-----
natural numbers $N$ | $\{0, 1, 2, \ldots\}$ | 0, 43
integers $Z$ | $\{\ldots, -2, -1, 0, 1, 2, \ldots\}$ | 0, 43, $-314$
Gaussian integers $Z[i]$ | $\{a + bi \mid a, b \in Z\}$ | 3, $4 + 3i$, $7i$
rational numbers $Q$ | $\{a/b \mid a, b \in Z;\ b \ne 0\}$ | 0, ...
quadratic irrationals | irrational root of quadratic equation $a_2 x^2 + a_1 x + a_0 = 0$; all $a_i \in Q$ | $\sqrt{2}$, ...
irrational numbers | $R - Q$ | $\sqrt{2}$, $\pi$, $e$
algebraic numbers $\overline{Q}$ | root of algebraic equation $a_n x^n + \cdots + a_0 = 0$, $n \ge 1$, $a_0, \ldots, a_n \in Z$ | $i$, $\sqrt{2}$, ...
algebraic integers $A$ | root of monic algebraic equation $x^n + a_{n-1} x^{n-1} + \cdots + a_0 = 0$, $n \ge 1$, $a_0, a_1, \ldots, a_{n-1} \in Z$ | $i$, $\sqrt{2}$, ...
transcendental numbers | $C - \overline{Q}$ | $\pi$, $e$, $i \ln 2$
real numbers $R$ | completion of $Q$ | 0, ..., $\sqrt{2}$, $\pi$
complex numbers $C$ | $R$ or $R[i]$ | $3 + 2i$, $e + i\pi$


5. If $x$ is a root of an equation $x^m + a_{m-1} x^{m-1} + \cdots + a_0 = 0$ where the coefficients $a_i$
($i = 0, 1, \ldots, m - 1$) are integers, then $x$ is either an integer or irrational.

6. The set of algebraic numbers is countable (§1.2.3). Hence, almost all real numbers 
are transcendental. (However, showing a particular number of interest is transcendental 
is usually difficult.) 

7. Both $e$ and $\pi$ are transcendental. The transcendence of $e$ was proven by Hermite in
1873, and $\pi$ was proven transcendental by Lindemann in 1882. Proofs of the transcendence
of $e$ and $\pi$ can be found in [HaWr89].

8. Gelfond-Schneider theorem: If $\alpha$ and $\beta$ are algebraic numbers with $\alpha$ not equal to 0
or 1 and $\beta$ irrational, then $\alpha^\beta$ is transcendental. (For a proof see [Ba90].)

9. Baker’s linear forms in logarithms: If $\alpha_1, \ldots, \alpha_n$ are nonzero algebraic numbers and
$\log \alpha_1, \ldots, \log \alpha_n$ are linearly independent over $Q$, then $1, \log \alpha_1, \ldots, \log \alpha_n$ are linearly
independent over $\overline{Q}$, where $\overline{Q}$ is the algebraic closure of $Q$. (Consult [Ba90] for a proof and
applications of this theorem.)

Examples: 

1. The numbers yy, — —1, |||, and 0 are rational. 

2. The number $\log_2 10$ is irrational.

3. The numbers \/2, 1 + \/2 , and 1+ are irrational. 

4. The number $x = 0.10100100010000\ldots$, with a decimal expansion consisting of
blocks where the $n$th block is a 1 followed by $n$ 0s, is irrational, since this decimal
expansion does not terminate and is not periodic.

5. The decimal expansion of $22/7$ is periodic, since $22/7 = 3.\overline{142857}$. However, the decimal
expansion of $\pi$ neither terminates nor is periodic, with $\pi = 3.141592653589793\ldots$.

6. It is not known whether Euler’s constant $\gamma = \lim_{n \to \infty} \left( \sum_{k=1}^{n} \frac{1}{k} - \log n \right)$ (where $\log x$
denotes the natural logarithm of $x$) is rational or irrational.

7 . The numbers 2, | , y/l7, yfb, and 1 + y/ 2 are algebraic. 

8. By the Gelfond-Schneider theorem (Fact 8), $\sqrt{2}^{\sqrt{2}}$ is transcendental.

9. By Baker’s linear forms in logarithms theorem (Fact 9), since $\log_2 10$ is irrational,
it is transcendental.
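
The irrationality of $\log_2 10$ used in Examples 2 and 9 follows from Fact 4, since the prime 5 divides 10 but not 2. A direct argument can be sketched as follows, where $p$ and $q$ are hypothetical integers introduced only for the contradiction:

```latex
\[
\log_2 10 = \frac{p}{q} \quad (p, q \in Z,\ q \ge 1)
\;\Longrightarrow\; 2^{p/q} = 10
\;\Longrightarrow\; 2^{p} = 10^{q} = 2^{q} \cdot 5^{q}.
\]
```

The right-hand side is divisible by 5 because $q \ge 1$, while no power of 2 is divisible by 5, so no such $p$ and $q$ exist.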


4.1.2 DIVISIBILITY 

The notion of the divisibility of one integer by another is the most basic concept in 
number theory. Introductory number theory texts, such as [Ro99], [HaWr89], and 
[NiZuMo91], are good references for this material. 

Definitions: 

If $a$ and $d$ are integers with $d > 0$, then in the equation $a = dq + r$ where $0 \le r < d$, $a$
is the dividend, $d$ is the divisor, $q$ is the quotient, and $r$ is the remainder.

Let $m$ and $n$ be integers with $m \ge 1$ and $n = dm + r$ with $0 \le r < m$. Then $n \bmod m$,
the value of the mod $m$ function at $n$, is $r$, the remainder when $n$ is divided by $m$.

If $a$ and $b$ are integers and $a \ne 0$, then $a$ divides $b$, written $a \mid b$, if there is an integer $c$
such that $b = ac$. If $a$ divides $b$, then $a$ is a factor or divisor of $b$, and $b$ is a multiple
of $a$. If $a$ is a positive divisor of $b$ that does not equal $b$, then $a$ is a proper divisor
of $b$. The notation $a \nmid b$ means that $a$ does not divide $b$.

A prime is a positive integer divisible by exactly two distinct positive integers, 1 and 
itself. A positive integer, other than 1, that is not prime is called composite. 

An integer is square-free if it is not divisible by any perfect square other than 1.

An integer $n$ is powerful if whenever a prime $p$ divides $n$, $p^2$ divides $n$.

If $p$ is prime and $n$ is a positive integer, then $p^r$ exactly divides $n$, written $p^r \| n$, if $p^r$
divides $n$, but $p^{r+1}$ does not divide $n$.

Facts: 

1. If $a$ is a nonzero integer, then $a \mid 0$.

2. If $a$ is an integer, then $1 \mid a$.

3. If $a$ and $b$ are positive integers and $a \mid b$, then the following statements are true:

• $a \le b$;

• $b/a$ divides $b$;

• $a^k$ divides $b^k$ for every positive integer $k$;

• $a$ divides $bc$ for every integer $c$.

4. If $a$, $b$, and $c$ are integers such that $a \mid b$ and $b \mid c$, then $a \mid c$.

5. If $a$, $b$, and $c$ are integers such that $a \mid b$ and $a \mid c$, then $a \mid bm + cn$ for all integers $m$
and $n$.

6. If $a$ and $b$ are integers such that $a \mid b$ and $b \mid a$, then $a = \pm b$.

7. If $a$ and $b$ are integers and $m$ is a nonzero integer, then $a \mid b$ if and only if $ma \mid mb$.

8. Division algorithm: If $a$ and $d$ are integers with $d$ positive, then there are unique
integers $q$ and $r$ such that $a = dq + r$ with $0 \le r < d$. (Note: The division algorithm
is not an algorithm, in spite of its name.)

9. The quotient $q$ and remainder $r$ when the integer $a$ is divided by the positive integer $d$
are given by $q = \lfloor a/d \rfloor$ and $r = a - d \lfloor a/d \rfloor$, respectively.

10. If $a$ and $d$ are positive integers, then there are unique integers $q$, $r$, and $e$ such that
$a = dq + er$ where $e = \pm 1$ and $-\frac{d}{2} < r \le \frac{d}{2}$.

11. There are several divisibility tests that are easily performed using the decimal
expansion of an integer $n$. These include:

• An integer is divisible by 2 if and only if its last digit is even. It is divisible by 4 if
and only if the integer made up of its last two digits is divisible by 4. More
generally, it is divisible by $2^j$ if and only if the integer made up of the last $j$
decimal digits of $n$ is divisible by $2^j$.

• An integer is divisible by 5 if and only if its last digit is divisible by 5 (which
means it is either 0 or 5). It is divisible by 25 if and only if the integer made
up of the last two digits is divisible by 25. More generally, it is divisible by $5^j$
if and only if the integer made up of the last $j$ digits of $n$ is divisible by $5^j$.

• An integer is divisible by 3, or by 9, if and only if the sum of the decimal digits
of $n$ is divisible by 3, or by 9, respectively.

• An integer is divisible by 11 if and only if the integer formed by alternately adding
and subtracting the decimal digits of the integer is divisible by 11.

• An integer is divisible by 7, 11, or 13 if and only if the integer formed by successively
adding and subtracting the three-digit integers formed from successive
blocks of three decimal digits of the original number, where digits are grouped
starting with the rightmost digit, is divisible by 7, 11, or 13, respectively.

12. If $d \mid b - 1$, then $n = (a_k \ldots a_1 a_0)_b$ (this notation is defined in §4.1.3) is divisible
by $d$ if and only if the sum of the base $b$ digits of $n$, $a_k + \cdots + a_1 + a_0$, is divisible by $d$.

13. If $d \mid b + 1$, then $n = (a_k \ldots a_1 a_0)_b$ is divisible by $d$ if and only if the alternating sum
of the base $b$ digits of $n$, $(-1)^k a_k + \cdots - a_1 + a_0$, is divisible by $d$.

14. If $p^r \| a$ and $p^s \| b$ where $p$ is a prime and $a$ and $b$ are positive integers, then $p^{r+s} \| ab$.

15. If $p^r \| a$ and $p^s \| b$ where $p$ is a prime and $a$ and $b$ are positive integers, then
$p^{\min(r,s)} \mid a + b$; moreover, $p^{\min(r,s)} \| a + b$ when $r \ne s$.

16. There are infinitely many primes. (See §4.4.1.)

17. There are efficient algorithms that can produce large integers that have an extremely
high probability of being prime. (See §4.4.4.)

18. Fundamental theorem of arithmetic: Every positive integer can be written as the
product of primes in exactly one way, where the primes occur in nondecreasing order in
the factorization.

19. Many different algorithms have been devised to find the factorization of a positive
integer into primes. Using some recently invented algorithms and the powerful computer
systems available today, it is feasible to factor integers with over 100 digits. (See §4.5.1.)

20. The relative ease of producing large primes compared with the apparent difficulty
of factoring large integers is the basis for an important cryptosystem called RSA. (See
Chapter 14.)
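
The quotient and remainder formulas of Facts 8 and 9, and the "exactly divides" relation from the definitions, can be illustrated with a minimal Python sketch. The function names `quotient_remainder` and `exact_power` are hypothetical, chosen for this illustration; the sketch relies on the fact that Python's `//` operator is floor division.

```python
def quotient_remainder(a: int, d: int) -> tuple[int, int]:
    """Return (q, r) with a = d*q + r and 0 <= r < d, for d > 0."""
    if d <= 0:
        raise ValueError("the divisor d must be positive")
    q = a // d       # floor(a / d), also correct for negative a
    r = a - d * q    # remainder as in Fact 9
    return q, r


def exact_power(p: int, n: int) -> int:
    """Return the exponent r such that p^r exactly divides the positive integer n."""
    if p < 2 or n < 1:
        raise ValueError("expected p >= 2 and n >= 1")
    r = 0
    while n % p == 0:
        n //= p
        r += 1
    return r


print(quotient_remainder(214, 6))    # (35, 4)
print(quotient_remainder(-114, 7))   # (-17, 5)
print(exact_power(3, 216))           # 3
```

Because the remainder is computed from the floor of the quotient, it stays in the range $0 \le r < d$ even when the dividend is negative.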

Examples:

1. The integers 0, 3, $-12$, 21, 342, and $-1113$ are divisible by 3; the integers $-1$, 7, 29,
and $-1111$ are not divisible by 3.

2. The quotient and remainder when 214 is divided by 6 are 35 and 4, respectively,
since $214 = 35 \cdot 6 + 4$.

3. The quotient and remainder when $-114$ is divided by 7 are $-17$ and 5, respectively,
since $-114 = -17 \cdot 7 + 5$.

4. With $a = 214$ and $d = 6$, the expansion of Fact 10 is $214 = 36 \cdot 6 - 2$ (so that $e = -1$
and $r = 2$).

5. $11 \bmod 4 = 3$, $100 \bmod 7 = 2$, and $-22 \bmod 5 = 3$.

6. The following are primes: 2, 3, 17, 101, 641. The following are composites: 4, 9, 91,
111, 1001.

7. The integers 15, 105, and 210 are square-free; the integers 12, 99, and 270 are not.

8. The integer 72 is powerful since 2 and 3 are the only primes that divide 72 and
$2^2 = 4$ and $3^2 = 9$ both divide 72, but 180 is not powerful since 5 divides 180, but $5^2$
does not.

9. The integer 32,688,048 is divisible by 2, 4, 8, and 16 since $2 \mid 8$, $4 \mid 48$, $8 \mid 048$, and
$16 \mid 8{,}048$, but it is not divisible by 32 since 32 does not divide 88,048.

10. The integer 723,160,823 is divisible by 11 since the alternating sum of its digits,
$3 - 2 + 8 - 0 + 6 - 1 + 3 - 2 + 7 = 22$, is divisible by 11.

11. Since $3^3 \mid 216$, but $3^4 \nmid 216$, it follows that $3^3 \| 216$.
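
The alternating-sum tests of Fact 11 can be checked mechanically. The following is a minimal Python sketch under the assumption that the integers are given in decimal; the helper names `alternating_digit_sum` and `alternating_block_sum` are hypothetical and not part of the original text.

```python
def alternating_digit_sum(n: int) -> int:
    """Alternating sum of the decimal digits of n, starting with + at the rightmost digit."""
    digits = [int(c) for c in str(abs(n))][::-1]   # least significant digit first
    return sum(d if i % 2 == 0 else -d for i, d in enumerate(digits))


def alternating_block_sum(n: int) -> int:
    """Alternating sum of the three-digit blocks of n, grouped from the right."""
    n = abs(n)
    total, sign = 0, 1
    while n > 0:
        n, block = divmod(n, 1000)   # peel off the rightmost three digits
        total += sign * block
        sign = -sign
    return total


def divisible_by_11(n: int) -> bool:
    return alternating_digit_sum(n) % 11 == 0


def divisible_by_7_11_or_13(n: int, m: int) -> bool:
    """Test divisibility of n by m, where m is 7, 11, or 13, using three-digit blocks."""
    if m not in (7, 11, 13):
        raise ValueError("the block test applies to 7, 11, and 13 only")
    return alternating_block_sum(n) % m == 0


print(alternating_digit_sum(723160823))   # 22
print(divisible_by_11(723160823))         # True
```

The block test works for 7, 11, and 13 simultaneously because $1001 = 7 \cdot 11 \cdot 13$, so $1000 \equiv -1$ modulo each of these primes.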


4.1.3 RADIX REPRESENTATIONS 

The representation of numbers in different bases has been important in the development 
of mathematics from its earliest days and is extremely important in computer arithmetic. 
For further details on this topic, see [Kn81], [Ko93], and [Sc85]. 

Definitions: 

The base b expansion of a positive integer $n$, where $b$ is an integer greater than 1,
is the unique expansion of $n$ as $n = a_k b^k + a_{k-1} b^{k-1} + \cdots + a_1 b + a_0$ where $k$ is a
nonnegative integer, $a_j$ is a nonnegative integer less than $b$ for $j = 0, 1, \ldots, k$, and the
initial coefficient $a_k \ne 0$. This expansion is written as $(a_k a_{k-1} \ldots a_1 a_0)_b$.

The integer b in the base b expansion of an integer is called the base or radix of the 
expansion. 

The coefficients aj in the base b expansion of an integer are called the base b digits of 
the expansion. 

Base 10 expansions are called decimal expansions. The digits are called decimal 
digits. 

Base 2 expansions are called binary expansions. The digits are called binary digits 
or bits. 

Base 8 expansions are called octal expansions. 

Base 16 expansions are called hexadecimal expansions. The 16 hexadecimal digits are 
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F (where A, B, C, D, E, F correspond to the decimal 
numbers 10, 11, 12, 13, 14, 15, respectively). 

Algorithm 1: Constructing base b expansions.

procedure base b expansion(n: positive integer)
q := n
k := 0
while $q \ne 0$
begin
    $a_k$ := q mod b
    q := $\lfloor q/b \rfloor$
    k := k + 1
end {the base b expansion of n is $(a_{k-1} \ldots a_1 a_0)_b$}
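
Algorithm 1 translates almost line for line into Python. The sketch below is one possible rendering, not a definitive implementation; the names `base_b_digits` and `to_base_string`, and the restriction of the string form to bases up to 16, are assumptions made for this illustration.

```python
def base_b_digits(n: int, b: int) -> list[int]:
    """Return the base b digits of the positive integer n, most significant first."""
    if n <= 0 or b <= 1:
        raise ValueError("expected n >= 1 and b >= 2")
    digits = []
    q = n
    while q != 0:
        q, a = divmod(q, b)   # a is q mod b; the new q is floor(q / b)
        digits.append(a)      # digits are produced least significant first
    return digits[::-1]


def to_base_string(n: int, b: int) -> str:
    """Write n in base b (2 <= b <= 16) using the digits 0-9 and A-F."""
    symbols = "0123456789ABCDEF"
    if not 2 <= b <= len(symbols):
        raise ValueError("this sketch supports bases 2 through 16 only")
    return "".join(symbols[d] for d in base_b_digits(n, b))


print(to_base_string(2001, 2))    # 11111010001
print(to_base_string(2001, 8))    # 3721
print(to_base_string(2001, 16))   # 7D1
```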


The binary coded decimal expansion of an integer is the bit string formed by replacing 
each digit in the decimal expansion of the integer by the four bit binary expansion of 
that digit. 

The one’s complement expansion of an integer $x$ with $|x| < 2^{n-1}$, for a specified
positive integer $n$, uses $n$ bits, where the leftmost bit is 0 if $x \ge 0$ and 1 if $x < 0$,
and the remaining $n - 1$ bits are those from the binary expansion of $x$ if $x \ge 0$ and
are the complements of the bits in the binary expansion of $|x|$ if $x < 0$. (Note: the
one’s complement representation 11...1, consisting of $n$ 1s, is usually considered to be the
negative representation of the number 0.)

The two’s complement expansion of an integer $x$ with $-2^{n-1} \le x \le 2^{n-1} - 1$, for a
specified positive integer $n$, uses $n$ bits, where the leftmost bit is 0 if $x \ge 0$ and 1 if
$x < 0$, and the remaining $n - 1$ bits are those from the binary expansion of $x$ if $x \ge 0$
and are those of the $(n-1)$-bit binary expansion of $2^{n-1} - |x|$ if $x < 0$.

The base b expansion (where $b$ is an integer greater than 1) of a real number $x$ with
$0 \le x < 1$ is the unique expansion of $x$ as $x = \sum_{j=1}^{\infty} \frac{c_j}{b^j}$ where $c_j$ is a nonnegative integer
less than $b$ for $j = 1, 2, \ldots$ and for every integer $N$ there is a coefficient $c_n \ne b - 1$ for
some $n > N$. This expansion is written as $(.c_1 c_2 c_3 \ldots)_b$.

A base $b$ expansion $(.c_1 c_2 c_3 \ldots)_b$ terminates if there is a positive integer $n$ such that
$c_n = c_{n+1} = c_{n+2} = \cdots = 0$.

A base $b$ expansion $(.c_1 c_2 c_3 \ldots)_b$ is periodic if there are positive integers $N$ and $k$ such
that $c_{n+k} = c_n$ for all $n \ge N$.

The periodic base $b$ expansion $(.c_1 c_2 \ldots c_{N-1} c_N \ldots c_{N+k-1} c_N \ldots c_{N+k-1} c_N \ldots)_b$ is
denoted by $(.c_1 c_2 \ldots c_{N-1} \overline{c_N \ldots c_{N+k-1}})_b$. The part of the periodic base $b$ expansion
preceding the periodic part is the pre-period and the periodic part is the period,
where the period and pre-period are taken to have minimal possible length.


Facts: 

1. If $b$ is a positive integer greater than 1, then every positive integer $n$ has a unique
base $b$ expansion.

2. Converting from base 10 to base b: Take the positive integer $n$ and divide it by $b$ to
obtain $n = b q_0 + a_0$, $0 \le a_0 < b$. Then divide $q_0$ by $b$ to obtain $q_0 = b q_1 + a_1$, $0 \le a_1 < b$.
Continue this process, successively dividing the quotients by $b$, until a quotient of zero
is obtained, after $k$ steps. The base $b$ expansion of $n$ is then $(a_{k-1} \ldots a_1 a_0)_b$. (See
Algorithm 1.)

3. Converting from base 2 to base $2^k$: Group the bits in the base 2 expansion into
blocks of $k$ bits, starting from the right, and then convert each block of $k$ bits into a
base $2^k$ digit. For example, converting from binary (base 2) to octal (base 8) is done
by grouping the bits of the binary expansion into blocks of 3 bits starting from the
right and converting each block into an octal digit. Similarly, converting from binary to
hexadecimal (base 16) is done by grouping the bits of the binary expansion into blocks
of 4 bits starting from the right and converting each block into a hex digit.

4. Converting from base $2^k$ to binary (base 2): Convert each base $2^k$ digit into a block
of $k$ bits and string together these bits in the order the original digits appear. For
example, to convert from hexadecimal to binary, convert each hex digit into the block
of four bits that represent this hex digit and then string together these blocks of four
bits in the correct order.

5. Every positive integer can be expressed uniquely as the sum of distinct powers of 
two. This follows since every positive integer has a unique base two expansion, with the 
digits either 0 or 1. 

6. There are $\lfloor \log_b n \rfloor + 1$ digits in the base $b$ expansion of the positive integer $n$.

7. The number $x$ with one’s complement representation $(a_{n-1} a_{n-2} \ldots a_1 a_0)$ can be
found using the equation

\[ x = -a_{n-1} (2^{n-1} - 1) + \sum_{i=0}^{n-2} a_i 2^i. \]

8. The number $x$ with two’s complement representation $(a_{n-1} a_{n-2} \ldots a_1 a_0)$ can be
found using the equation

\[ x = -a_{n-1} \cdot 2^{n-1} + \sum_{i=0}^{n-2} a_i 2^i. \]

9. Two’s complement representations of integers are often used by computers because 
addition and subtraction of integers, where these integers may be either positive or 
negative, can be performed easily using these representations. 

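10. Define a function $\mathrm{Lg}\, n$ by the rule

\[ \mathrm{Lg}\, n = \begin{cases} 1 & \text{if } n = 0; \\ 1 + \lfloor \log_2 |n| \rfloor & \text{if } n \ne 0. \end{cases} \]

Then $\mathrm{Lg}\, n$ is the number of bits in the base 2 expansion of $n$, not counting the sign bit.
(Compare with Fact 6.)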

11. The bit operations for the basic operations are given in the following table, adapted
from [BaSh96]. This table displays the number of bit operations used by the standard,
naive algorithms, doing things bit by bit (addition with carries, subtraction with
borrows, standard multiplication by each bit and shifting and adding, and standard
division), and a big-oh estimate for the number of bit operations required to do the
operations using the algorithm with the currently best known computational complexity.
(The function Lg is defined in Fact 10; the function $\mu(m, n)$ is defined by the rule
$\mu(m, n) = m (\mathrm{Lg}\, n)(\mathrm{Lg}\, \mathrm{Lg}\, n)$ if $m \ge n$ and $\mu(m, n) = n (\mathrm{Lg}\, m)(\mathrm{Lg}\, \mathrm{Lg}\, m)$ otherwise.)

operation | number of bit operations (following naive algorithm) | best known complexity (sophisticated algorithm)
-----|-----|-----
$a \pm b$ | $\mathrm{Lg}\, a + \mathrm{Lg}\, b$ | $O(\mathrm{Lg}\, a + \mathrm{Lg}\, b)$
$a \cdot b$ | $\mathrm{Lg}\, a \cdot \mathrm{Lg}\, b$ | $O(\mu(\mathrm{Lg}\, a, \mathrm{Lg}\, b))$
$a = qb + r$ | $\mathrm{Lg}\, q \cdot \mathrm{Lg}\, b$ | $O(\mu(\mathrm{Lg}\, q, \mathrm{Lg}\, b))$

12. If $b$ is a positive integer greater than 1 and $x$ is a real number with $0 \le x < 1$,
then $x$ can be uniquely written as $x = \sum_{j=1}^{\infty} \frac{c_j}{b^j}$ where $c_j$ is a nonnegative integer less
than $b$ for all $j$, with the restriction that for every positive integer $N$ there is an integer $n$
with $n > N$ and $c_n \ne b - 1$ (in other words, it is not the case that from some point on,
all the coefficients are $b - 1$).

13. A periodic or terminating base $b$ expansion, where $b$ is a positive integer, represents
a rational number.

14. The base $b$ expansion of a rational number, where $b$ is a positive integer, either
terminates or is periodic.

15. If $0 < x < 1$, $x = \frac{r}{s}$ where $r$ and $s$ are relatively prime positive integers, and
$s = TU$ where every prime factor of $T$ divides $b$ and $\gcd(U, b) = 1$, then the period
length of the base $b$ expansion of $x$ is $\mathrm{ord}_U b$ (defined in §4.7.1) and the pre-period
length is the smallest positive integer $N$ such that $T$ divides $b^N$.

16. The period length of the base $b$ expansion of $\frac{1}{m}$ ($b$ and $m$ positive integers greater
than 1) is $m - 1$ if and only if $m$ is prime and $b$ is a primitive root of $m$. (See §4.7.1.)
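
The decoding formulas of Facts 7 and 8 can be sketched in Python as follows. The function names and the choice to pass the representation as a string of 0s and 1s are assumptions made for this illustration.

```python
def from_ones_complement(bits: str) -> int:
    """Decode an n-bit one's complement string using the formula of Fact 7."""
    n = len(bits)
    low = int(bits[1:], 2) if n > 1 else 0      # sum of a_i * 2^i for i <= n-2
    return -int(bits[0]) * (2 ** (n - 1) - 1) + low


def from_twos_complement(bits: str) -> int:
    """Decode an n-bit two's complement string using the formula of Fact 8."""
    n = len(bits)
    low = int(bits[1:], 2) if n > 1 else 0
    return -int(bits[0]) * 2 ** (n - 1) + low


def to_twos_complement(x: int, n: int) -> str:
    """Encode x as an n-bit two's complement string."""
    if not -(2 ** (n - 1)) <= x <= 2 ** (n - 1) - 1:
        raise ValueError("x does not fit in n bits")
    # Reducing x modulo 2^n gives x itself when x >= 0 and 2^n - |x| when x < 0.
    return format(x % 2 ** n, f"0{n}b")


print(from_ones_complement("101110111"))   # -136
print(from_twos_complement("101110111"))   # -137
print(to_twos_complement(-113, 9))         # 110001111
```

Encoding by reduction modulo $2^n$ reflects the reason two's complement is convenient for hardware: addition and subtraction of signed integers reduce to arithmetic modulo $2^n$.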

Examples: 

1. The binary (base 2), octal (base 8), and hexadecimal (base 16) expansions of the
integer 2001 are $(11111010001)_2$, $(3721)_8$, and $(7D1)_{16}$, respectively. The octal and
hexadecimal expansions can be obtained from the binary expansion by grouping together,
from the right, the bits of the binary expansion into groups of 3 bits and 4 bits,
respectively.

2. The hexadecimal expansion 2FB3 can be converted to a binary expansion by replacing
each hex digit by a block of four bits to give 10111110110011. (The initial two
0s in the four bit expansion of the initial hex digit 2 are omitted.)

3. The binary coded decimal expansion of 729 is 011100101001.

4. The nine-bit one’s complement expansions of 214 and $-113$ (taking $n = 9$ in the
definition) are 011010110 and 110001110.

5. The nine-bit two’s complement expansions of 214 and $-113$ (taking $n = 9$ in the
definition) are 011010110 and 110001111.

6. By Fact 7 the integer with a nine-bit one’s complement representation of 101110111
equals $-1(256 - 1) + 119 = -136$.

7. By Fact 8 the integer with a nine-bit two’s complement representation of 101110111
equals $-256 + 119 = -137$.

8. By Fact 15 the pre-period of the decimal expansion of $\frac{5}{28}$ has length 2 and the
period has length 6 since $28 = 4 \cdot 7$ and $\mathrm{ord}_7 10 = 6$. This is verified by noting that
$\frac{5}{28} = (.17\overline{857142})_{10}$.
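
The pre-period and period lengths described in Fact 15 can be computed directly from the factorization $s = TU$. The following Python sketch is one way to do this; the function name `preperiod_and_period` is hypothetical. It returns a pre-period length of 0 when $T = 1$ and a period length of 0 when the expansion terminates, which is a convention adopted here rather than one stated in the text.

```python
from math import gcd


def preperiod_and_period(r: int, s: int, b: int = 10) -> tuple[int, int]:
    """Return (pre-period length, period length) of the base b expansion of r/s, 0 < r/s < 1."""
    if not (0 < r < s) or b < 2:
        raise ValueError("expected 0 < r < s and b >= 2")
    s //= gcd(r, s)                  # reduce r/s to lowest terms

    # Split s = T * U, where every prime factor of T divides b and gcd(U, b) = 1.
    T, U = 1, s
    g = gcd(U, b)
    while g > 1:
        U //= g
        T *= g
        g = gcd(U, b)

    # Pre-period: the least N >= 0 such that T divides b^N.
    N, power = 0, 1
    while power % T != 0:
        power *= b
        N += 1

    # Period: the order of b modulo U (taken to be 0 if U = 1, a terminating expansion).
    if U == 1:
        return N, 0
    period, x = 1, b % U
    while x != 1:
        x = (x * b) % U
        period += 1
    return N, period


print(preperiod_and_period(5, 28))   # (2, 6)
print(preperiod_and_period(1, 7))    # (0, 6)
```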


4.2 GREATEST COMMON DIVISORS 

The concept of the greatest common divisor of two integers plays an important role in 
number theory. The Euclidean algorithm, an algorithm for computing greatest common 
divisors, was known in ancient times and was one of the first algorithms that was studied 
for what is now called its computational complexity. The Euclidean algorithm and its 
extensions are used extensively in number theory and its applications, including those to 
cryptography. For more information about the contents of this section consult [HaWr89], 
[NiZuMo91], or [Ro99]. 


4.2.1 INTRODUCTION


Definitions: 

The greatest common divisor of the integers $a$ and $b$, not both zero, written $\gcd(a, b)$,
is the largest integer that divides both $a$ and $b$.

The integers $a$ and $b$ are relatively prime (or coprime) if they have no positive
divisors in common other than 1, i.e., if $\gcd(a, b) = 1$.

The greatest common divisor of the integers $a_i$, $i = 1, 2, \ldots, k$, not all zero, written
$\gcd(a_1, a_2, \ldots, a_k)$, is the largest integer that divides all the integers $a_i$.

The integers $a_1, a_2, \ldots, a_k$ are pairwise relatively prime if $\gcd(a_i, a_j) = 1$ for $i \ne j$.
The integers $a_1, a_2, \ldots, a_k$ are mutually relatively prime if $\gcd(a_1, a_2, \ldots, a_k) = 1$.

The least common multiple of nonzero integers $a$ and $b$, written $\mathrm{lcm}(a, b)$, is the
smallest positive integer that is a multiple of both $a$ and $b$.

The least common multiple of nonzero integers $a_1, \ldots, a_k$, written $\mathrm{lcm}(a_1, \ldots, a_k)$,
is the smallest positive integer that is a multiple of all the integers $a_i$, $i = 1, 2, \ldots, k$.

The Farey series of order n is the set of fractions $\frac{h}{k}$ where $h$ and $k$ are integers,
$0 \le h \le k \le n$, $k \ne 0$, and $\gcd(h, k) = 1$, in ascending order, with 0 and 1 included in
the forms $\frac{0}{1}$ and $\frac{1}{1}$, respectively.

Facts: 

1. If $d \mid a$ and $d \mid b$, then $d \mid \gcd(a, b)$.

2. If $a \mid m$ and $b \mid m$, then $\mathrm{lcm}(a, b) \mid m$.

3. If $a$ is a positive integer, then $\gcd(0, a) = a$.

4. If $a$ and $b$ are positive integers with $a < b$, then $\gcd(a, b) = \gcd(b \bmod a, a)$.

5. If $a$ and $b$ are integers with $\gcd(a, b) = d$, then $\gcd(\frac{a}{d}, \frac{b}{d}) = 1$.

6. If $a$, $b$, and $c$ are integers, then $\gcd(a + cb, b) = \gcd(a, b)$.

7. If $a$, $b$, and $c$ are integers with not both $a$ and $b$ zero and $c \ne 0$, then $\gcd(ac, bc) =
|c| \gcd(a, b)$.

8. If $a$ and $b$ are integers with $\gcd(a, b) = 1$, then $\gcd(a + b, a - b) = 1$ or 2. (This
greatest common divisor is 2 when both $a$ and $b$ are odd.)

9. If $a$, $b$, and $c$ are integers with $\gcd(a, b) = \gcd(a, c) = 1$, then $\gcd(a, bc) = 1$.

10. If $a$, $b$, and $c$ are mutually relatively prime nonzero integers, then $\gcd(a, bc) =
\gcd(a, b) \cdot \gcd(a, c)$.

11. If $a$ and $b$ are integers, not both zero, then $\gcd(a, b)$ is the least positive integer of
the form $ma + nb$ where $m$ and $n$ are integers.

12. The probability that two randomly selected integers are relatively prime is $\frac{6}{\pi^2}$.
More precisely, if $R(n)$ equals the number of pairs of integers $a$, $b$ with $1 \le a \le n$,
$1 \le b \le n$, and $\gcd(a, b) = 1$, then $\frac{R(n)}{n^2} = \frac{6}{\pi^2} + O\left(\frac{\log n}{n}\right)$.

13. If $a$ and $b$ are positive integers, then $\gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a, b)} - 1$.

14. If $a$, $b$, and $c$ are integers and $a \mid bc$ and $\gcd(a, b) = 1$, then $a \mid c$.

15. If $a$, $b$, and $c$ are integers, $a \mid c$, $b \mid c$, and $\gcd(a, b) = 1$, then $ab \mid c$.

16. If $a_1, a_2, \ldots, a_k$ are integers, not all zero, then $\gcd(a_1, \ldots, a_k)$ is the least positive
integer that is a linear combination with integer coefficients of $a_1, \ldots, a_k$.

17. If $a_1, a_2, \ldots, a_k$ are integers, not all zero, and $d \mid a_i$ for $i = 1, 2, \ldots, k$, then
$d \mid \gcd(a_1, a_2, \ldots, a_k)$.

18. If $a_1, \ldots, a_n$ are integers, not all zero, then the greatest common divisor of these $n$
integers is the same as the greatest common divisor of the set of $n - 1$ integers made
up of the first $n - 2$ integers and the greatest common divisor of the last two. That is,
$\gcd(a_1, \ldots, a_n) = \gcd(a_1, \ldots, a_{n-2}, \gcd(a_{n-1}, a_n))$.

19. If $a$ and $b$ are nonzero integers and $m$ is a positive integer, then $\mathrm{lcm}(ma, mb) =
m \cdot \mathrm{lcm}(a, b)$.

20. If $b$ is a common multiple of the integers $a_1, a_2, \ldots, a_k$, then $b$ is a multiple of
$\mathrm{lcm}(a_1, \ldots, a_k)$.

21. The common multiples of the integers $a_1, \ldots, a_k$ are the integers 0, $\mathrm{lcm}(a_1, \ldots, a_k)$,
$2 \cdot \mathrm{lcm}(a_1, \ldots, a_k)$, ....

22. If $a_1, a_2, \ldots, a_n$ are pairwise relatively prime integers, then $\mathrm{lcm}(a_1, \ldots, a_n) =
a_1 a_2 \cdots a_n$.

23. If $a_1, a_2, \ldots, a_n$ are integers, not all zero, then $\mathrm{lcm}(a_1, a_2, \ldots, a_{n-1}, a_n) =
\mathrm{lcm}(\mathrm{lcm}(a_1, a_2, \ldots, a_{n-1}), a_n)$.

24. If $a = p_1^{a_1} p_2^{a_2} \cdots p_n^{a_n}$ and $b = p_1^{b_1} p_2^{b_2} \cdots p_n^{b_n}$, where the $p_i$ are distinct primes
for $i = 1, \ldots, n$, and each exponent is a nonnegative integer, then

\[ \gcd(a, b) = p_1^{\min(a_1, b_1)} p_2^{\min(a_2, b_2)} \cdots p_n^{\min(a_n, b_n)}, \]

where $\min(x, y)$ denotes the minimum of $x$ and $y$, and

\[ \mathrm{lcm}(a, b) = p_1^{\max(a_1, b_1)} p_2^{\max(a_2, b_2)} \cdots p_n^{\max(a_n, b_n)}, \]

where $\max(x, y)$ denotes the maximum of $x$ and $y$.

25. If $a$ and $b$ are positive integers, then $ab = \gcd(a, b) \cdot \mathrm{lcm}(a, b)$.

26. If $a$, $b$, and $c$ are positive integers, then

\[ \mathrm{lcm}(a, b, c) = \frac{abc \cdot \gcd(a, b, c)}{\gcd(a, b) \cdot \gcd(a, c) \cdot \gcd(b, c)}. \]

27. If $a$, $b$, and $c$ are positive integers, then $\gcd(\mathrm{lcm}(a, b), \mathrm{lcm}(a, c)) = \mathrm{lcm}(a, \gcd(b, c))$
and $\mathrm{lcm}(\gcd(a, b), \gcd(a, c)) = \gcd(a, \mathrm{lcm}(b, c))$.

28. If $\frac{a}{b}$, $\frac{c}{d}$, and $\frac{e}{f}$ are successive terms of a Farey series, then $\frac{c}{d} = \frac{a + e}{b + f}$.

29. If $\frac{a}{b}$ and $\frac{c}{d}$ are successive terms of a Farey series, then $ad - bc = -1$.

30. If $\frac{a}{b}$ and $\frac{c}{d}$ are successive terms of a Farey series of order $n$, then $b + d > n$.

31. Farey series are named after an English geologist who published a note describing 
their properties in the Philosophical Magazine in 1816. The eminent French mathemati- 
cian Cauchy supplied proofs of the properties stated, but not proved, by Farey. Also, 
according to [Di71], these properties had been stated and proved by Haros in 1802. 
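
The prime-exponent formulas of Fact 24 and the identities of Facts 13, 25, and 26 can be spot-checked numerically. The sketch below is illustrative only: `factorize` uses plain trial division, which is adequate for small inputs but is not one of the factoring methods of §4.5.1, and all function names are hypothetical.

```python
from math import gcd


def factorize(n: int) -> dict[int, int]:
    """Return the prime-power factorization of n >= 1 as {prime: exponent}."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                          # whatever remains is a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_lcm_from_factorizations(a: int, b: int) -> tuple[int, int]:
    """Compute (gcd, lcm) of positive integers via min and max of prime exponents."""
    fa, fb = factorize(a), factorize(b)
    g = l = 1
    for p in set(fa) | set(fb):
        g *= p ** min(fa.get(p, 0), fb.get(p, 0))
        l *= p ** max(fa.get(p, 0), fb.get(p, 0))
    return g, l


def lcm3(a: int, b: int, c: int) -> int:
    """Least common multiple of three positive integers via the formula of Fact 26."""
    g3 = gcd(gcd(a, b), c)
    return a * b * c * g3 // (gcd(a, b) * gcd(a, c) * gcd(b, c))


g, l = gcd_lcm_from_factorizations(12, 15)
print(g, l)                            # 3 60
print(g * l == 12 * 15)                # True (Fact 25)
print(lcm3(4, 6, 10))                  # 60
```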


Examples: 

1. gcd(12, 15) = 3, gcd(14, 25) = 1, gcd(0, 100) = 100, and gcd(3, 39) = 3. 

2. $\gcd(2^7 3^3 5^4 7^2 11^3 17^3,\ 2^4 3^5 5^2 7^2 11^2 13^3) = 2^4 3^3 5^2 7^2 11^2$.

3. $\mathrm{lcm}(2^7 3^3 5^4 7^2 11^3 17^3,\ 2^4 3^5 5^2 7^2 11^2 13^3) = 2^7 3^5 5^4 7^2 11^3 13^3 17^3$.

4. gcd(18, 24, 36) = 6 and gcd(10, 25, 35, 245) = 5. 

5. The integers 15, 21, and 35 are mutually relatively prime since gcd(15, 21, 35) = 1. 
However, they are not pairwise relatively prime since gcd(15, 35) = 5. 

6. The integers 6, 35, and 143 are both mutually relatively prime and pairwise relatively 
prime. 

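7. The Farey series of order 5 is $\frac{0}{1}, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{2}{5}, \frac{1}{2}, \frac{3}{5}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \frac{1}{1}$.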
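
A Farey series can be generated directly from its definition, and the properties in Facts 28 to 30 can then be verified on the result. This is a brute-force Python sketch rather than an efficient method; the function name `farey` is hypothetical.

```python
from fractions import Fraction
from math import gcd


def farey(n: int) -> list[Fraction]:
    """Return the Farey series of order n as a sorted list of fractions h/k."""
    terms = {
        Fraction(h, k)
        for k in range(1, n + 1)
        for h in range(0, k + 1)
        if gcd(h, k) == 1            # keep only fractions in lowest terms
    }
    return sorted(terms)


series = farey(5)
print([(f.numerator, f.denominator) for f in series])

# Fact 29: for successive terms a/b and c/d, ad - bc = -1.
# Fact 30: for successive terms a/b and c/d of order n, b + d > n.
for x, y in zip(series, series[1:]):
    assert x.numerator * y.denominator - x.denominator * y.numerator == -1
    assert x.denominator + y.denominator > 5
```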


4.2.2 THE EUCLIDEAN ALGORITHM


Finding the greatest common divisor of two integers is one of the most common problems 
in number theory and its applications. An algorithm for this task was known in ancient 
times by Euclid. His algorithm and its extensions are among the most commonly used 
algorithms. For more information about these algorithms see [BaSh96] or [Kn81]. 

Definition: 

The Euclidean algorithm is an algorithm that computes the greatest common divisor 
of two integers a and b with a < b, by replacing them with a and b mod a, and repeating 
this step until one of the integers reached is zero. 

Facts: 

1. The Euclidean algorithm: The greatest common divisor of two positive integers can
be computed using the recurrence in §4.2.1 Fact 4, together with §4.2.1 Fact 3. The
resulting algorithm proceeds by successively replacing a pair of positive integers with
a new pair of integers formed from the smaller of the two integers and the remainder
when the larger is divided by the smaller, stopping once a zero remainder is reached.
The last nonzero remainder is the greatest common divisor of the original two integers.
(See Algorithm 1.)


Algorithm 1: The Euclidean algorithm.

procedure gcd(a, b: positive integers)
r_0 := a
r_1 := b
i := 1
while r_i ≠ 0
begin
    r_{i+1} := r_{i-1} mod r_i
    i := i + 1
end {gcd(a, b) is r_{i-1}}


2. Lamé’s theorem: The number of divisions needed to find the greatest common
divisor of two positive integers using the Euclidean algorithm does not exceed five times
the number of decimal digits in the smaller of the two integers. (This was proved by
Gabriel Lamé (1795-1870).) (See [BaSh96] or [Ro99] for a proof.)

3. The Euclidean algorithm finds the greatest common divisor of the Fibonacci numbers
(§3.1.2) $F_{n+1}$ and $F_{n+2}$ (where n is a positive integer) using exactly n division steps.
If the Euclidean algorithm uses exactly n division steps to find the greatest common
divisor of the positive integers a and b (with a < b), then $a \geq F_{n+1}$ and $b \geq F_{n+2}$.

4. The Euclidean algorithm uses $O((\log b)^3)$ bit operations to find the greatest common
divisor of two integers a and b with a < b.

5. The Euclidean algorithm uses $O(\log a \cdot \log b)$ bit operations to find the greatest
common divisor of two integers a and b.

6. Least remainder Euclidean algorithm: The greatest common divisor of two integers
a and b (with a < b) can be found by replacing a and b with a and the least
remainder of b when divided by a. (The least remainder of b when divided by a is
the integer of smallest absolute value congruent to b modulo a. It equals b mod a if
$b \bmod a \leq a/2$, and $(b \bmod a) - a$ if $b \bmod a > a/2$.) Repeating this procedure until a
remainder of zero is reached produces the greatest common divisor of a and b as the
absolute value of the last nonzero remainder.


Algorithm 2: The extended Euclidean algorithm.

procedure gcdex(a, b: positive integers)
r_0 := a
r_1 := b
m_0 := 1
m_1 := 0
n_0 := 0
n_1 := 1
i := 1
while r_i ≠ 0
begin
    r_{i+1} := r_{i-1} mod r_i
    m_{i+1} := m_{i-1} - ⌊r_{i-1} / r_i⌋ m_i
    n_{i+1} := n_{i-1} - ⌊r_{i-1} / r_i⌋ n_i
    i := i + 1
end {gcd(a, b) is r_{i-1} and gcd(a, b) = m_{i-1} a + n_{i-1} b}


7. The number of divisions used by the least remainder Euclidean algorithm to find
the greatest common divisor of two integers is less than or equal to the number of
divisions used by the Euclidean algorithm to find this greatest common divisor.

8. Binary greatest common divisor algorithm: The greatest common divisor of two
integers a and b can also be found using an algorithm known as the binary greatest
common divisor algorithm. It is based on the following reductions: if a and b are both
even, then $\gcd(a, b) = 2\gcd(a/2, b/2)$; if a is even and b is odd, then $\gcd(a, b) = \gcd(a/2, b)$
(and if a is odd and b is even, switch them); and if a and b are both odd, then
$\gcd(a, b) = \gcd(|a - b|/2, b)$. To stop, the algorithm uses the rule that gcd(a, a) = a.

9. Extended Euclidean algorithm: The extended Euclidean algorithm finds gcd(a, b)
and expresses it in the form gcd(a, b) = ma + nb for some integers m and n. The two-pass
version proceeds by first working through the steps of the Euclidean algorithm to find
gcd(a, b), and then working backwards through the steps to express gcd(a, b) as a linear
combination of each pair of successive remainders until the original integers a and b
are reached. The one-pass version of this algorithm keeps track of how each successive
remainder can be expressed as a linear combination of a and b. When the
last step is reached, both gcd(a, b) and integers m and n with gcd(a, b) = ma + nb are
produced. The one-pass version is displayed as Algorithm 2.
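
The two algorithm boxes can be sketched in Python as follows. This is a minimal illustration, not the handbook's own code: the names `gcd` and `gcdex` mirror the procedure names of Algorithms 1 and 2, and the rolling pairs `(r_prev, r)`, `(m_prev, m)`, `(n_prev, n)` are an implementation choice standing in for the indexed sequences r_i, m_i, n_i.

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm for positive integers (after Algorithm 1)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    r_prev, r = a, b                    # r_{i-1}, r_i
    while r != 0:
        r_prev, r = r, r_prev % r       # r_{i+1} := r_{i-1} mod r_i
    return r_prev                       # last nonzero remainder


def gcdex(a: int, b: int) -> tuple[int, int, int]:
    """One-pass extended Euclidean algorithm (after Algorithm 2).

    Returns (g, m, n) with g = gcd(a, b) = m*a + n*b.
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    r_prev, r = a, b
    m_prev, m = 1, 0                    # coefficients of a
    n_prev, n = 0, 1                    # coefficients of b
    while r != 0:
        q = r_prev // r                 # floor(r_{i-1} / r_i)
        r_prev, r = r, r_prev - q * r
        m_prev, m = m, m_prev - q * m
        n_prev, n = n, n_prev - q * n
    return r_prev, m_prev, n_prev


g, m, n = gcdex(53, 77)
print(g, m, n)   # 1 16 -11, that is, 1 = 16*53 - 11*77
```

Throughout the loop each remainder satisfies r_i = m_i a + n_i b, which is why the coefficients attached to the last nonzero remainder express the greatest common divisor.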

Examples: 

1. When the Euclidean algorithm is used to find gcd(53, 77), the following steps result:

77 = 1 · 53 + 24,
53 = 2 · 24 + 5,
24 = 4 · 5 + 4,
5 = 1 · 4 + 1,
4 = 4 · 1.

This shows that gcd(53, 77) = 1. Working backwards through these steps to perform
the two-pass version of the extended Euclidean algorithm gives

1 = 5 - 1 · 4
  = 5 - 1 · (24 - 4 · 5) = 5 · 5 - 1 · 24
  = 5 · (53 - 2 · 24) - 1 · 24 = 5 · 53 - 11 · 24
  = 5 · 53 - 11 · (77 - 1 · 53) = 16 · 53 - 11 · 77.

2. The steps of the least-remainder algorithm when used to compute gcd(57, 93) are

gcd(57, 93) = gcd(57, 21) = gcd(21, 6) = gcd(6, 3) = 3.

3. The steps of the binary GCD algorithm when used to compute gcd(108, 194) are

gcd(108, 194) = 2 · gcd(54, 97) = 2 · gcd(27, 97) = 2 · gcd(27, 35)
              = 2 · gcd(4, 35) = 2 · gcd(2, 35) = 2 · gcd(1, 35) = 2.
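
The least-remainder and binary variants used in Examples 2 and 3 might be sketched as below. The function names are hypothetical. In the both-odd case the binary sketch keeps the smaller of the two arguments next to |a - b|/2; that is an assumption of this sketch, chosen because it makes the larger argument strictly decrease, whereas the reduction as stated leaves the choice open.

```python
def least_remainder_gcd(a: int, b: int) -> int:
    """gcd of positive integers using least (absolute) remainders."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    while a != 0:
        r = b % a
        if 2 * r > a:          # least remainder is (b mod a) - a; keep its absolute value
            r = a - r
        a, b = r, a
    return b


def binary_gcd(a: int, b: int) -> int:
    """gcd of positive integers using only halving and subtraction."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    factor = 1
    while a != b:                              # stopping rule: gcd(a, a) = a
        if a % 2 == 0 and b % 2 == 0:
            a, b, factor = a // 2, b // 2, 2 * factor
        elif a % 2 == 0:
            a //= 2
        elif b % 2 == 0:
            b //= 2
        else:                                  # both odd
            a, b = abs(a - b) // 2, min(a, b)
    return factor * a


print(least_remainder_gcd(57, 93))   # 3
print(binary_gcd(108, 194))          # 2
```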


4.3 CONGRUENCES 


4.3.1 INTRODUCTION 

Definitions: 

If m is a positive integer and a and b are integers, then a is congruent to b modulo m,
written $a \equiv b \pmod{m}$, if m divides a - b. If m does not divide a - b, a and b are
incongruent modulo m, written $a \not\equiv b \pmod{m}$.

A complete system of residues modulo m is a set of integers such that every integer
is congruent modulo m to exactly one of the integers in the set.

If m is a positive integer and a is an integer with a = bm + r, where $0 \leq r \leq m - 1$,
then r is the least nonnegative residue of a modulo m. When a is not divisible
by m, r is the least positive residue of a modulo m.

The congruence class of a modulo m is the set of integers congruent to a modulo m
and is written $[a]_m$. Any integer in $[a]_m$ is called a representative of this class.

If m is a positive integer and a is an integer relatively prime to m, then $\bar{a}$ is an
inverse of a modulo m if $a\bar{a} \equiv 1 \pmod{m}$. An inverse of a modulo m is also written
$a^{-1} \bmod m$.

If m is a positive integer, then a reduced residue system modulo m is a set of
integers such that every integer relatively prime to m is congruent modulo m to exactly
one integer in the set.

If m is a positive integer, the set of congruence classes modulo m is written $Z_m$. (See
§5.2.1.)

If m is a positive integer greater than 1, the set of congruence classes of elements
relatively prime to m is written $Z_m^*$, that is, $Z_m^* = \{\, [a]_m \in Z_m \mid \gcd(a, m) = 1 \,\}$. (See
§5.2.1.)

Facts:

1. If m is a positive integer and a, b, and c are integers, then:

• $a \equiv a \pmod{m}$;

• $a \equiv b \pmod{m}$ if and only if $b \equiv a \pmod{m}$;

• if $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$, then $a \equiv c \pmod{m}$.

Consequently, congruence modulo m is an equivalence relation. (See §1.4.2 and §5.2.1.)

2. If m is a positive integer and a is an integer, then m divides a if and only if
$a \equiv 0 \pmod{m}$.

3. If m is a positive integer and a and b are integers with $a \equiv b \pmod{m}$, then
gcd(a, m) = gcd(b, m).

4. If a, b, c, and m are integers with m positive and $a \equiv b \pmod{m}$, then
$a + c \equiv b + c \pmod{m}$, $a - c \equiv b - c \pmod{m}$, and $ac \equiv bc \pmod{m}$.

5. If m is a positive integer and a, b, c, and d are integers with $a \equiv b \pmod{m}$ and
$c \equiv d \pmod{m}$, then $ac \equiv bd \pmod{m}$.

6. If a, b, c, and m are integers, m is positive, d = gcd(c, m), and $ac \equiv bc \pmod{m}$,
then $a \equiv b \pmod{m/d}$.

7. If a, b, c, and m are integers, m is positive, c and m are relatively prime, and
$ac \equiv bc \pmod{m}$, then $a \equiv b \pmod{m}$.

8. If a, b, k, and m are integers with k and m positive and $a \equiv b \pmod{m}$, then
$a^k \equiv b^k \pmod{m}$.

9. If a, b, and m are integers with $a \equiv b \pmod{m}$, then if c is an integer, it does not
necessarily follow that $c^a \equiv c^b \pmod{m}$.

10. If $f(x_1, \ldots, x_n)$ is a polynomial with integer coefficients and $a_1, \ldots, a_n$, $b_1, \ldots, b_n$
are integers with $a_i \equiv b_i \pmod{m}$ for all i, then $f(a_1, \ldots, a_n) \equiv f(b_1, \ldots, b_n) \pmod{m}$.

11. If a, b, and $m_i$ are integers with $m_i$ positive and $a \equiv b \pmod{m_i}$ for i = 1, 2, ..., k,
then $a \equiv b \pmod{\operatorname{lcm}(m_1, m_2, \ldots, m_k)}$.

12. If a and b are integers, $m_i$ (i = 1, 2, ..., k) are pairwise relatively prime positive
integers, and $a \equiv b \pmod{m_i}$ for i = 1, 2, ..., k, then $a \equiv b \pmod{m_1 m_2 \cdots m_k}$.

13. The congruence class $[a]_m$ is the set of integers $\{a, a \pm m, a \pm 2m, \ldots\}$. If
$a \equiv b \pmod{m}$, then $[a]_m = [b]_m$. The congruence classes modulo m are the equivalence
classes of the congruence modulo m equivalence relation. (See §5.2.1.)

14. Addition, subtraction, and multiplication of congruence classes modulo m, where
m is a positive integer, are defined by $[a]_m + [b]_m = [a + b]_m$, $[a]_m - [b]_m = [a - b]_m$,
and $[a]_m [b]_m = [ab]_m$. Each of these operations is well defined, in the sense that using
representatives of the congruence classes other than a and b does not change the resulting
congruence class.

15. If m is a positive integer, then $(Z_m, +)$, where + is the operation of addition of
congruence classes defined in Fact 14 and in §5.2.1, is an abelian group. The identity
element in this group is $[0]_m$ and the inverse of $[a]_m$ is $[-a]_m = [m - a]_m$.

16. If m is a positive integer greater than 1 and a is relatively prime to m, then a has
an inverse modulo m.

17. An inverse of a modulo m, where m is a positive integer and gcd(a, m) = 1, may
be found by using the extended Euclidean algorithm to find integers x and y such that
ax + my = 1, which implies that x is an inverse of a modulo m.

18. If m is a positive integer, then $(Z_m^*, \cdot)$, where · is the multiplication operation on
congruence classes, is an abelian group. (See §5.2.1.) The identity element of this group
is $[1]_m$ and the inverse of the class $[a]_m$ is the class $[\bar{a}]_m$, where $\bar{a}$ is an inverse of a
modulo m.

19. If $a_i$ (i = 1, ..., m) is a complete residue system modulo m, where m is a positive
integer, and r and s are integers with gcd(m, r) = 1, then $r a_i + s$ (i = 1, ..., m) is a
complete system of residues modulo m.

20. If a and b are integers and m is a positive integer with $0 \leq a < m$ and $0 \leq b < m$,
then (a + b) mod m = a + b if a + b < m, and (a + b) mod m = a + b - m if $a + b \geq m$.

21. Computing the least positive residue modulo m of powers of integers is important in
cryptology (see Chapter 14). An efficient algorithm for computing $b^n \bmod m$, where n
is a positive integer with binary expansion $n = (a_{k-1} \ldots a_1 a_0)_2$, is to find the least
positive residues of $b, b^2, b^4, \ldots, b^{2^{k-1}}$ modulo m by successively squaring and reducing
modulo m, and then to multiply together the least positive residues modulo m of $b^{2^j}$ for
those j with $a_j = 1$, reducing modulo m after each multiplication.

22. Wilson’s theorem: If p is prime, then $(p - 1)! \equiv -1 \pmod{p}$.

23. If n is a positive integer greater than 1 such that $(n - 1)! \equiv -1 \pmod{n}$, then n is
prime.

24. Fermat’s little theorem: If p is a prime and a is an integer not divisible by p, then
$a^{p-1} \equiv 1 \pmod{p}$.

25. Euler’s theorem: If m is a positive integer and a is an integer relatively prime to m,
then $a^{\phi(m)} \equiv 1 \pmod{m}$, where $\phi(m)$ is the number of positive integers not exceeding m
that are relatively prime to m.

26. If a is an integer and p is a prime that does not divide a, then from Fermat’s little
theorem it follows that $a^{p-2}$ is an inverse of a modulo p.

27. If a and m are relatively prime integers with m > 1, then $a^{\phi(m)-1}$ is an inverse
of a modulo m. This follows directly from Euler’s theorem.

28. Linear congruential method: One of the most common methods used for generating
pseudo-random numbers is the linear congruential method. It starts with integers m,
a, c, and $x_0$ where $2 \leq a < m$, $0 \leq c < m$, and $0 \leq x_0 < m$. The sequence of
pseudo-random numbers is defined recursively by

$x_{n+1} = (a x_n + c) \bmod m, \quad n = 0, 1, 2, 3, \ldots$

Here m is the modulus, a is the multiplier, c is the increment, and $x_0$ is the seed of the
generator.

29. Big-oh estimates for the number of bit operations required to do modular addition,
modular subtraction, modular multiplication, modular inversion, and modular
exponentiation are summarized in the following table.

name                      operation             number of bit operations
modular addition          (a + b) mod m         O(log m)
modular subtraction       (a - b) mod m         O(log m)
modular multiplication    (a · b) mod m         O((log m)^2)
modular inversion         (a^{-1}) mod m        O((log m)^2)
modular exponentiation    a^k mod m, k < m      O((log m)^3)
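
Facts 17, 21, and 28 describe procedures that are short enough to sketch directly. The following Python sketch is illustrative only: the names `mod_pow`, `mod_inverse`, and `lcg` are hypothetical, and the parameters passed to `lcg` at the end are toy values chosen for a short demonstration rather than a recommended generator.

```python
from typing import Iterator


def mod_pow(b: int, n: int, m: int) -> int:
    """b**n mod m by successive squaring (Fact 21); n >= 0, m >= 1."""
    result = 1 % m
    square = b % m              # holds b**(2**j) mod m for the current bit j
    while n > 0:
        if n & 1:               # binary digit a_j of n equals 1
            result = (result * square) % m
        square = (square * square) % m
        n >>= 1
    return result


def mod_inverse(a: int, m: int) -> int:
    """Inverse of a modulo m (m > 1) via the extended Euclidean algorithm (Fact 17)."""
    r_prev, r = m, a % m
    x_prev, x = 0, 1            # invariant: r is congruent to x * a modulo m
    while r != 0:
        q = r_prev // r
        r_prev, r = r, r_prev - q * r
        x_prev, x = x, x_prev - q * x
    if r_prev != 1:
        raise ValueError("a is not relatively prime to m")
    return x_prev % m


def lcg(m: int, a: int, c: int, seed: int) -> Iterator[int]:
    """Linear congruential generator x_{n+1} = (a*x_n + c) mod m (Fact 28)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


print(mod_pow(3, 201, 11))      # 3
print(mod_inverse(53, 77))      # 16
gen = lcg(m=9, a=7, c=4, seed=3)
print([next(gen) for _ in range(5)])   # [7, 8, 6, 1, 2]
```

Reducing after every multiplication keeps all intermediate values below m^2, which is what the bit-operation estimates in the table of Fact 29 rely on.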

Examples:

1. 23 = 5 (mod 9), — 17 = 13 (mod 15), and 99 = 0 (mod 11), but 11 (mod 5), 
—3 (mod 6), and 44 /4=J (mod 7). 

2. To find an inverse of 53 modulo 77, use the extended Euclidean algorithm to obtain
16 · 53 - 11 · 77 = 1 (see Example 1 of §4.2.2). This implies that 16 is an inverse of 53
modulo 77.

3. Since 11 is prime, by Wilson’s theorem it follows that $10! \equiv -1 \pmod{11}$.

4. $5! \equiv 0 \pmod{6}$, which provides an impractical verification that 6 is not prime.

5. To find the least positive residue of $3^{201}$ modulo 11, note that by Fermat’s little
theorem $3^{10} \equiv 1 \pmod{11}$. Hence $3^{201} = (3^{10})^{20} \cdot 3 \equiv 3 \pmod{11}$.

6. Zeller’s congruence: A congruence can be used to determine the day of the week
of any date in the Gregorian calendar, the calendar used in most of the world. Let w
represent the day of the week, with w = 0, 1, 2, 3, 4, 5, 6 for Sunday, Monday, Tuesday,
Wednesday, Thursday, Friday, Saturday, respectively. Let k represent the day of the
month. Let m represent the month with m = 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 for January,
February, March, April, May, June, July, August, September, October, November,
December, respectively. Let N represent the previous year if the month is January or
February or the current year otherwise, with C the century of N and Y the particular
year of the century of N so that N = 100C + Y. Then the day of the week can be found
using the congruence

$w \equiv k + \lfloor 2.6m - 0.2 \rfloor - 2C + Y + \lfloor Y/4 \rfloor + \lfloor C/4 \rfloor \pmod{7}$.

7. January 1, 1900 was a Monday. This follows by Zeller’s congruence with C = 18,
Y = 99, m = 11, and k = 1, noting that to apply this congruence January is considered
the eleventh month of the preceding year. Here 1 + 28 - 36 + 99 + 24 + 4 = 120, and
$120 \equiv 1 \pmod{7}$.
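
Zeller’s congruence translates into a few lines of integer arithmetic. In the sketch below the function name `zeller_weekday` and its calendar-style arguments (year, month 1 to 12, day) are assumptions made for illustration; the month shift and the year adjustment follow the conventions of Example 6, and $\lfloor 2.6m - 0.2 \rfloor$ is computed exactly as (26m - 2) // 10.

```python
def zeller_weekday(year: int, month: int, day: int) -> int:
    """Day of the week of a Gregorian date: 0 = Sunday, ..., 6 = Saturday."""
    m = (month - 3) % 12 + 1                   # March = 1, ..., January = 11, February = 12
    n = year - 1 if month <= 2 else year       # January and February belong to the previous year
    c, y = divmod(n, 100)                      # N = 100*C + Y
    return (day + (26 * m - 2) // 10 - 2 * c + y + y // 4 + c // 4) % 7


names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]
print(names[zeller_weekday(1900, 1, 1)])   # Monday
```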


4.3.2 LINEAR AND POLYNOMIAL CONGRUENCES 


Definitions: 

A linear congruence in one variable is a congruence of the form $ax \equiv b \pmod{m}$,
where a, b, and m are integers, m is positive, and x is an unknown.

If f is a polynomial with integer coefficients, an integer r is a solution of the congruence
$f(x) \equiv 0 \pmod{m}$, or a root of f(x) modulo m, if $f(r) \equiv 0 \pmod{m}$.

Facts: 

1. If a, b, and m are integers, m is positive, and gcd(a, m) = d, then the congruence
$ax \equiv b \pmod{m}$ has exactly d incongruent solutions modulo m if $d \mid b$, and no solutions
if $d \nmid b$.

2. If a, b, and m are integers, m is positive, and gcd(a, m) = 1, then the solutions of
$ax \equiv b \pmod{m}$ are all integers x with $x \equiv \bar{a} b \pmod{m}$, where $\bar{a}$ is an inverse of a
modulo m.

3. If a and b are positive integers and p is a prime that does not divide a, then the
solutions of $ax \equiv b \pmod{p}$ are the integers x with $x \equiv a^{p-2} b \pmod{p}$.

4. Thue’s lemma: If p is a prime and a is an integer not divisible by p, then the
congruence $ax \equiv y \pmod{p}$ has a solution $x_0, y_0$ with $0 < |x_0| < \sqrt{p}$, $0 < |y_0| < \sqrt{p}$.

5. Chinese remainder theorem: If $m_i$, i = 1, 2, ..., r, are pairwise relatively prime
positive integers, then the system of simultaneous congruences $x \equiv a_i \pmod{m_i}$,
i = 1, 2, ..., r, has a unique solution modulo $M = m_1 m_2 \cdots m_r$, which is given by
$x \equiv a_1 M_1 y_1 + a_2 M_2 y_2 + \cdots + a_r M_r y_r$, where $M_k = M/m_k$ and $y_k$ is an inverse of $M_k$
modulo $m_k$, k = 1, 2, ..., r.

6. Problems involving the solution of a system of simultaneous congruences arose in the
writing of ancient mathematicians, including the Chinese mathematician Sun-Tsu, and
in other works by Indian and Greek mathematicians. (See [Di71] for details.)

7. The system of simultaneous congruences $x \equiv a_i \pmod{m_i}$, i = 1, 2, ..., r, has a
solution if and only if $\gcd(m_i, m_j)$ divides $a_i - a_j$ for all pairs of integers (i, j) with
$1 \leq i < j \leq r$. If a solution exists, it is unique modulo $\operatorname{lcm}(m_1, m_2, \ldots, m_r)$.

8. If a, b, c, d, e, f, and m are integers with m positive such that gcd(ad - bc, m) = 1,
then the system of congruences $ax + by \equiv e \pmod{m}$, $cx + dy \equiv f \pmod{m}$ has a
unique solution given by $x \equiv g(de - bf) \pmod{m}$, $y \equiv g(af - ce) \pmod{m}$, where g is
an inverse of ad - bc modulo m.

9. Lagrange’s theorem: If p is prime, then the polynomial $f(x) = a_n x^n + \cdots + a_1 x + a_0$,
where $a_n \not\equiv 0 \pmod{p}$, has at most n roots modulo p.

10. If $f(x) = a_n x^n + \cdots + a_1 x + a_0$, where $a_i$ (i = 1, ..., n) is an integer and p is prime,
has more than n roots modulo p, then p divides $a_i$ for all i = 1, ..., n.

11. If $m_1, m_2, \ldots, m_r$ are pairwise relatively prime positive integers with product
$m = m_1 m_2 \cdots m_r$, and f is a polynomial with integer coefficients, then f(x) has a root
modulo m if and only if f(x) has a root modulo $m_i$ for all i = 1, 2, ..., r. Furthermore,
if f(x) has $n_i$ incongruent roots modulo $m_i$ and n incongruent roots modulo m, then
$n = n_1 n_2 \cdots n_r$.

12. If p is prime, k is a positive integer, and s is a root of f(x) modulo $p^k$, then:

• if $p \nmid f'(s)$, then there is a unique root t of f(x) modulo $p^{k+1}$ with $t \equiv s \pmod{p^k}$,
namely $t = s + p^k u$, where u is the unique solution of $f'(s)u \equiv -f(s)/p^k \pmod{p}$;

• if $p \mid f'(s)$ and $p^{k+1} \mid f(s)$, then there are exactly p incongruent roots of f(x) modulo
$p^{k+1}$ congruent to s modulo $p^k$, given by $s + p^k i$, i = 0, 1, ..., p - 1;

• if $p \mid f'(s)$ and $p^{k+1} \nmid f(s)$, then there are no roots of f(x) modulo $p^{k+1}$ that are
congruent to s modulo $p^k$.

13. Finding roots of a polynomial modulo m, where m is a positive integer: First find
roots of the polynomial modulo $p^r$ for each prime power in the prime-power factorization
of m (Fact 14) and then use the Chinese remainder theorem (Fact 5) to find solutions
modulo m.

14. Finding solutions modulo $p^r$ reduces to first finding solutions modulo p. In
particular, if there are no roots of f(x) modulo p, there are no roots of f(x) modulo $p^r$.
If f(x) has roots modulo p, choose one, say r with $0 \leq r < p$. By Fact 12, corresponding
to r there are 0, 1, or p roots of f(x) modulo $p^2$.
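
Facts 1 and 12 can be sketched in Python as follows. The names `solve_linear_congruence` and `hensel_lift` are hypothetical, as is the convention of passing a polynomial as the coefficient list `[a_0, a_1, ..., a_n]`. The three-argument `pow` with exponent -1 computes a modular inverse and assumes Python 3.8 or later.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int) -> list[int]:
    """All solutions x with 0 <= x < m of a*x = b (mod m), following Fact 1."""
    d = gcd(a, m)
    if b % d != 0:
        return []                               # d does not divide b: no solutions
    a1, b1, m1 = a // d, b // d, m // d         # reduced congruence, gcd(a1, m1) = 1
    x0 = 0 if m1 == 1 else (pow(a1, -1, m1) * b1) % m1
    return [x0 + k * m1 for k in range(d)]      # exactly d incongruent solutions


def hensel_lift(coeffs: list[int], s: int, p: int, k: int) -> list[int]:
    """Roots of f modulo p**(k+1) congruent to s modulo p**k, following Fact 12.

    coeffs lists a_0, a_1, ..., a_n; s must be a root of f modulo p**k.
    """
    def f(x: int) -> int:
        return sum(c * x**i for i, c in enumerate(coeffs))

    def df(x: int) -> int:                      # formal derivative f'
        return sum(i * c * x**(i - 1) for i, c in enumerate(coeffs) if i > 0)

    pk = p**k
    if f(s) % pk != 0:
        raise ValueError("s is not a root of f modulo p**k")
    if df(s) % p != 0:                          # unique lift t = s + p**k * u
        u = (-(f(s) // pk) * pow(df(s) % p, -1, p)) % p
        return [(s + pk * u) % (pk * p)]
    if f(s) % (pk * p) == 0:                    # p lifts
        return [(s + pk * i) % (pk * p) for i in range(p)]
    return []                                   # no lift


print(solve_linear_congruence(6, 9, 15))        # [4, 9, 14]
print(hensel_lift([-2, 0, 1], 3, 7, 1))         # [10], since 10**2 - 2 = 98 = 2 * 49
```

Applying `hensel_lift` repeatedly, starting from the roots modulo p, gives the roots modulo any power of p as outlined in Fact 14.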

Examples: 

1. There are 3 incongruent solutions of $6x \equiv 9 \pmod{15}$ since gcd(6, 15) = 3 and $3 \mid 9$.
The solutions are those integers x with $x \equiv 4$, 9, or 14 (mod 15).

2. The linear congruence $2x \equiv 7 \pmod{6}$ has no solutions since gcd(2, 6) = 2 and $2 \nmid 7$.

3. The solutions of the linear congruence $3x \equiv 5 \pmod{11}$ are those integers x with
$x \equiv \bar{3} \cdot 5 \equiv 4 \cdot 5 \equiv 9 \pmod{11}$.

4. It follows from the Chinese remainder theorem (Fact 5) that the solutions of the
system of simultaneous congruences $x \equiv 1 \pmod{3}$, $x \equiv 2 \pmod{4}$, and $x \equiv 3 \pmod{5}$
are all integers x with $x \equiv 1 \cdot 20 \cdot 2 + 2 \cdot 15 \cdot 3 + 3 \cdot 12 \cdot 3 \equiv 58 \pmod{60}$.

5. The simultaneous congruences $x \equiv 4 \pmod{9}$ and $x \equiv 7 \pmod{15}$ can be solved
by noting that the first congruence implies that x - 4 = 9t for some integer t, so
that x = 9t + 4. Inserting this expression for x into the second congruence gives
$9t + 4 \equiv 7 \pmod{15}$. This implies that $3t \equiv 1 \pmod{5}$, so that $t \equiv 2 \pmod{5}$ and
t = 5u + 2 for some integer u. Hence x = 45u + 22 for some integer u. The solutions of
the two simultaneous congruences are those integers x with $x \equiv 22 \pmod{45}$.
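
Examples 4 and 5 can both be reproduced by merging congruences one at a time, using the substitution idea of Example 5. The sketch below uses a hypothetical function `crt` that handles moduli that are not pairwise relatively prime; it is an alternative to the explicit formula of Fact 5 rather than an implementation of that formula, and it assumes Python 3.8 or later for the modular inverse computed by `pow`.

```python
from math import gcd
from typing import Optional


def crt(residues: list[int], moduli: list[int]) -> Optional[tuple[int, int]]:
    """Solve x = a_i (mod m_i) for all i by successive substitution.

    Returns (x, M) with 0 <= x < M = lcm of the moduli, or None if the
    system has no solution.
    """
    x, m = 0, 1                                 # every integer satisfies x = 0 (mod 1)
    for a_i, m_i in zip(residues, moduli):
        g = gcd(m, m_i)
        if (a_i - x) % g != 0:
            return None                         # gcd(m, m_i) must divide a_i - x
        step = m_i // g
        # Write x_new = x + m*t and solve (m/g) * t = (a_i - x)/g (mod m_i/g).
        t = 0 if step == 1 else ((a_i - x) // g * pow(m // g, -1, step)) % step
        x += m * t
        m *= step                               # m becomes lcm(m, m_i)
        x %= m
    return x, m


print(crt([1, 2, 3], [3, 4, 5]))   # (58, 60)
print(crt([4, 7], [9, 15]))        # (22, 45)
print(crt([1, 2], [4, 6]))         # None
```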


4.4 PRIME NUMBERS 

One of the most powerful tools in number theory is the fact that each composite integer 
can be decomposed into a product of primes. Primes may be thought of as the building 
blocks of the integers in the sense that they can be decomposed only in trivial ways, for 
example, 3 = 1 × 3. Prime numbers, once of only theoretical interest, now are important
in many applications, especially in cryptography, where large primes play a
crucial role in public-key cryptosystems (see Chapter 14). From ancient to
modern times, mathematicians have devoted long hours to the study of primes and their 
properties. Even so, many questions about primes have only partially been answered 
or remain complete mysteries, including questions that ask whether there are infinitely 
many primes of certain forms. There have been many recent discoveries concerning 
prime numbers, such as the discovery of new Mersenne primes. The current state of 
knowledge on some of these questions and the latest discoveries are described in this 
section. 

Additional information about primes can be found in [CrPo99] and [Ri96] and on 
the Web. See the Prime Pages at the website 

http://www.utm.edu/research/primes/index.html#lists


4.4.1 BASIC CONCEPTS 

Definitions: 

A prime is a natural number greater than 1 that is exactly divisible only by 1 and 
itself. 

A composite is a natural number greater than 1 that is not a prime. That is, a 
composite may be factored into the product of two natural numbers both smaller than 
itself. 

Facts: 

1. The number 1 is not considered to be prime. 

2. Table 1 lists the primes up to 10,000. 

3. Fundamental theorem of arithmetic: Every natural number greater than 1 is either
prime or can be written as a product of prime factors in a unique way, up to the
order of the prime factors. That is, every composite n can be expressed uniquely as
$n = p_1 p_2 \cdots p_k$, where $p_1 \leq p_2 \leq \cdots \leq p_k$ are primes. This is sometimes also known as
the unique factorization theorem.


© 2000 by CRC Press LLC 


4. The unique factorization of a positive integer n formed by grouping together equal
prime factors produces the unique prime-power factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$.

5. Table 2 lists the prime-power factorization of all positive integers below 2,500.
Numbers appearing in boldface are prime.

Examples: 

1. 6 = 2 × 3.

2. $245 = 5 \times 7^2$.

3. $10! = 2^8 \times 3^4 \times 5^2 \times 7$.

4. $68{,}718{,}821{,}377 = (2^{17} - 1) \cdot (2^{19} - 1)$ (both factors are Mersenne primes; see §4.4.3).

5. The largest prime known is $2^{3{,}021{,}377} - 1$. It has 909,526 decimal digits and was
discovered in 1998. It is a Mersenne prime (see Table 3).
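
The prime-power factorizations in Examples 1 to 4 can be checked with simple trial division. This is a sketch under the assumption that n is small enough for trial division to be practical; the function name `prime_power_factorization` is hypothetical, and the method is not suitable for integers of cryptographic size.

```python
from math import factorial


def prime_power_factorization(n: int) -> dict[int, int]:
    """Map each prime divisor of n > 1 to its exponent, by trial division."""
    if n < 2:
        raise ValueError("n must be greater than 1")
    factors: dict[int, int] = {}
    d = 2
    while d * d <= n:
        while n % d == 0:                  # divide out every copy of d
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1 if d == 2 else 2            # after 2, try odd candidates only
    if n > 1:                              # what remains is a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_power_factorization(245))             # {5: 1, 7: 2}
print(prime_power_factorization(factorial(10)))   # {2: 8, 3: 4, 5: 2, 7: 1}
print(prime_power_factorization(68718821377))     # {131071: 1, 524287: 1}
```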


4.4.2 COUNTING PRIMES 

Definitions: 

The value of the prime counting function $\pi(x)$ at x, where x is a positive real number,
equals the number of primes less than or equal to x.

The li function is defined by $\mathrm{li}(x) = \int_0^x \frac{dt}{\log t}$ for $x \geq 2$. (The principal value is taken
for the integral at the singularity t = 1.)

Twin primes are primes that differ by exactly 2. 

Facts: 

1. Euclid (ca. 300 B.C.E.) proved that there are infinitely many primes. He observed 
that the product of a finite list of primes, plus one, must be divisible by a prime not on 
that list. 

2. Leonhard Euler (1707-1783) showed that the sum of the reciprocals of the primes
up to n tends toward infinity as n tends toward infinity, which also implies that there
are infinitely many primes. (There are many other proofs as well.)

3. There is no useful, exact formula known which will produce the nth prime, given n.
It is relatively easy to construct a useless (that is, impractical) one. For example, let
$\alpha = \sum_{n=1}^{\infty} p_n / 2^{2^n}$, where $p_n$ is the nth prime. Then the nth prime is
$\lfloor 2^{2^n} \alpha \rfloor - 2^{2^{n-1}} \lfloor 2^{2^{n-1}} \alpha \rfloor$,
where $\lfloor x \rfloor$ is the greatest integer less than or equal to x.

4. If f(x) is a polynomial with integer coefficients that is not constant, then there are
infinitely many integers n for which |f(n)| is not prime.

5. There are polynomials with integer coefficients with the property that the set of
positive values taken by each of these polynomials as the variables range over the set
of nonnegative integers is the set of prime numbers. The existence of such polynomials
has essentially no practical value for constructing primes. For example, there are
polynomials in 26 variables of degree 25, in 42 variables of degree 5, and in 12 variables of
degree 13697, with this property. (See [Ri96].)

6. $\frac{p_n}{n \log n} \to 1$ as $n \to \infty$. (This follows from the prime number theorem, Fact 10.)

7. An inexact and rough formula for the nth prime is n log n.

8. $p_n > n \log n$ for all n. (J. B. Rosser)

Table 1 Table of primes less than 10,000.

The prime number $p_{10n+k}$ is found by looking at the row beginning with n.. and at the
column beginning with ..k.

        ..0   ..1   ..2   ..3   ..4   ..5   ..6   ..7   ..8   ..9
                2     3     5     7    11    13    17    19    23
  1..    29    31    37    41    43    47    53    59    61    67
  2..    71    73    79    83    89    97   101   103   107   109
  3..   113   127   131   137   139   149   151   157   163   167
  4..   173   179   181   191   193   197   199   211   223   227
  5..   229   233   239   241   251   257   263   269   271   277
  6..   281   283   293   307   311   313   317   331   337   347
  7..   349   353   359   367   373   379   383   389   397   401
  8..   409   419   421   431   433   439   443   449   457   461
  9..   463   467   479   487   491   499   503   509   521   523
 10..   541   547   557   563   569   571   577   587   593   599
 11..   601   607   613   617   619   631   641   643   647   653
 12..   659   661   673   677   683   691   701   709   719   727
 13..   733   739   743   751   757   761   769   773   787   797
 14..   809   811   821   823   827   829   839   853   857   859
 15..   863   877   881   883   887   907   911   919   929   937
 16..   941   947   953   967   971   977   983   991   997  1009
 17..  1013  1019  1021  1031  1033  1039  1049  1051  1061  1063
 18..  1069  1087  1091  1093  1097  1103  1109  1117  1123  1129
 19..  1151  1153  1163  1171  1181  1187  1193  1201  1213  1217
 20..  1223  1229  1231  1237  1249  1259  1277  1279  1283  1289
 21..  1291  1297  1301  1303  1307  1319  1321  1327  1361  1367
 22..  1373  1381  1399  1409  1423  1427  1429  1433  1439  1447
 23..  1451  1453  1459  1471  1481  1483  1487  1489  1493  1499
 24..  1511  1523  1531  1543  1549  1553  1559  1567  1571  1579
 25..  1583  1597  1601  1607  1609  1613  1619  1621  1627  1637
 26..  1657  1663  1667  1669  1693  1697  1699  1709  1721  1723
 27..  1733  1741  1747  1753  1759  1777  1783  1787  1789  1801
 28..  1811  1823  1831  1847  1861  1867  1871  1873  1877  1879
 29..  1889  1901  1907  1913  1931  1933  1949  1951  1973  1979
 30..  1987  1993  1997  1999  2003  2011  2017  2027  2029  2039
 31..  2053  2063  2069  2081  2083  2087  2089  2099  2111  2113
 32..  2129  2131  2137  2141  2143  2153  2161  2179  2203  2207
 33..  2213  2221  2237  2239  2243  2251  2267  2269  2273  2281
 34..  2287  2293  2297  2309  2311  2333  2339  2341  2347  2351
 35..  2357  2371  2377  2381  2383  2389  2393  2399  2411  2417
 36..  2423  2437  2441  2447  2459  2467  2473  2477  2503  2521
 37..  2531  2539  2543  2549  2551  2557  2579  2591  2593  2609
 38..  2617  2621  2633  2647  2657  2659  2663  2671  2677  2683
 39..  2687  2689  2693  2699  2707  2711  2713  2719  2729  2731
 40..  2741  2749  2753  2767  2777  2789  2791  2797  2801  2803
 41..  2819  2833  2837  2843  2851  2857  2861  2879  2887  2897
 42..  2903  2909  2917  2927  2939  2953  2957  2963  2969  2971
 43..  2999  3001  3011  3019  3023  3037  3041  3049  3061  3067
 44..  3079  3083  3089  3109  3119  3121  3137  3163  3167  3169
 45..  3181  3187  3191  3203  3209  3217  3221  3229  3251  3253
 46..  3257  3259  3271  3299  3301  3307  3313  3319  3323  3329
 47..  3331  3343  3347  3359  3361  3371  3373  3389  3391  3407
 48..  3413  3433  3449  3457  3461  3463  3467  3469  3491  3499
 49..  3511  3517  3527  3529  3533  3539  3541  3547  3557  3559
 50..  3571  3581  3583  3593  3607  3613  3617  3623  3631  3637
 51..  3643  3659  3671  3673  3677  3691  3697  3701  3709  3719
 52..  3727  3733  3739  3761  3767  3769  3779  3793  3797  3803
 53..  3821  3823  3833  3847  3851  3853  3863  3877  3881  3889
 54..  3907  3911  3917  3919  3923  3929  3931  3943  3947  3967
 55..  3989  4001  4003  4007  4013  4019  4021  4027  4049  4051
 56..  4057  4073  4079  4091  4093  4099  4111  4127  4129  4133
 57..  4139  4153  4157  4159  4177  4201  4211  4217  4219  4229
 58..  4231  4241  4243  4253  4259  4261  4271  4273  4283  4289
 59..  4297  4327  4337  4339  4349  4357  4363  4373  4391  4397
 60..  4409  4421  4423  4441  4447  4451  4457  4463  4481  4483
 61..  4493  4507  4513  4517  4519  4523  4547  4549  4561  4567
 62..  4583  4591  4597  4603  4621  4637  4639  4643  4649  4651
 63..  4657  4663  4673  4679  4691  4703  4721  4723  4729  4733
 64..  4751  4759  4783  4787  4789  4793  4799  4801  4813  4817
 65..  4831  4861  4871  4877  4889  4903  4909  4919  4931  4933
 66..  4937  4943  4951  4957  4967  4969  4973  4987  4993  4999
 67..  5003  5009  5011  5021  5023  5039  5051  5059  5077  5081
 68..  5087  5099  5101  5107  5113  5119  5147  5153  5167  5171
 69..  5179  5189  5197  5209  5227  5231  5233  5237  5261  5273
 70..  5279  5281  5297  5303  5309  5323  5333  5347  5351  5381
 71..  5387  5393  5399  5407  5413  5417  5419  5431  5437  5441
 72..  5443  5449  5471  5477  5479  5483  5501  5503  5507  5519
 73..  5521  5527  5531  5557  5563  5569  5573  5581  5591  5623
 74..  5639  5641  5647  5651  5653  5657  5659  5669  5683  5689
 75..  5693  5701  5711  5717  5737  5741  5743  5749  5779  5783
 76..  5791  5801  5807  5813  5821  5827  5839  5843  5849  5851
 77..  5857  5861  5867  5869  5879  5881  5897  5903  5923  5927
 78..  5939  5953  5981  5987  6007  6011  6029  6037  6043  6047
 79..  6053  6067  6073  6079  6089  6091  6101  6113  6121  6131
 80..  6133  6143  6151  6163  6173  6197  6199  6203  6211  6217
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..0 

..1 

..2 

..3 

..4 

..5 

..6 

..7 

..8 

..9 

81 .. 

6221 

6229 

6247 

6257 

6263 

6269 

6271 

6277 

6287 

6299 

82 .. 

6301 

6311 

6317 

6323 

6329 

6337 

6343 

6353 

6359 

6361 

83 .. 

6367 

6373 

6379 

6389 

6397 

6421 

6427 

6449 

6451 

6469 

84 .. 

6473 

6481 

6491 

6521 

6529 

6547 

6551 

6553 

6563 

6569 

85 .. 

6571 

6577 

6581 

6599 

6607 

6619 

6637 

6653 

6659 

6661 

86 .. 

6673 

6679 

6689 

6691 

6701 

6703 

6709 

6719 

6733 

6737 

87 .. 

6761 

6763 

6779 

6781 

6791 

6793 

6803 

6823 

6827 

6829 

88 .. 

6833 

6841 

6857 

6863 

6869 

6871 

6883 

6899 

6907 

6911 

89 .. 

6917 

6947 

6949 

6959 

6961 

6967 

6971 

6977 

6983 

6991 

90 .. 

6997 

7001 

7013 

7019 

7027 

7039 

7043 

7057 

7069 

7079 

91 .. 

7103 

7109 

7121 

7127 

7129 

7151 

7159 

7177 

7187 

7193 

92 .. 

7207 

7211 

7213 

7219 

7229 

7237 

7243 

7247 

7253 

7283 

93 .. 

7297 

7307 

7309 

7321 

7331 

7333 

7349 

7351 

7369 

7393 

94 .. 

7411 

7417 

7433 

7451 

7457 

7459 

7477 

7481 

7487 

7489 

95 .. 

7499 

7507 

7517 

7523 

7529 

7537 

7541 

7547 

7549 

7559 

96 .. 

7561 

7573 

7577 

7583 

7589 

7591 

7603 

7607 

7621 

7639 

97 .. 

7643 

7649 

7669 

7673 

7681 

7687 

7691 

7699 

7703 

7717 

98 .. 

7723 

7727 

7741 

7753 

7757 

7759 

7789 

7793 

7817 

7823 

99 .. 

7829 

7841 

7853 

7867 

7873 

7877 

7879 

7883 

7901 

7907 

100 .. 

7919 

7927 

7933 

7937 

7949 

7951 

7963 

7993 

8009 

8011 

101 .. 

8017 

8039 

8053 

8059 

8069 

8081 

8087 

8089 

8093 

8101 

102 .. 

8111 

8117 

8123 

8147 

8161 

8167 

8171 

8179 

8191 

8209 

103 .. 

8219 

8221 

8231 

8233 

8237 

8243 

8263 

8269 

8273 

8287 

104 .. 

8291 

8293 

8297 

8311 

8317 

8329 

8353 

8363 

8369 

8377 

105 .. 

8387 

8389 

8419 

8423 

8429 

8431 

8443 

8447 

8461 

8467 

106 .. 

8501 

8513 

8521 

8527 

8537 

8539 

8543 

8563 

8573 

8581 

107 .. 

8597 

8599 

8609 

8623 

8627 

8629 

8641 

8647 

8663 

8669 

108 .. 

8677 

8681 

8689 

8693 

8699 

8707 

8713 

8719 

8731 

8737 

109 .. 

8741 

8747 

8753 

8761 

8779 

8783 

8803 

8807 

8819 

8821 

110 .. 

8831 

8837 

8839 

8849 

8861 

8863 

8867 

8887 

8893 

8923 

111 .. 

8929 

8933 

8941 

8951 

8963 

8969 

8971 

8999 

9001 

9007 

112 .. 

9011 

9013 

9029 

9041 

9043 

9049 

9059 

9067 

9091 

9103 

113 .. 

9109 

9127 

9133 

9137 

9151 

9157 

9161 

9173 

9181 

9187 

114 .. 

9199 

9203 

9209 

9221 

9227 

9239 

9241 

9257 

9277 

9281 

115 .. 

9283 

9293 

9311 

9319 

9323 

9337 

9341 

9343 

9349 

9371 

116 .. 

9377 

9391 

9397 

9403 

9413 

9419 

9421 

9431 

9433 

9437 

117 .. 

9439 

9461 

9463 

9467 

9473 

9479 

9491 

9497 

9511 

9521 

118 .. 

9533 

9539 

9547 

9551 

9587 

9601 

9613 

9619 

9623 

9629 

119 .. 

9631 

9643 

9649 

9661 

9677 

9679 

9689 

9697 

9719 

9721 

120 .. 

9733 

9739 

9743 

9749 

9767 

9769 

9781 

9787 

9791 

9803 

121 .. 

9811 

9817 

9829 

9833 

9839 

9851 

9857 

9859 

9871 

9883 

122 .. 

9887 

9901 

9907 

9923 

9929 

9931 

9941 

9949 

9967 

9973

Table 2 Prime power decompositions below 2500.



0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

0 



2 

3 

2 2 

5 

2-3 

7 

2 3 

3 2 

1 

2-5 

11 

2 2 -3 

13 

2-7 

3-5 

2 4 

17 

2-3 2 

19 

2 

2 2 -5 

3-7 

2-11 

23 

2 3 -3 

5 2 

2-13 

3 3 

2 2 -7 

29 

3 

2-3-5 

31 

2 5 

3-11 

2-17 

5-7 

2 2 -3 2 

37 

2-19 

3-13 

4 

2 3 -5 

41 

2-3-7 

43 

2 2 -ll 

3 2 -5 

2-23 

47 

2 4 -3 

7 2 

5 

2-5 2 

3-17 

2 2 -13 

53 

2-3 3 

5-11 

2 3 -7 

3-19 

2-29 

59 

6 

2 2 -3-5 

61 

2-31 

3 2 -7 

2 6 

5-13 

2-3-11 

67 

2 2 -17 

3-23 

7 

2-5-7 

71 

2 3 -3 2 

73 

2-37 

3-5 2 

2 2 -19 

7-11 

2-3-13 

79 

8 

2 4 -5 

3 4 

2-41 

83 

2 2 -3-7 

5-17 

2-43 

3-29 

2 3 -ll 

89 

9 

2-3 2 -5 

7-13 

2 2 -23 

3-31 

2-47 

5-19 

2 5 -3 

97 

2-7 2 

3 2 -ll 

10 

2 2 -5 2 

101 

2-3-17 

103 

2 3 -13 

3-5-7 

2-53 

107 

2 2 -3 3 

109 

11 

2-5-11 

3-37 

2 4 -7 

113 

2-3-19 

5-23 

2 2 -29 

3 2 -13 

2-59 

7-17 

12 

2 3 -3-5 

ll 2 

2-61 

3-41 

2 2 -31 

5 3 

2-3 2 -7 

127 

2 7 

3-43 

13 

2-5-13 

131 

2 2 -3-ll 

7-19 

2-67 

3 3 -5 

2 3 -17 

137 

2-3-23 

139 

14 

2 2 -5-7 

3-47 

2-71 

11-13 

2 4 -3 2 

5-29 

2-73 

3-7 2 

2 2 -37 

149 

15 

2-3-5 2 

151 

2 3 -19 

3 2 -17 

2-7-11 

5-31 

2 2 -3-13 

157 

2-79 

3-53 

16 

2 5 -5 

7-23 

2-3 4 

163 

2 2 -41 

3-5-11 

2-83 

167 

2 3 -3-7 

13 2 

17 

2-5-17 

3 2 -19 

2 2 -43 

173 

2-3-29 

5 2 -7 

2 4 -ll 

3-59 

2-89 

179 

18 

2 2 -3 2 -5 

181 

2-7-13 

3-61 

2 3 -23 

5-37 

2-3-31 

11-17 

2 2 -47 

3 3 -7 

19 

2-5-19 

191 

2 6 -3 

193 

2-97 

3-5-13 

2 2 -7 2 

197 

2-3 2 -ll 

199 

20 

2 3 -5 2 

3-67 

2-101 

7-29 

2 2 -3-17 

5-41 

2-103 

3 2 -23 

2 4 -13 

11-19 

21 

2-3-5-7 

211 

2 2 -53 

3-71 

2-107 

5-43 

2 3 -3 3 

7-31 

2-109 

3-73 

22 

2 2 -5-ll 

13-17 

2-3-37 

223 

2 5 -7 

3 2 -5 2 

2-113 

227 

2 2 -3-19 

229 

23 

2-5-23 

3-7-11 

2 3 -29 

233 

2-3 2 -13 

5-47 

2 2 -59 

3-79 

2-7-17 

239 

24 

2 4 -3-5 

241 

2-11 2 

3 5 

2 2 -61 

5-7 2 

2-3-41 

13-19 

2 3 -31 

3-83 

25 

2-5 3 

251 

2 2 -3 2 -7 

11-23 

2-127 

3-5-17 

2 8 

257 

2-3-43 

7-37 

26 

2 2 -5-13 

3 2 -29 

2-131 

26 3 2 3 -3-ll 

5-53 

2-7-19 

3-89 

2 2 -67 

269 

27 

2-3 3 -5 

271 

2 4 -17 

3-7-13 

2-137 

5 2 -ll 

2 2 -3-23 

277 

2-139 

3 2 -31 

28 

2 3 -5-7 

281 

2-3-47 

283 

2 2 -71 

3-5-19 

2-11-13 

7-41 

2 5 -3 2 

17 2 

29 

2-5-29 

3-97 

2 2 -73 

293 

2-3-7 2 

5-59 

2 3 -37 

3 3 -ll 

2-149 

13-23 

30 

2 2 -3-5 2 

7-43 

2-151 

3-101 

2 4 -19 

5-61 

2-3 2 -17 

307 

2 2 -7-ll 

3-103 

31 

2-5-31 

311 2 3 -3-13 

313 

2-157 

3 2 -5-7 

2 2 -79 

317 

2-3-53 

11-29 

32 

2 6 -5 

3-107 

2-7-23 

17-19 

2 2 -3 4 

5 2 -13 

2-163 

3-109 

2 3 -41 

7-47 

33 

2-3-5-11 

331 

2 2 -83 

3 2 -37 

2-167 

5-67 

2 4 -3-7 

337 

2-13 2 

3-113 

34 

2 2 -5-17 

11-31 

2-3 2 -19 

7 3 

2 3 -43 

3-5-23 

2-173 

347 

2 2 -3-29 

349 

35 

2-5 2 -7 

3 3 -13 

2 5 -ll 

353 

2-3-59 

5-71 

2 2 -89 

3-7-17 

2-179 

359 

36 

2 3 -3 2 -5 

19 2 

2-181 

3-11 2 

2 2 -7-13 

5-73 

2-3-61 

367 

2 4 -23 

3 2 -41 

37 

2-5-37 

7-53 

2 2 -3-31 

373 

2-11-17 

3-5 3 

2 3 -47 

13-29 

2-3 3 -7 

379 

38 

2 2 -5-19 

3-127 

2-191 

383 

2 7 -3 

5-7-11 

2-193 

3 2 -43 

2 2 -97 

389 

39 

2-3-5-13 

17-23 

2 3 -7 2 

3-131 

2-197 

5-79 2 2 -3 2 -ll 

397 

2-199 

3-7-19 

40 

2 4 -5 2 

401 

2-3-67 

13-31 

2 2 -101 

3 4 -5 

2-7-29 

11-37 2 3 -3-17 

409 

41 

2-5-41 

3-137 

2 2 -103 

7-59 

2-3 2 -23 

5-83 

2 5 -13 

3-139 

2-11-19 

419 

42 

2 2 -3-5-7 

421 

2-211 

3 2 -47 

2 3 -53 

5 2 -17 

2-3-71 

7-61 

2 2 -107 3-11-13 

43 

2-5-43 

431 

2 4 -3 3 

433 

2-7-31 

3-5-29 

2 2 -109 

19-23 

2-3-73 

439 

44 

2 3 -5-ll 

3 2 -7 2 

2-13-17 

443 

2 2 -3-37 

5-89 

2-223 

3-149 

2 6 -7 

449 

45 

2-3 2 -5 2 

11-41 

2 2 -113 

3-151 

2-227 

5-7-13 

2 3 -3-19 

457 

2-229 

3 3 -17 


© 2000 by CRC Press LLC 





0 

1 

2 

3 

4 

5 

6 

7 

8 

9 

46 

2 2 -5-23 

461 2-3-7-11 

463 

2 4 -29 

3-5-31 

2-233 

467 

2 2 -3 2 -13 

7-67 

47 

2-5-47 

3-157 

2 3 -59 

11-43 

2-3-79 

5 2 -19 

2 2 -7-17 

3 2 -53 

2-239 

479 

48 

2 5 -3-5 

13-37 

2-241 

3-7-23 

2 2 -ll 2 

5-97 

2-3 5 

487 

2 3 -61 

3-163 

49 

2-5-7 2 

491 

2 2 -3-41 

17-29 

2-13-19 3 2 -5-ll 

2 4 -31 

7-71 

2-3-83 

499 

50 

2 2 -5 3 

3-167 

2-251 

503 

2 3 -3 2 -7 

5-101 

2-11-23 

3-13 2 

2 2 -127 

509 

51 

2-3-5-17 

7-73 

2 9 

3 3 -19 

2-257 

5-103 

2 2 -3-43 

11-47 

2-7-37 

3-173 

52 

2 3 -5-13 

521 

2-3 2 -29 

523 

2 2 -131 

3-5 2 -7 

2-263 

17-31 

2 4 -3-ll 

23 2 

53 

2-5-53 

3 2 -59 

2 2 -7-19 

13-41 

2-3-89 

5-107 

2 3 -67 

3-179 

2-269 

7 2 -ll 

54 

2 2 -3 3 -5 

541 

2-271 

3-181 

2 5 -17 

5-109 2-3-7-13 

547 

2 2 -137 

3 2 -61 

55 

2-5 2 -ll 

19-29 

2 3 -3-23 

7-79 

2-277 

3-5-37 

2 2 -139 

557 

2-3 2 -31 

13-43 

56 

2 4 -5-7 3-11-17 

2-281 

563 

2 2 -3-47 

5-113 

2-283 

3 4 -7 

2 3 -71 

569 

57 

2-3-5-19 

571 2 2 -ll-13 

3-191 

2-7-41 

5 2 -23 

2 6 -3 2 

577 

2-17 2 

3-193 

58 

2 2 -5-29 

7-83 

2-3-97 

11-53 

2 3 -73 3 2 -5-13 

2-293 

587 

2 2 -3-7 2 

19-31 

59 

2-5-59 

3-197 

2 4 -37 

593 

2-3 3 -ll 

5-7-17 

2 2 -149 

3-199 

2-13-23 

599 

60 

2 3 -3-5 2 

601 

2-7-43 

3 2 -67 

2 2 -151 

5-11 2 

2-3-101 

607 

2 5 -19 

3-7-29 

61 

2-5-61 

13-47 2 2 -3 2 -17 

613 

2-307 

3-5-41 

2 3 -7-ll 

617 

2-3-103 

619 

62 

2 2 -5-31 

3 3 -23 

2-311 

7-89 

2 4 -3-13 

5 4 

2-3133-11-19 

2 2 -157 

17-37 

63 

2-3 2 -5-7 

631 

2 3 -79 

3-211 

2-317 

5-127 

2 2 -3-53 

7 2 -13 

2-11-29 

3 2 -71 

64 

2 7 -5 

641 

2-3-107 

643 

2 2 -7-23 

3-5-43 

2-17-19 

647 

2 3 -3 4 

11-59 

65 

2-5 2 -13 

3-7-31 

2 2 -163 

653 

2-3-109 

5-131 

2 4 -41 

3 2 -73 

2-7-47 

659 

66 

2 2 -3-5-ll 

661 

2-3313-13-17 

2 3 -83 

5-7-19 

2-3 2 -37 

23-29 

2 2 -167 

3-223 

67 

2-5-67 

11-61 

2 5 -3-7 

673 

2-337 

3 3 -5 2 

2 2 -13 2 

677 

2-3-113 

7-97 

68 

2 3 -5-17 

3-227 

2-11-31 

683 2 2 -3 2 -19 

5-137 

2-7 3 

3-229 

2 4 -43 

13-53 

69 

2-3-5-23 

691 

2 2 -173 3 2 -7-ll 

2-347 

5-139 

2 3 -3-29 

17-41 

2-349 

3-233 

70 

2 2 -5 2 -7 

701 

2-3 3 -13 

19-37 

2 6 -ll 

3-5-47 

2-353 

7-101 

2 2 -3-59 

709 

71 

2-5-71 

3 2 -79 

2 3 -89 

23-31 2-3-7-17 5-11-13 

2 2 -179 

3-239 

2-359 

719 

72 

2 4 -3 2 -5 

7-103 

2-19 2 

3-241 

2 2 -181 

5 2 -29 

2-3-11 2 

727 

2 3 -7-13 

3 6 

73 

2-5-73 

17-43 

2 2 -3-61 

733 

2-367 

3-5-7 2 

2 5 -23 

11-67 

2-3 2 -41 

739 

74 

2 2 -5-37 3-13-19 

2-7-53 

743 

2 3 -3-31

5-149 

2-373 

3 2 -83 

2 2 -ll-17 

7-107 

75 

2-3-5 3 

751 

2 4 -47 

3-251 

2-13-29 

5-151 

2 2 -3 3 -7 

757 

2-379 3-11-23 

76 

2 3 -5-19 

761 

2-3-127 

7-109 

2 2 -191 3 2 -5-17 

2-383 

13-59 

2 8 -3

769 

77 

2-5-7-11 

3-257 

2 2 -193 

773 

2-3 2 -43 

5 2 -31 

2 3 -97 

3-7-37 

2-389 

19-41 

78 

2 2 -3-5-13 

11-71 

2-17-23 

3 3 -29 

2 4 -7 2 

5-157 

2-3-131 

787 

2 2 -197 

3-263 

79 

2-5-79 

7-113 2 3 -3 2 -ll 

13-61 

2-397 

3-5-53 

2 2 -199 

797 

2-3-7-19

7-293

2 2 -3 3 -19

2053 

2-13-79 

3-5-137 

2 3 -257 11 2 -17 

2-3-7 3 

29-71 

206 

2 2 -5-103 

3 2 -229 2-1031 

2063 

2 4 -3-43 

5-7-59 

2-10333-13-53 2 2 -ll-47 

2069 

207 

2-3 2 -5-23 

19-109 2 3 -7-37 

3-691 

2-17-61 

5 2 -83 2 2 -3-173 31-67 

2-1039 3 3 -7-ll 

208 

2 5 -5-13 

2081 2-3-347 

2083 

2 2 -521 

3-5-139 

2-7-149 2087 2 3 -3 2 -29 

2089 

209 

2-5-11-19 3-17-41 2 2 -523 7-13-23 

2-3-349 

5-419 

2 4 -131 3 2 -233 

2-1049 

2099 

210 

2 2 -3-5 2 -7 

11-191 2-1051 

3-701 

2 3 -263 

5-421 

2-3 4 -13 7 2 -43 2 2 -17-31 3-19-37 

211 

2-5-211 

2111 2 6 -3-ll 

2113 

2-7-151 

3 2 -5-47 

2 2 -23 2 29-73 

2-3-353 

13-163 

212 

2 3 -5-53 3-7-101 2-1061 

11-193 2 2 -3 2 -59 

5 3 -17 

2-1063 3-709 

2 4 -7-19 

2129 

213 

2-3-5-71 

2131 2 2 -13-41 

3 3 -79 

2-11-97 

5-7-61 

2 3 -3-89 2137 

2-1069 3-23-31 

214 

2 2 -5-107 

21412-3 2 -7-17 

2143 

2 5 -673-5-ll-13 

2-29-37 19-113 2 2 -3-179 

7-307 

215 

2-5 2 -43 

3 2 -239 2 3 -269 

2153 

2-3-359 

5-431 

2 2 -7 2 -ll 3-719 

2-13-83 

17-127 

216 

2 4 -3 3 -5 

2161 2-23-47 3-7-103 

2 2 -541 

5-433 

2-3-19 2 11-197 

2 3 -271 

3 2 -241 

217 

2-5-7-31 

13-167 2 2 -3-181 

41-53 

2-1087 

3-5 2 -29 

2 7 -17 7-311 2-3 2 -ll 2 

2179 

218 

2 2 -5-109 

3-727 2-1091 

37-592 3 -3-7-13 

5-19-23 

2-1093 3 7 

2 2 -547 

11-199 

219 

2-3-5-73 

7-313 2 4 -137 3-17-43 

2-1097 

5-439 

2 2 -3 2 -61 13 3 

2-7-157 

3-733 

220 

2 3 -5 2 -ll 

31-71 2-3-367 

2203 2 2 -19-29 

3 2 -5-7 2 

2-1103 2207 

2 5 -3-23 

47 2 

221 

2-5-13-17 3-11-67 2 2 -7-79 

2213 

2-3 3 -41 

5-443 

2 3 -277 3-739 

2-1109 

7-317 

222 

2 2 -3-5-37 

2221

2-11-101

3 2 -13-19

2 4 -139 

5 2 -89 

2-3-7-53 17-131 

2 2 -557 

3-743 

223 

2-5-223 

23-97 2 3 -3 2 -31 7-11-29 

2-1117 

3-5-149 2 2 -13-43 2237 

2-3-373 

2239 

224 

2 6 -5-7 

3 3 -83 2-19-59 

22432 2 -3-ll-17 

5-449 

2-1123 3-7-107 

2 3 -281 

13-173 

225 

2-3 2 -5 3 

2251 2 2 -563 

3-751 

2-7 2 -23 

5-11-41 

2 4 -3-47 37-61 

2-1129 

3 2 -251

0

1 

2 

3 

4 5 

6 

7 

8 

9 

226 

2 2 -5-113 

7-17-19 2-3-13-29 

31-73 

2 3 -283 3-5-151 

2-11-103 

2267 

2 2 -3 4 -7 

2269 

227 

2-5-227 

3-757 

2 5 -71 

2273 

2-3-379 5 2 -7-13 

2 2 -5693 2 -ll-23 

2-17-67 

43-53 

228 

2 3 -3-5-19 

2281 

2-7-163 

3-761 

2 2 -571 5-457 

2-3 2 -127 

2287 

2 4 -ll-13 

3-7-109 

229 

2-5-229 

29-79 

2 2 -3-191 

2293 

2-31-37 3 3 -5-17 

2 3 -7-41 

2297 

2-3-383 

11 2 -19 

230 

2 2 -5 2 -23 

3-13-59 

2-1151 

7 2 -47

2 8 -3 2 5-461 

2-1153 

3-769 

2 2 -577 

2309 

231 

2-3-5-7-11 

2311 

2 3 -17 2 

3 2 -257 

2-13-89 5-463 

2 2 -3-193 

7-331 

2-19-61 

3-773 

232 

2 4 -5-29 

11-211 

2-3 3 -43 

23-101 

2 2 -7-83 3-5 2 -31 

2-1163 

13-179 

2 3 -3-97 

17-137 

233 

2-5-233 

3 2 -7-37 

2 2 -ll-53 

2333 

2-3-389 5-467 

2 5 -73 

3-19-41 

2-7-167 

2339 

234 

2 2 -3 2 -5-13 

2341 

2-1171 

3-11-71 

2 3 -293 5-7-672-3-17-23 

2347 

2 2 -587 

3 4 -29 

235 

2-5 2 -47 

2351 

2 4 -3-7 2 

13-181 

2-11-107 3-5-157 

2 2 -19-31 

2357 

2-3 2 -131 

7-337 

236 

2 3 -5-59 

3-787 

2-1181 

17-139 

2 2 -3-197 5-11-43 

2-7-13 2 

3 2 -263 

2 6 -37 

23-103 

237 

2-3-5-79 

2371 

2 2 -593 

3-7-113 

2-1187 5 3 -19 

2 3 -3 3 -ll 

2377 

2-29-41 

3-13-61 

238 

2 2 -5-7-17 

2381 

2-3-397 

2383 

2 4 -149 3 2 -5-53 

2-1193 

7-11-31 

2 2 -3-199 

2389 

239 

2-5-239 

3-797 

2 3 -13-23 

2393 

2-3 2 -7-19 5-479 

2 2 -599 

3-17-47 

2-11-109 

2399 

240 

2 5 -3-5 2 

7 4 

2-1201 

3 3 -89 

2 2 -601 5-13-37 

2-3-401 

29-83 

2 3 -7-43 

3-11-73 

241 

2-5-241 

2411 

2 2 -3 2 -67 

19-127 

2-17-71 3-5-7-23 

2 4 -151 

24172-3-13-31 

41-59 

242 

2 2 -5-ll 2 

3 2 -269 

2-7-173 

2423 

2 3 -3-101 5 2 -97 

2-1213 

3-809 

2 2 -607 

7-347 

243 

2-3 5 -5 

11-13-17 

2 7 -19 

3-811 

2-1217 5-487 

2 2 -3-7-29 

2437 

2-23-53 

3 2 -271 

244 

2 3 -5-61 

24412-3-11-37 

7-349 

2 2 -13-47 3-5-163 

2-1223 

2447 

2 4 -3 2 -17 

31-79 

245 

2-5 2 -7 2 

3-19-43 

2 2 -613 

11-223 

2-3-409 5-491 

2 3 -307 

3 3 -7-13 

2-1229 

2459 

246 

2 2 -3-5-41 

23-107 

2-1231 

3-821 

2 5 -7-ll 5-17-29 

2-3 2 -137 

2467 

2 2 -617 

3-823 

247 

2-5-13-19 

7-353 2 3 -3-103 

2473 

2-12373 2 -5 2 -ll 

2 2 -619 

2477 

2-3-7-59 

37-67 

248 

2 4 -5-31 

3-827 

2-17-73 

13-191 

2 2 -3 3 -23 5-7-71 

2-11-113 

3-829 

2 3 -311 

19-131 

249 

2-3-5-83 

47-53 

2 2 -7-89 

3 2 -277 

2-29-43 5-499 

2 6 -3-13 

11-227 

2-1249 

3-7 2 -17 


Algorithm 1: Sieve of Eratosthenes.

make a list of the numbers from 2 to N
i := 1

while i < √N
begin

    i := i + 1

    if i is not already crossed out then cross out all proper multiples of i that
    are less than or equal to N

end {The numbers not crossed out comprise the primes up to N}
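
Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than a tuned implementation; the function name `sieve_of_eratosthenes` and the list name `crossed_out` are chosen here for the example and do not come from the text. One small liberty is taken: crossing out starts at i*i, since smaller proper multiples of i have already been crossed out by a smaller prime.

```python
from math import isqrt


def sieve_of_eratosthenes(n: int) -> list[int]:
    """Return all primes p with 2 <= p <= n (hypothetical helper name)."""
    if n < 2:
        return []
    # crossed_out[k] is True once k is known to be a proper multiple of a prime.
    crossed_out = [False] * (n + 1)
    for i in range(2, isqrt(n) + 1):
        if not crossed_out[i]:
            # Proper multiples of i below i*i were crossed out earlier,
            # so it is enough to start at i*i.
            for multiple in range(i * i, n + 1, i):
                crossed_out[multiple] = True
    return [k for k in range(2, n + 1) if not crossed_out[k]]


if __name__ == "__main__":
    print(sieve_of_eratosthenes(30))
```

Calling `sieve_of_eratosthenes(16)` gives the six primes 2, 3, 5, 7, 11, 13.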


9. The sieve of Eratosthenes: Eratosthenes (3rd century B.C.E.) developed Algorithm 1
for listing all prime numbers less than a fixed bound.

10. Prime number theorem: $\pi(x)$, when divided by $x/\log x$, tends to 1 as $x$ tends to
infinity. That is, $\pi(x)$ is asymptotic to $x/\log x$ as $x \to \infty$.

11. The prime number theorem was first conjectured by Carl Friedrich Gauss (1777-1855)
in 1792, and was first proved in 1896 independently by Charles de la Vallée Poussin
(1866-1962) and Jacques Hadamard (1865-1963). They proved it in the stronger form
$|\pi(x) - \mathrm{li}(x)| < c_1 x e^{-c_2\sqrt{\log x}}$, where $\mathrm{li}(x)$ is the
logarithmic integral and $c_1$ and $c_2$ are positive constants. Their proofs
used functions of a complex variable. The first elementary proofs (not using complex
variables) of the prime number theorem were supplied in 1949 by Paul Erdős (1913-1996)
and Atle Selberg.

12. Integration by parts shows that $\mathrm{li}(x)$ is asymptotic to $x/\log x$ as $x \to \infty$.

13. $|\pi(x) - \mathrm{li}(x)| < c_3 x e^{-c_4 (\log x)^{3/5} (\log\log x)^{-1/5}}$ for certain
positive constants $c_3$ and $c_4$. (I. M. Vinogradov and Nikolai Korobov, 1958.)

14. If the Riemann hypothesis (Open Problem 1) is true, $|\pi(x) - \mathrm{li}(x)|$ is bounded by
$c\sqrt{x}\log x$ for some positive constant $c$.

15. J. E. Littlewood (1885-1977) showed that $\pi(x) - \mathrm{li}(x)$ changes sign infinitely often.
However, no explicit number $x$ with $\pi(x) - \mathrm{li}(x) > 0$ is known. Carter Bays and Richard
H. Hudson have shown that such a number $x$ exists below $1.4 \times 10^{316}$.

16. The largest exactly computed value of $\pi(x)$ is $\pi(10^{20})$. This value, computed by
M. Deléglise in 1996, is about $2.23 \times 10^{8}$ below $\mathrm{li}(10^{20})$. (See the following table.)


 n                 $\pi(10^n)$   $\approx \pi(10^n) - \mathrm{li}(10^n)$

 1                           4             -2
 2                          25             -5
 3                         168            -10
 4                       1,229            -17
 5                       9,592            -38
 6                      78,498           -130
 7                     664,579           -339
 8                   5,761,455           -754
 9                  50,847,534         -1,701
10                 455,052,511         -3,104
11               4,118,054,813        -11,588
12              37,607,912,018        -38,263
13             346,065,536,839       -108,971
14           3,204,941,750,802       -314,890
15          29,844,570,422,669     -1,052,619
16         279,238,341,033,925     -3,214,632
17       2,623,557,157,654,233     -7,956,589
18      24,739,954,287,740,860    -21,949,555
19     234,057,667,276,344,607    -99,877,775
20   2,220,819,602,560,918,840   -223,744,644
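
The first few rows of the table of prime counts can be reproduced with a short sieve-based count. The sketch below is illustrative only: `prime_count` is a name invented for this example, the li column is not recomputed, and the printed ratio of the count to x/log x is shown merely to make the prime number theorem (Fact 10) visible numerically.

```python
from math import isqrt, log


def prime_count(x: int) -> int:
    """Count the primes not exceeding x with a byte-array sieve (illustrative)."""
    if x < 2:
        return 0
    is_prime = bytearray([1]) * (x + 1)
    is_prime[0] = is_prime[1] = 0
    for i in range(2, isqrt(x) + 1):
        if is_prime[i]:
            # Zero out i*i, i*i + i, ... in a single slice assignment.
            is_prime[i * i :: i] = bytearray(len(range(i * i, x + 1, i)))
    return sum(is_prime)


if __name__ == "__main__":
    for n in range(1, 7):
        x = 10 ** n
        count = prime_count(x)
        # The ratio drifts slowly toward 1 as x grows.
        print(n, count, round(count / (x / log(x)), 4))
```

Larger rows of the table are far beyond what a direct sieve can reach; they were obtained with specialized counting methods.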


17. Dirichlet’s theorem on primes in arithmetic progressions: Given coprime integers
$a$, $b$ with $b$ positive, there are infinitely many primes $p \equiv a \pmod{b}$. G. L. Dirichlet
proved this in 1837.

18. The number of primes $p$ less than $x$ such that $p \equiv a \pmod{b}$ is asymptotic to
$\frac{1}{\phi(b)}\pi(x)$ as $x \to \infty$, if $a$ and $b$ are coprime and $b$ is positive. ($\phi$ is the
Euler phi-function; see §4.6.2.)
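
The equidistribution of primes among residue classes can be observed numerically. The following sketch counts primes up to a bound in each residue class coprime to the modulus; the names `primes_up_to` and `residue_class_counts` and the bound 100,000 are choices made for this illustration, not part of the text.

```python
from math import gcd, isqrt


def primes_up_to(n: int) -> list[int]:
    """Return the primes not exceeding n (simple sieve, illustrative)."""
    if n < 2:
        return []
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for i in range(2, isqrt(n) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [k for k in range(n + 1) if flags[k]]


def residue_class_counts(x: int, b: int) -> dict[int, int]:
    """Count primes p <= x with p = a (mod b), for each a coprime to b."""
    counts = {a: 0 for a in range(b) if gcd(a, b) == 1}
    for p in primes_up_to(x):
        r = p % b
        if r in counts:  # primes dividing b fall in no coprime class
            counts[r] += 1
    return counts


if __name__ == "__main__":
    # With b = 10 the coprime classes are the last digits 1, 3, 7, 9,
    # and phi(10) = 4, so each class should hold about a quarter of the primes.
    print(residue_class_counts(100_000, 10))
```

For b = 10 the four counts come out close to one another, as the asymptotic formula in Fact 18 predicts.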

Open Problems: 

1. Riemann hypothesis: The Riemann hypothesis (RH), posed in 1859 by Bernhard
Riemann (1826-1866), is a conjecture about the location of zeros of the Riemann zeta
function, the function of the complex variable $s$ defined by the series
$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ when the real part of $s$ is $> 1$, and defined by the formula

$$\zeta(s) = \frac{s}{s-1} - s\int_1^{\infty} (x - \lfloor x \rfloor)\, x^{-s-1}\, dx$$

in the larger region when the real part of $s$ is $> 0$, except for the single point $s = 1$,
where it remains undefined. The Riemann hypothesis asserts that all of the solutions to
$\zeta(s) = 0$ in this larger region lie on the vertical line in the complex number plane with
real part $\frac{1}{2}$. Its proof would imply a better error estimate for the prime number
theorem. While believed to be true, it has not been proved.

2. Extended Riemann hypothesis: There is a generalized form of the Riemann hypothesis
known as the extended Riemann hypothesis (ERH) or the generalized Riemann
hypothesis (GRH), which also has important consequences in number theory.
(For example, see §4.4.4.)

3. Hypothesis H: The hypothesis H of Andrzej Schinzel and Wacław Sierpiński (1882-1969)
asserts that for every collection of irreducible nonconstant polynomials $f_1(x), \ldots, f_k(x)$
with integral coefficients and positive leading coefficients, if there is no fixed integer
greater than 1 dividing the product $f_1(m) \cdots f_k(m)$ for all integers $m$, then there are
infinitely many integers $m$ such that each of the numbers $f_1(m), \ldots, f_k(m)$ is prime. The
case when each of the polynomials is linear was previously conjectured by L. E. Dickson,
and is known as the prime k-tuples conjecture. The only case of Hypothesis H that
has been proved is the case of a single linear polynomial; this is Dirichlet’s theorem
(Fact 17). The case of the two linear polynomials $x$ and $x + 2$ corresponds to the twin
prime conjecture (Open Problem 4). Among many consequences of hypothesis H is the
assertion that there are infinitely many primes of the form $m^2 + 1$.

4. Twin primes: It has been conjectured that there are infinitely many twin primes, 
that is, pairs of primes that differ by 2. 

5. Let $d_n$ denote the difference between the $(n+1)$st prime and the $n$th prime. The
sequence $d_n$ is unbounded. The prime number theorem implies that on average $d_n$ is
about $\log n$. The twin prime conjecture asks whether $d_n$ is 2 infinitely often.

6. The best result known that shows that $d_n$ has relatively small values infinitely
often, proved by Helmut Maier in 1988, is that $d_n < c \log n$ infinitely often, where $c$ is
a constant slightly smaller than $\frac{1}{4}$.

7. It is conjectured that $d_n$ can be as big as $\log^2 n$ infinitely often, but not much
bigger. Roger Baker and Glyn Harman have recently shown that $d_n < n^{.535}$ for all large
numbers $n$. In the other direction, Erdős and Robert Rankin have shown that
$d_n > c \log n\, (\log\log n)(\log\log\log\log n)/(\log\log\log n)^2$ infinitely often. Several improvements
have been made on the constant $c$, but this ungainly expression has stubbornly resisted
improvement.

8. Christian Goldbach (1690-1764) conjectured that every integer greater than 5 is the
sum of three primes.

9. Goldbach conjecture: Every even integer greater than 2 is a sum of two primes. 
(This is equivalent to the conjecture Goldbach made in Open Problem 8.) 

• Matti Sinisalo, in 1993, verified the Goldbach conjecture up to $4 \times 10^{11}$. It
has since been verified up to $1.615 \times 10^{12}$ by J. M. Deshouillers, G. Effinger,
H. J. J. te Riele, and D. Zinoviev.

• In 1937 Vinogradov proved that every sufficiently large odd number is the sum
of three primes. In 1989 J. R. Chen and T. Z. Wang showed that this is true for
every odd number greater than $10^{43,001}$. In 1998 Y. Saouter showed that this is
true for every odd number below $10^{20}$. Zinoviev showed in 1996 that it is true
for the remaining odd numbers between $10^{20}$ and $10^{43,001}$ under the assumption
of the ERH (Open Problem 2).

• In 1966 J. R. Chen proved that every sufficiently large even number is either the 

sum of two primes or the sum of a prime and a number that is the product of 
two primes. 
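
A brute-force check of the Goldbach conjecture over a small range illustrates what the verifications cited above do on a vastly larger scale. This is only a sketch: the names `prime_flags` and `goldbach_pair` and the bound 10,000 are assumptions made for the example, and the large-scale verifications relied on much more efficient techniques.

```python
from math import isqrt


def prime_flags(limit: int) -> bytearray:
    """flags[k] == 1 exactly when k is prime, for 0 <= k <= limit."""
    flags = bytearray([1]) * (limit + 1)
    flags[0:2] = b"\x00\x00"
    for i in range(2, isqrt(limit) + 1):
        if flags[i]:
            flags[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return flags


def goldbach_pair(n: int, flags: bytearray):
    """Return (p, n - p) with both prime and p as small as possible, or None."""
    for p in range(2, n // 2 + 1):
        if flags[p] and flags[n - p]:
            return (p, n - p)
    return None


LIMIT = 10_000  # illustrative bound; must be at least 4
FLAGS = prime_flags(LIMIT)
# Even numbers in [4, LIMIT] with no representation as a sum of two primes.
COUNTEREXAMPLES = [n for n in range(4, LIMIT + 1, 2)
                   if goldbach_pair(n, FLAGS) is None]

if __name__ == "__main__":
    print(goldbach_pair(100, FLAGS), COUNTEREXAMPLES)
```

The list of counterexamples is empty for this range, consistent with the conjecture.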
Examples:

1. A method for showing that there are infinitely many primes is to note that the
integer $n! + 1$ must have a prime factor greater than $n$, so there is no largest prime. Note
that $n! + 1$ is prime for $n$ = 1, 2, 3, 11, 27, 37, 41, 73, 77, 116, 154, 320, 340, 399, and 427,
but is composite for all numbers less than 427 not listed.

2. Let $Q(p)$ ($p$ a prime) equal one more than the product of the primes not exceeding
$p$. For example $Q(5) = 2 \cdot 3 \cdot 5 + 1 = 31$. Then $Q(p)$ is prime for $p$ =
2, 3, 5, 7, 11, 31, 379, 1019, 1021, 2657, 3229, 4547, 4787, 11549, 13649; it is composite for
all $p < 11213$ not in this list. For example, $Q(13) = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13 + 1$ is composite.

3. There are six primes not exceeding 16, namely 2, 3, 5, 7, 11, and 13. Hence $\pi(16) = 6$.

4. The expression $n^2 + 1$ is prime for $n$ = 1, 2, 4, 6, 10, ..., but it is unknown whether
there are infinitely many primes of this form when $n$ is an integer. (See Open Problem 3.)

5. The polynomial $f(n) = n^2 + n + 41$ takes on prime values for $n$ = 0, 1, 2, ..., 39, but
$f(40) = 1681 = 41^2$.

6. Applying Dirichlet’s theorem with $a = 123$ and $b = 1,000$, there are infinitely many
primes that end in the digits 123. The first such prime is 1,123.

7. The pairs 17, 19 and 191, 193 are twin primes. The largest known twin primes have
11,755 decimal digits. They are $361,700,055 \times 2^{39,020} \pm 1$ and were found in 1999 by
Henri Lifchitz.
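
Several of the small examples above can be checked directly by trial division. The sketch below is a verification aid only; the helper `is_prime` and the variable names are invented for this illustration.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def f(n: int) -> int:
    """The polynomial n^2 + n + 41 from Example 5."""
    return n * n + n + 41


# Example 4: values of n up to 10 for which n^2 + 1 is prime.
SQUARE_PLUS_ONE = [n for n in range(1, 11) if is_prime(n * n + 1)]

# Example 6: the first prime congruent to 123 modulo 1000.
FIRST_ENDING_123 = next(k for k in range(123, 10 ** 6, 1000) if is_prime(k))

# Example 7: twin prime pairs (p, p + 2) below 200.
TWIN_PAIRS = [(p, p + 2) for p in range(2, 199)
              if is_prime(p) and is_prime(p + 2)]

if __name__ == "__main__":
    print(all(is_prime(f(n)) for n in range(40)), f(40))
    print(SQUARE_PLUS_ONE, FIRST_ENDING_123)
    print((17, 19) in TWIN_PAIRS, (191, 193) in TWIN_PAIRS)
```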


4.4.3 NUMBERS OF SPECIAL FORM 

Numbers of the form $b^n \pm 1$, for $b$ a small number, are often easier to factor or test for
primality than other numbers of the same size. They also have a colorful history.

Definitions:

A Cunningham number is a number of the form $b^n \pm 1$, where $b$ and $n$ are natural
numbers, and $b$ is “small”: 2, 3, 5, 6, 7, 10, 11, or 12. They are named after Allan
Cunningham, who, along with H. J. Woodall, published in 1925 a table of factorizations
of many of these numbers.

A Fermat number $F_m$ is a Cunningham number of the form $2^{2^m} + 1$. (See Table 4.)
A Fermat prime is a Fermat number that is prime.

A Mersenne number $M_n$ is a Cunningham number of the form $2^n - 1$.

A Mersenne prime is a Mersenne number that is prime. (See Table 3.)

The cyclotomic polynomials $\Phi_k(x)$ are defined recursively by the equation
$x^n - 1 = \prod_{d \mid n} \Phi_d(x)$.

A perfect number is a positive integer that is equal to the sum of all its proper 
divisors. 

Facts: 

1. If $M_n$ is prime, then $n$ is prime, but the converse is not true.

2. If $b > 2$ or $n$ is composite, then a nontrivial factorization of $b^n - 1$ is given by
$b^n - 1 = \prod_{d \mid n} \Phi_d(b)$, though the factors $\Phi_d(b)$ are not necessarily primes.

3. The number $b^n + 1$ can be factored as the product of $\Phi_d(b)$, where $d$ runs over the
divisors of $2n$ that are not divisors of $n$. When $n$ is not a power of 2 and $b > 2$, this
factorization is nontrivial.

Algorithm 2: Lucas-Lehmer test.

p := an odd prime; u := 4; i := 0

while i < p - 2
begin

    i := i + 1

    u := (u^2 - 2) mod (2^p - 1)

end

{if u = 0 then 2^p - 1 is prime, else 2^p - 1 is composite}
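
Algorithm 2 translates nearly line for line into Python, whose built-in integers handle the large moduli directly. The sketch below is illustrative; `lucas_lehmer` and `is_small_prime` are names chosen for this example, and serious Mersenne searches use fast multiplication techniques that are not shown here.

```python
def lucas_lehmer(p: int) -> bool:
    """Return True if 2^p - 1 is prime. The exponent p must be an odd prime."""
    mersenne = (1 << p) - 1
    u = 4
    for _ in range(p - 2):
        u = (u * u - 2) % mersenne
    return u == 0


def is_small_prime(n: int) -> bool:
    """Trial division, used only to select prime exponents."""
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))


if __name__ == "__main__":
    # Odd prime exponents up to 130 that yield Mersenne primes.
    print([p for p in range(3, 131) if is_small_prime(p) and lucas_lehmer(p)])
```

The exponent 2 is excluded because the test as stated requires an odd prime; the Mersenne number 3 is handled separately.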


4. Some numbers of the form $b^n \pm 1$ also have so-called Aurifeuillian factorizations,
named after A. Aurifeuille. For more details, see [BrEtal88].

5. The only primes of the form $b^n - 1$ (with $n > 1$) are Mersenne primes.

6. The only primes of the form $2^n + 1$ are Fermat primes.

7. Fermat numbers are named after Pierre de Fermat (1601-1695), who observed that
$F_0$, $F_1$, $F_2$, $F_3$, and $F_4$ are prime and stated (incorrectly) that all such numbers are
prime. Euler proved this was false, by showing that $F_5 = 2^{32} + 1 = 641 \times 6,700,417$.

8. $F_4$ is the largest known Fermat prime. It is conjectured that all larger Fermat
numbers are composite.

9. The smallest Fermat number that has not yet been completely factored is
$F_{12} = 2^{2^{12}} + 1$, which has a 1187-digit composite factor.

10. In 1994 it was shown that $F_{22}$ is composite. There are 141 values of $n > 22$ where
a (relatively) small prime factor of $F_n$ is known. In none of these cases do we know
whether the remaining factor of $F_n$ is prime or composite. Currently, $F_{24}$ is the smallest
Fermat number that has not been proved prime or shown to be composite. For up-to-date
information about the factorization of Fermat numbers (maintained by Wilfrid
Keller) consult http://vamri.xray.ufl.edu/proths/fermat.html#Prime.

11. Pepin’s criterion: For $m \geq 1$, $F_m$ is prime if and only if
$3^{(F_m - 1)/2} \equiv -1 \pmod{F_m}$.
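
Pepin’s criterion is easy to try on the first few Fermat numbers using modular exponentiation. The following is a small sketch, with `fermat_number` and `pepin_test` being names introduced for this example; it is practical only for small m, since the Fermat numbers double in length at each step.

```python
def fermat_number(m: int) -> int:
    """Return F_m = 2^(2^m) + 1."""
    return (1 << (1 << m)) + 1


def pepin_test(m: int) -> bool:
    """Pepin's criterion for m >= 1: F_m is prime iff 3^((F_m-1)/2) = -1 (mod F_m)."""
    f = fermat_number(m)
    # -1 modulo f is represented by the residue f - 1.
    return pow(3, (f - 1) // 2, f) == f - 1


if __name__ == "__main__":
    print([m for m in range(1, 11) if pepin_test(m)])
```

The test reports F_1 through F_4 as prime and F_5 through F_10 as composite, in agreement with Table 4; it gives no information about the factors.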

12. For $m \geq 2$, every factor of $F_m$ is of the form $2^{m+2}k + 1$.

13. Mersenne numbers are named after Marin Mersenne (1588-1648), who made a list
of what he thought were all the Mersenne primes $M_p$ with $p \leq 257$. His list consisted of
the primes $p$ = 2, 3, 5, 7, 13, 17, 19, 31, 67, 127, and 257. However, it was later shown
that $M_{67}$ and $M_{257}$ are composite, while $M_{61}$, $M_{89}$, and $M_{107}$, missing from the list,
are prime.

14. It is not known whether there are infinitely many Mersenne primes, nor whether
infinitely many Mersenne numbers with prime exponent are composite, though it is
conjectured that both are true.

15. Euclid showed that the product of a Mersenne prime $2^p - 1$ with $2^{p-1}$ is perfect.
Euler showed that every even perfect number is of this form. It is not known whether
any odd perfect numbers exist. There are none below $10^{300}$, a result of R. P. Brent,
G. L. Cohen and H. J. J. te Riele in 1991.
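
The Euclid construction of even perfect numbers can be checked for small exponents. This sketch uses trial division throughout, so it is suitable only for small cases; the function names are invented for the illustration.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def sum_proper_divisors(n: int) -> int:
    """Sum of the positive divisors of n other than n itself."""
    if n == 1:
        return 0
    total = 1
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            total += d
            if d != n // d:  # avoid counting a square root twice
                total += n // d
    return total


def even_perfect_numbers(max_exponent: int) -> list[int]:
    """Numbers 2^(p-1) * (2^p - 1) with 2^p - 1 prime and p <= max_exponent."""
    return [2 ** (p - 1) * (2 ** p - 1)
            for p in range(2, max_exponent + 1)
            if is_prime(p) and is_prime(2 ** p - 1)]


if __name__ == "__main__":
    for n in even_perfect_numbers(13):
        print(n, sum_proper_divisors(n) == n)
```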

16. The Lucas-Lehmer test can be used to determine whether a given Mersenne number 
is prime or composite. (See Algorithm 2.) 

17. Table 3 lists all known Mersenne primes. The largest known Mersenne prime is
$2^{6,972,593} - 1$. When a new Mersenne prime is found by computer, there may be other
numbers of the form $M_p$ less than this prime not yet checked for primality. It can take
months, or even years, to do this checking. A new Mersenne prime may even be found
this way, as was the case for the 29th.

18 . George Woltman launched the Great Internet Mersenne Prime Search (GIMPS) in 
1996. GIMPS provides free software for PCs. GIMPS has played a role in discovering the 
last four Mersenne primes. Thousands of people participate in GIMPS over PrimeNet, 
a virtual supercomputer of distributed PCs, together running more than 0.7 Teraflops, 
the equivalent of more than a dozen of the fastest supercomputers, in the quest for 
Mersenne primes. Consult the GIMPS website at http://www.mersenne.org and the 
PrimeNet site at http://entropia.com/ips/ for more information about this quest 
and how to join it. 

19. As of 1999, the two smallest composite Mersenne numbers not completely factored
were $2^{617} - 1$ and $2^{619} - 1$.

20. The best reference for the history of the factorization of Cunningham numbers is
[BrEtal88].

21. The current version of the Cunningham table, maintained by Sam Wagstaff, can
be found at http://www.cs.purdue.edu/homes/ssw/cun/index.html

22. In Table 4, $p_k$ indicates a $k$-digit prime, and $c_k$ indicates a $k$-digit composite. All
other numbers in the right column have been proved prime.

Examples: 

1. The Mersenne number $M_{11} = 2^{11} - 1$ is not prime since $M_{11} = 23 \cdot 89$.

2. To factor $342 = 7^3 - 1$ note that $7^3 - 1 = (7 - 1)(7^2 + 7 + 1) = 6 \times 57$.

3. To factor $3^7 + 1$ note that $3^7 + 1 = \Phi_2(3)\Phi_{14}(3) = 4 \times 547$.

4. An example of an Aurifeuillian factorization is given by
$2^{4k-2} + 1 = (2^{2k-1} - 2^k + 1) \cdot (2^{2k-1} + 2^k + 1)$.

5. $\Phi_1(x) = x - 1$ and $x^3 - 1 = \Phi_1(x)\Phi_3(x)$, so $\Phi_3(x) = (x^3 - 1)/\Phi_1(x) = x^2 + x + 1$.
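
The recursive definition of the cyclotomic polynomials gives a direct way to evaluate them at an integer and to reproduce the factorizations in the examples above. The sketch below works with integer values rather than polynomials and assumes b is at least 2, so that no divisor is zero; the function names are chosen for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def cyclotomic_value(n: int, b: int) -> int:
    """Value of the n-th cyclotomic polynomial at the integer b >= 2,
    obtained from b^n - 1 = product of Phi_d(b) over the divisors d of n."""
    value = b ** n - 1
    for d in range(1, n):
        if n % d == 0:
            value //= cyclotomic_value(d, b)  # exact division
    return value


def factors_of_b_n_minus_1(b: int, n: int) -> dict[int, int]:
    """Map each divisor d of n to Phi_d(b); the values multiply to b^n - 1."""
    return {d: cyclotomic_value(d, b) for d in range(1, n + 1) if n % d == 0}


def factors_of_b_n_plus_1(b: int, n: int) -> dict[int, int]:
    """Map each d dividing 2n but not n to Phi_d(b); the values multiply to b^n + 1."""
    return {d: cyclotomic_value(d, b)
            for d in range(1, 2 * n + 1)
            if (2 * n) % d == 0 and n % d != 0}


if __name__ == "__main__":
    print(factors_of_b_n_minus_1(7, 3))  # factors of 7^3 - 1 = 342
    print(factors_of_b_n_plus_1(3, 7))   # factors of 3^7 + 1 = 2188
```

As the facts note, the values obtained this way need not be prime: here 57 = 3 · 19.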


4.4.4 PSEUDOPRIMES AND PRIMALITY TESTING 
Definitions: 

A pseudoprime to the base $b$ is a composite number $n$ such that $b^n \equiv b \pmod{n}$.

A pseudoprime is a pseudoprime to the base 2.

A Carmichael number is a pseudoprime to all bases.

A strong pseudoprime to the base $b$ is an odd composite number $n = 2^s d + 1$, with $d$
odd, and either $b^d \equiv 1 \pmod{n}$ or $b^{2^r d} \equiv -1 \pmod{n}$ for some integer $r$, $0 \leq r < s$.

A witness for an odd composite number $n$ is a base $b$, with $1 < b < n$, to which $n$ is
not a strong pseudoprime. Thus, $b$ is a “witness” to $n$ being composite.

A primality proof is an irrefutable verification that an integer is prime. 

Facts: 

1. By Fermat’s little theorem (§4.3.3), $b^{p-1} \equiv 1 \pmod{p}$ for all primes $p$ and all
integers $b$ that are not multiples of $p$. Thus, the only numbers $n > 1$ with
$b^{n-1} \equiv 1 \pmod{n}$ are primes and pseudoprimes to the base $b$ (which are coprime to $b$). Similarly,
the numbers $n$ which satisfy the strong pseudoprime congruence conditions are the odd
primes not dividing $b$ and the strong pseudoprimes to the base $b$.
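
The pseudoprime and strong pseudoprime conditions defined above can be tested with modular exponentiation. The following sketch searches for the smallest composite numbers passing each test to the base 2; the function names and the search bound are assumptions of this example, and `is_prime` is plain trial division used only to recognize composites.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def passes_fermat_test(n: int, b: int) -> bool:
    """True when b^n = b (mod n)."""
    return pow(b, n, n) == b % n


def passes_strong_test(n: int, b: int) -> bool:
    """True when odd n > 2 satisfies the strong pseudoprime congruences
    to the base b: writing n - 1 = 2^s * d with d odd, either b^d = 1 or
    b^(2^r * d) = -1 (mod n) for some 0 <= r < s."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    x = pow(b, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = x * x % n
        if x == n - 1:
            return True
    return False


# Smallest composite numbers passing each test to the base 2 (bound is arbitrary).
SMALLEST_PSEUDOPRIME = next(n for n in range(3, 5000)
                            if not is_prime(n) and passes_fermat_test(n, 2))
SMALLEST_STRONG_PSEUDOPRIME = next(n for n in range(3, 5000, 2)
                                   if not is_prime(n) and passes_strong_test(n, 2))

if __name__ == "__main__":
    print(SMALLEST_PSEUDOPRIME, SMALLEST_STRONG_PSEUDOPRIME)
    # 2 is a witness for 341: the strong test to the base 2 fails.
    print(passes_strong_test(341, 2))
```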

2. The smallest pseudoprime is 341.

Table 3 Mersenne primes.

 n    exponent     decimal   year            discoverer(s) (computer used)
                    digits   discovered

 1           2           1   ancient times
 2           3           1   ancient times
 3           5           2   ancient times
 4           7           3   ancient times
 5          13           4   1461            anonymous
 6          17           6   1588            Cataldi
 7          19           6   1588            Cataldi
 8          31          10   1750            Euler
 9          61          19   1883            Pervushin
10          89          27   1911            Powers
11         107          33   1913            Fauquembergue
12         127          39   1876            Lucas
13         521         157   1952            Robinson (SWAC)
14         607         183   1952            Robinson (SWAC)
15       1,279         386   1952            Robinson (SWAC)
16       2,203         664   1952            Robinson (SWAC)
17       2,281         687   1952            Robinson (SWAC)
18       3,217         969   1957            Riesel (BESK)
19       4,253       1,281   1961            Hurwitz (IBM 7090)
20       4,423       1,332   1961            Hurwitz (IBM 7090)
21       9,689       2,917   1963            Gillies (ILLIAC 2)
22       9,941       2,993   1963            Gillies (ILLIAC 2)
23      11,213       3,376   1963            Gillies (ILLIAC 2)
24      19,937       6,002   1971            Tuckerman (IBM 360/91)
25      21,701       6,533   1978            Noll and Nickel (Cyber 174)
26      23,209       6,987   1979            Noll (Cyber 174)
27      44,497      13,395   1979            Nelson and Slowinski (Cray 1)
28      86,243      25,962   1982            Slowinski (Cray 1)
29     110,503      33,265   1988            Colquitt and Welsh (NEC SX-2)
30     132,049      39,751   1983            Slowinski (Cray X-MP)
31     216,091      65,050   1985            Slowinski (Cray X-MP)
32     756,839     227,832   1992            Slowinski and Gage (Cray 2)
33     859,433     258,716   1994            Slowinski and Gage (Cray 2)
34   1,257,787     378,632   1996            Slowinski and Gage (Cray T94)
35   1,398,269     420,921   1996            Armengaud, Woltman, and team (90 MHz Pentium)
36   2,976,221     895,932   1997            Spence, Woltman, and others (100 MHz Pentium)
37   3,021,377     909,526   1998            Clarkson, Woltman, Kurowski, and others (200 MHz Pentium)
38   6,972,593   2,098,960   1999            Hajratwala, Woltman, and Kurowski (350 MHz Pentium)

3. There are infinitely many pseudoprimes; however, Paul Erdős has proved that
pseudoprimes are rare compared to primes. The same results are true for pseudoprimes
to any fixed base b. (See [Ri96] or [CrPo99] for details.)
Table 4 Fermat numbers.


 m   known factorization of F_m

 0   3
 1   5
 2   17
 3   257
 4   65,537
 5   641 × p_7
 6   274,177 × p_14
 7   59,649,589,127,497,217 × p_22
 8   1,238,926,361,552,897 × p_62
 9   2,424,833 × 7,455,602,825,647,884,208,337,395,736,200,454,918,783,366,342,657
       × p_99
10   45,592,577 × 6,487,031,809
       × 4,659,775,785,220,018,543,264,560,743,076,778,192,897 × p_252
11   319,489 × 974,849 × 167,988,556,341,760,475,137
       × 3,560,841,906,445,833,920,513 × p_564
12   114,689 × 26,017,793 × 63,766,529 × 190,274,191,361
       × 1,256,132,134,125,569 × c_1187
13   2,710,954,639,361 × 2,663,848,877,152,141,313
       × 3,603,109,844,542,291,969 × 319,546,020,820,551,643,220,672,513 × c_2391
14   c_4933
15   1,214,251,009 × 2,327,042,503,868,417 × c_9840
16   825,753,601 × c_19,720
17   31,065,037,602,817 × c_39,444
18   13,631,489 × c_78,906
19   70,525,124,609 × 646,730,219,521 × c_157,804
20   c_315,653
21   4,485,296,422,913 × c_631,294
22   c_1,262,611


4. In 1910, Robert D. Carmichael gave the first examples of Carmichael numbers. The
first 16 Carmichael numbers are

   561 = 3 · 11 · 17          1,105 = 5 · 13 · 17           1,729 = 7 · 13 · 19
 2,465 = 5 · 17 · 29          2,821 = 7 · 13 · 31           6,601 = 7 · 23 · 41
 8,911 = 7 · 19 · 67         10,585 = 5 · 29 · 73          15,841 = 7 · 31 · 73
29,341 = 13 · 37 · 61        41,041 = 7 · 11 · 13 · 41     46,657 = 13 · 37 · 97
52,633 = 7 · 73 · 103        62,745 = 3 · 5 · 47 · 89      63,973 = 7 · 13 · 19 · 37
75,361 = 11 · 13 · 17 · 31


5. If n is a Carmichael number, then n is the product of at least three distinct odd
primes with the property that if q is one of these primes, then q − 1 divides n − 1.

6. There are only finitely many Carmichael numbers that are the product of exactly r
primes with the first r − 2 primes specified.

7. If m is a positive integer such that 6m + 1, 12m + 1, and 18m + 1 are all primes,
then (6m + 1)(12m + 1)(18m + 1) is a Carmichael number.

8. In 1994, W. R. Alford (born 1937), Andrew Granville (born 1962), and Carl Pomer- 
ance (born 1944) showed that there are infinitely many Carmichael numbers. 
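
Facts 5 and 7 can be checked numerically. The sketch below is illustrative only: it assumes Korselt's criterion (n is a Carmichael number exactly when n is composite, squarefree, and q − 1 divides n − 1 for every prime q dividing n), which is the standard converse of Fact 5 and is not stated above. The helper names `prime_factors`, `is_carmichael`, and `chernick` are hypothetical.

```python
def prime_factors(n):
    """Return {prime: exponent} for n >= 2, found by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_carmichael(n):
    """Korselt's criterion (assumed): composite, squarefree, q - 1 | n - 1 for all primes q | n."""
    if n < 3:
        return False
    factors = prime_factors(n)
    if len(factors) < 2 or any(e > 1 for e in factors.values()):
        return False  # a prime, a prime power, or not squarefree
    return all((n - 1) % (q - 1) == 0 for q in factors)


def chernick(m):
    """Return (6m+1)(12m+1)(18m+1) if all three factors are prime (Fact 7), else None."""
    parts = (6 * m + 1, 12 * m + 1, 18 * m + 1)
    if all(prime_factors(q) == {q: 1} for q in parts):
        return parts[0] * parts[1] * parts[2]
    return None
```

For example, `chernick(1)` gives 7 · 13 · 19 = 1,729, the third Carmichael number in the list above.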
Algorithm 3: Strong probable prime test (to a random base).

input: positive integers n, d, s, with d odd and n = 2^s d + 1
b := a random integer such that 1 < b < n
c := b^d mod n

if c = 1 or c = n − 1, then declare n a probable prime and stop
compute sequentially c^2 mod n, c^4 mod n, ..., c^(2^(s−1)) mod n
if one of these is n − 1, then declare n a probable prime and stop
else declare n composite and stop
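
The steps of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the function names, the default of 20 rounds, and the handling of small and even inputs are assumptions added here. The values d and s are computed from n rather than supplied as input.

```python
import random


def strong_probable_prime(n, b):
    """One run of Algorithm 3 for odd n > 2 and base b; False means n is composite."""
    # Write n - 1 = 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    c = pow(b, d, n)
    if c == 1 or c == n - 1:
        return True
    # Examine c^2, c^4, ..., c^(2^(s-1)) modulo n.
    for _ in range(s - 1):
        c = c * c % n
        if c == n - 1:
            return True
    return False


def is_probable_prime(n, rounds=20, rng=random):
    """Repeat the test with independently chosen random bases (rounds is an assumed default)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    return all(strong_probable_prime(n, rng.randrange(2, n - 1)) for _ in range(rounds))
```

The built-in three-argument `pow` performs the modular exponentiation, so the test stays fast even for inputs with hundreds of digits.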


9. There are infinitely many numbers that are simultaneously strong pseudoprimes to
each base in any given finite set. Each odd composite n, however, can be a strong
pseudoprime to at most one-fourth of the bases b with 1 < b < n − 1.

10. J. L. Selfridge (born 1927) suggested Algorithm 3 (often referred to as the
Miller-Rabin test).

11. A “probable prime” is not necessarily a prime, but the chances are good. The
probability that an odd composite is not declared composite by Algorithm 3 is at most 1/4,
so the probability it passes k independent iterations is at most $4^{-k}$. Suppose this test is
applied to random odd inputs n with the hope of finding a prime. That is, random odd
numbers n (chosen between two consecutive powers of 2) are tested until one is found
that passes each of k independent iterations of the test. Ronald Burthe showed in 1995
that the probability that the output of this procedure is composite is less than $4^{-k}$.

12. Gary Miller proved in 1976 that if the extended Riemann hypothesis (§4.4.2) is
true, then every odd composite n has a witness less than $c \log^2 n$, for some constant c.
Eric Bach showed in 1985 that one may take c = 2. Therefore, if an odd number n > 1
passes the strong probable prime test for every base b less than $2 \log^2 n$, and if the
extended Riemann hypothesis is true, then n is prime.

13. In practice, one can test whether numbers under $2.5 \times 10^{10}$ are prime by a small
number of strong probable prime tests. Pomerance, Selfridge, and Samuel Wagstaff have
verified (1980) that there are no numbers less than this bound that are simultaneously
strong pseudoprimes to the bases 2, 3, 5, 7, and 11. Thus, any number less than
$2.5 \times 10^{10}$ that passes those strong pseudoprime tests is a prime.

14. Gerhard Jaeschke showed in 1993 that the test described in Fact 13 works almost
100 times beyond $2.5 \times 10^{10}$; the first number for which it fails is 2,152,302,898,747.

15. Only primes pass the strong pseudoprime tests to all the bases 2, 3, 5, 7, 11, 13,
and 17 until the composite number 341,550,071,728,321 is reached.
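
Facts 13 and 14 turn the strong probable prime test into a deterministic primality test for inputs below 2,152,302,898,747. A hedged sketch follows; the names `spsp`, `LIMIT`, and `is_prime_below_limit` are illustrative, and the initial division by the five bases is an added convenience so that the bases themselves and their multiples are handled directly.

```python
LIMIT = 2_152_302_898_747  # first failure of the bases 2, 3, 5, 7, 11 (Fact 14)
BASES = (2, 3, 5, 7, 11)


def spsp(n, b):
    """True if odd n > 2 passes the strong probable prime test to base b."""
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    c = pow(b, d, n)
    if c == 1 or c == n - 1:
        return True
    for _ in range(s - 1):
        c = c * c % n
        if c == n - 1:
            return True
    return False


def is_prime_below_limit(n):
    """Deterministic primality test, valid only for n < LIMIT."""
    if n >= LIMIT:
        raise ValueError("the bases 2, 3, 5, 7, 11 are only known to suffice below LIMIT")
    if n < 2:
        return False
    for p in BASES:
        if n % p == 0:
            return n == p
    return all(spsp(n, b) for b in BASES)
```

The composite 2,152,302,898,747 = 6763 · 10627 · 29947 itself passes all five tests, which is why the function refuses inputs at or above that value.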

16 . While pseudoprimality tests are usually quite efficient at recognizing composites, 
the task of proving that a number is prime can be more difficult. 

17. In 1983, Leonard Adleman, Carl Pomerance, and Robert Rumely developed the
APR algorithm, which can prove that a number n is prime in time proportional to
$(\log n)^{c \log\log\log n}$, where c is a positive constant. See [Co93] and [CrPo99] for details.

18. Recently, Oliver Atkin and François Morain developed an algorithm to prove
primality. It is difficult to predict in advance how long it will take, but in practice it has
been fast. One advantage of their algorithm is that, unlike APR, it produces a
polynomial time primality proof, though the running time to find the proof may be a bit
longer. An implementation called ECPP (elliptic curve primality proving) is available
via ftp from

ftp.inria.fr
Algorithm 1: Trial division.

input: an integer n > 2

output: j (smallest prime factor of n) or statement that n is prime
j := 2

while j ≤ √n
begin

   if j | n then print that j is a prime factor of n and stop {n is not prime}
   j := j + 1

end

if no factor is found then declare n prime


19. In 1986, Adleman and Ming-Deh A. Huang showed that there is a test for primality
that can be executed in random polynomial time. The test, however, is not practical.

20. In 1987, Carl Pomerance showed that every prime p has a primality proof whose
verification involves just $c \log p$ multiplications with integers the size of p. It may be
difficult, however, to find such a short primality proof.

21. In 1995, Sergei Konyagin and Carl Pomerance gave a deterministic polynomial time
algorithm which, for each fixed $\epsilon > 0$ and all sufficiently large x, succeeds in proving
prime at least $x^{1-\epsilon}$ prime inputs below x. The degree of the polynomial in the time
bound depends on the choice of $\epsilon$.


4.5 FACTORIZATION 

Determining the prime factorization of positive integers is a question that has been stud- 
ied for many years. Furthermore, in the past two decades, this question has become 
relevant for an extremely important application, the security of public key cryptosys- 
tems. The question of exactly how to decompose a composite number into the product 
of its prime factors is a difficult one that continues to be the subject of much research. 


4.5.1 FACTORIZATION ALGORITHMS 

Definition: 

A smooth number is an integer all of whose prime divisors are small. 

Facts: 

1. The simplest algorithm for factoring an integer is trial division, Algorithm 1. While 
simple, this algorithm is useful only for numbers that have a fairly small prime factor. 
It can be modified so that after j = 3, the number j is incremented by 2, and there are 
other improvements of this kind. 
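
As an illustration of Fact 1, here is a minimal Python sketch of trial division with the modification mentioned above (after 2, only odd trial divisors are used). The function name and the convention of returning n itself when n is prime are choices made for this sketch.

```python
def smallest_prime_factor(n):
    """Return the least prime factor of n >= 2; the result equals n exactly when n is prime."""
    if n < 2:
        raise ValueError("n must be at least 2")
    if n % 2 == 0:
        return 2
    j = 3
    while j * j <= n:  # equivalent to j <= sqrt(n), without floating point
        if n % j == 0:
            return j
        j += 2  # skip even trial divisors
    return n
```

Comparing `j * j` with n avoids rounding problems that a floating-point square root could introduce for large n.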

2. Currently, the fastest algorithm for numbers that are feasible to factor but do not
have a small prime factor is the quadratic sieve (QS), Algorithm 2, invented by Carl
Pomerance in 1981. (For numbers at the far range of feasibility, the number field sieve
is faster; see Fact 9.)
Algorithm 2: Quadratic sieve.

input: n (an odd composite number that is not a power)
output: g (a nontrivial factor of n)

find a_1, ..., a_k such that each a_i^2 − n is smooth

find a subset of the numbers a_i^2 − n whose product is a square, say x^2
reduce x modulo n

y := the product of the a_i used to form the square
reduce y modulo n

{This gives a congruence x^2 ≡ y^2 (mod n); equivalently n | (x^2 − y^2).}
g := gcd(x − y, n)

if g is not a nontrivial factor then find new x and y (if necessary, find more a_i)


3. The greatest common divisor calculation may be quickly done via the Euclidean
algorithm. If $x \not\equiv \pm y \pmod{n}$, then g will be a nontrivial factor of n. (Among all
solutions to the congruence $x^2 \equiv y^2 \pmod{n}$ with xy coprime to n, at least half of
them lead to a nontrivial factorization of n.) Finding the $a_i$'s is at the heart of the
algorithm and is accomplished using a sieve not unlike the sieve of Eratosthenes, but
applied to the consecutive values of the quadratic polynomial $a^2 - n$. If a is chosen near
$\sqrt{n}$, then $a^2 - n$ will be relatively small, and thus more likely to be smooth. So one
sieves the polynomial $a^2 - n$, where a runs over integers near $\sqrt{n}$, for values that are
smooth. When enough smooth values are collected, the subset with product a square
may be found via a linear algebra subroutine applied to a matrix formed out of the
exponents in the prime factorizations of the smooth values. The linear algebra may be
done modulo 2.
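
The congruence-of-squares idea behind Algorithm 2 can be demonstrated on toy inputs. The sketch below is not the quadratic sieve proper: trial division over a tiny fixed factor base stands in for the sieve, and the factor base, the search span, and all function names are assumptions chosen for illustration. The linear algebra modulo 2 is done with bit masks.

```python
from math import gcd, isqrt


def exponents_over_base(value, base):
    """Exponent vector of value over the factor base, or None if value is not smooth."""
    exps = []
    for p in base:
        e = 0
        while value % p == 0:
            value //= p
            e += 1
        exps.append(e)
    return exps if value == 1 else None


def toy_quadratic_sieve(n, base=(2, 3, 5, 7, 11, 13), span=200):
    """Search a near sqrt(n) for smooth a*a - n, then combine them into x^2 = y^2 (mod n)."""
    root = isqrt(n)
    relations = []  # pairs (a, exponent vector of a*a - n)
    pivots = {}     # lowest set bit -> (parity mask, mask of relations combined)
    for a in range(root + 1, root + 1 + span):
        exps = exponents_over_base(a * a - n, base)
        if exps is None:
            continue
        relations.append((a, exps))
        parity = sum((e & 1) << j for j, e in enumerate(exps))
        combo = 1 << (len(relations) - 1)
        # Gaussian elimination modulo 2 against the rows stored so far.
        while parity:
            low = parity & -parity
            if low not in pivots:
                pivots[low] = (parity, combo)
                break
            parity ^= pivots[low][0]
            combo ^= pivots[low][1]
        else:
            # All exponents in the combined product are even: a square has been found.
            total = [0] * len(base)
            y = 1
            for i, (ai, ei) in enumerate(relations):
                if combo >> i & 1:
                    y = y * ai % n
                    total = [t + e for t, e in zip(total, ei)]
            x = 1
            for p, t in zip(base, total):
                x = x * pow(p, t // 2, n) % n
            g = gcd(x - y, n)
            if 1 < g < n:
                return g
    return None
```

For n = 1649, the values 41² − n = 32 and 43² − n = 200 are smooth and their product is 80², so x = 80, y = 41 · 43 mod 1649 = 114, and gcd(80 − 114, 1649) = 17.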

4. The current formulation of QS involves many improvements, the most notable of 
them the multiple polynomial variation of James Davis and Peter Montgomery. 

5. In 1994, QS was used to factor a 129-digit composite that was the product of a 
64-digit prime and a 65-digit prime. This number had been proposed as a challenge to 
those who would try to crack the famous RSA cryptosystem. 

6. In 1985, Hendrik W. Lenstra, Jr. (born 1949) invented the elliptic curve method
(ECM), which has the advantage that, like trial division, the running time is based on
the size of the smallest prime factor. Thus, it can be used to find comparatively small
factors of numbers whose size would be prohibitively large for the quadratic sieve. It can
be best understood by first examining the p − 1 method of John Pollard, Algorithm 3.

7. The Pollard algorithm (Algorithm 3) is successful and efficient if p − 1 happens to
be smooth for some prime p | n. If the prime factors p of n have the property that p − 1
is not smooth, Algorithm 3 will eventually be successful if a high enough bound B is
chosen, but in this case it will not be any more efficient than trial division, Algorithm 1.
ECM gets around this restriction on the numbers that can be efficiently factored by
randomly searching through various mathematical objects called elliptic curve groups,
each of which has $p + 1 - a$ elements, where $|a| < 2\sqrt{p}$ and a depends on the curve. ECM
is successful when a group is encountered such that $p + 1 - a$ is a smooth number.

8. As of 1998, prime factors as large as 49 digits have been found using ECM. (After 
such a factor is discovered it may turn out that the remaining part of the number is a 
prime and the factorization is now complete. This last prime may be very large, as with 
the tenth and eleventh Fermat numbers — see Table 4. In such cases the success of 
ECM is measured by the second largest prime factor in the prime factorization, though 
in some sense the method has discovered the largest prime factor as well.) 
Algorithm 3: p − 1 factorization method.

input: n (composite number), B (a bound)
output: a nontrivial factor of n

b := 2

{loop on b}

if b | n then stop {b is a prime factor of n}

M := 1

while M ≤ B
begin

   g := gcd(b^lcm(1,2,...,M) − 1, n)

   if n > g > 1 then output g and stop {g is a nontrivial factor of n}
   else if g = n then choose first prime larger than b and go to beginning of
      the b-loop
   else M := M + 1

end
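
A Python sketch of one pass of the p − 1 method for a single base b is given below. It is illustrative only: the function name is hypothetical, and instead of looping over successive prime bases it returns None when g = n or when the bound is exhausted, leaving the retry with the next prime base to the caller. The power b^lcm(1,...,M) is updated incrementally rather than recomputed.

```python
from math import gcd


def pollard_p_minus_1(n, bound, b=2):
    """Return a nontrivial factor of n found with base b and bound B = bound, or None."""
    if n % b == 0:
        return b
    power = b % n    # invariant: power = b^lcm(1, ..., M) mod n
    lcm_so_far = 1   # lcm(1, ..., M)
    for m in range(1, bound + 1):
        step = m // gcd(lcm_so_far, m)  # lcm(1..m) = lcm(1..m-1) * step
        lcm_so_far *= step
        power = pow(power, step, n)
        g = gcd(power - 1, n)
        if 1 < g < n:
            return g
        if g == n:
            return None  # the caller should retry with the next prime base
    return None
```

For n = 299 = 13 · 23 the factor 13 appears at M = 4, because 13 − 1 = 12 divides lcm(1, 2, 3, 4) = 12 while 23 − 1 = 22 does not.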


9 . The number field sieve (NFS), originally suggested by Pollard for numbers of special 
form, and developed for general composite numbers by Joseph Buhler, Lenstra, and 
Pomerance, is currently the fastest factoring algorithm for very large numbers with no 
small prime factors. 

10 . The number field sieve is similar to QS in that one attempts to assemble two 
squares x 2 and y 2 whose difference is a multiple of n, and this is done via a sieve and 
linear algebra modulo 2. However, NFS is much more complicated than QS. Although 
faster for very large numbers, the complexity of the method makes it unsuitable for 
numbers much smaller than 100 digits. The exact crossover with QS depends a great 
deal on the implementations and the hardware employed. The two are roughly within 
an order of magnitude of each other for numbers between 100 and 150 digits, with QS 
having the edge at the lower end and NFS the edge at the upper end. 

11. Part of the NFS algorithm requires expressing a small multiple of the number to be
factored by a polynomial of moderate degree. The running time depends, in part, on the
size of the coefficients of this polynomial. For Cunningham numbers, this polynomial can
be easy to find. (For example, in the notation of §4.4.2, $8F_9 = 8(2^{512} + 1) = f(2^{103})$, where
$f(x) = x^5 + 8$.) This version is called the special number field sieve (SNFS). The version
for general numbers, the general number field sieve (GNFS), has somewhat greater
complexity. The greatest success of SNFS has been the factorization of a 180-digit
Cunningham number, while the greatest success of GNFS has been the factorization of
a 130-digit number of no special form and with no small prime factor.
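
The polynomial representation quoted in Fact 11 is easy to confirm with exact integer arithmetic, since (2^103)^5 = 2^515 = 8 · 2^512. A short check (the function names are illustrative):

```python
def fermat_number(m):
    """F_m = 2^(2^m) + 1."""
    return 2 ** (2 ** m) + 1


def f(x):
    """The SNFS polynomial for F_9 quoted in Fact 11."""
    return x ** 5 + 8


# 8 * F_9 = 8 * (2^512 + 1) should equal f(2^103) = 2^515 + 8.
identity_holds = 8 * fermat_number(9) == f(2 ** 103)
```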

12 . See [Co93], [CrPo99], [Po90], and [Po94] for fuller descriptions of the factoring 
algorithms described here, as well as others, including the continued fraction (CFRAC) 
method. Until the advent of QS, this had been the fastest known practical algorithm. 

13. The factorization algorithms QS, ECM, SNFS, and GNFS are fast in practice, but
analyses of their running times depend on heuristic arguments and unproved hypotheses.
The fastest algorithm whose running time has been rigorously analyzed is the class group
relations method (CGRM). It, however, is not practical. It is a probabilistic algorithm
whose expected running time is bounded by $e^{c\sqrt{\log n \log\log n}}$, where c tends to 1 as n
tends to infinity through the odd composite numbers that are not powers. This result
was proved in 1992 by Lenstra and Pomerance.
Table 1 Comparison of various factoring methods.


algorithm        year introduced   greatest success    running time            rigorously analyzed

trial division   antiquity         -                   √n                      yes
CFRAC            1970              63-digit number                             no
p − 1            1974              32-digit factor     -                       yes
QS               1981              129-digit number    L(1/2, 1)               no
ECM              1985              47-digit factor                             no
SNFS             1988              180-digit number                            no
CGRM             1992              -                   L(1/2, 1)               yes
GNFS             1993              130-digit number    L(1/3, (64/9)^(1/3))    no


14. These algorithms are summarized in Table 1. L(a, b) means that the running time
to factor n is bounded by $e^{c(\log n)^a(\log\log n)^{1-a}}$, where c tends to b as n tends to infinity
through the odd composite non-powers. Running times are measured in the number of
arithmetic steps with integers at most the size of n.
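
To get a feel for the L(a, b) notation, the exponent $b(\log n)^a(\log\log n)^{1-a}$ can be evaluated numerically. The sketch below makes a crude simplifying assumption: it replaces c by its limit b, which ignores the lower-order behavior that matters in practice, so the numbers are only indicative. The function name is hypothetical.

```python
from math import log


def log_L(digits, a, b):
    """Natural log of the L(a, b) bound for a number with the given count of decimal digits,
    taking c = b exactly (an assumption; c only tends to b)."""
    ln_n = digits * log(10)
    return b * ln_n ** a * log(ln_n) ** (1 - a)


QS = (1 / 2, 1.0)
GNFS = (1 / 3, (64 / 9) ** (1 / 3))

# Compare the two exponents at 100 and 150 digits.
qs_100, gnfs_100 = log_L(100, *QS), log_L(100, *GNFS)
qs_150, gnfs_150 = log_L(150, *QS), log_L(150, *GNFS)
```

Under this simplification the QS exponent is the smaller one at 100 digits and the GNFS exponent is the smaller one at 150 digits, in line with the crossover range described in Fact 10.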

15. The running time for Trial Division in Table 1 is a worst case estimate, achieved
when n is prime or the product of two primes of the same magnitude. When n is
composite, Trial Division will discover the least prime factor p of n in roughly p steps.
The record for the largest prime factor discovered via Trial Division is not known, nor is
the largest number proved prime by this method, though the feat of Euler of proving that
the Mersenne number $2^{31} - 1$ is prime, using only Trial Division and hand calculations,
should certainly be noted. (Euler surely knew, though, that any prime factor of $2^{31} - 1$
is 1 mod 31, so only 1 out of every 31 trial divisors needed to be tested.)

16. The running time of the p − 1 method is about B, where B is the least number
such that for some prime factor p of n, p − 1 divides lcm(1, 2, ..., B).

17. There are variants of CFRAC and GNFS that have smaller heuristic complexity
estimates, but the ones in the table above are for the fastest practical version.

18. The running time bound for ECM is a worst case estimate. It is more appropriate
to measure ECM as a function of the least prime factor p of n. This heuristic complexity
bound is $e^{c\sqrt{\log p \log\log p}}$, where c tends to $\sqrt{2}$ as p tends to infinity.

19 . Table 2 was compiled with the assistance of Samuel Wagstaff. It should be re- 
marked that there is no firm definition of a “hard number” . What is meant here is that 
the number was factored by an algorithm that is not sensitive to any particular form 
the number may have, nor sensitive to the size of the prime factors. 

20 . It is unknown whether there is a polynomial time factorization algorithm. Whether 
there are any factorization algorithms that surpass the quadratic sieve, the elliptic curve 
method, and the number field sieve in their respective regions of superiority is an area 
of much current research. 

21. A cooperative effort to factor large numbers called NFSNet has been set up. It
can be found on the Internet at

http://www.dataplex.net/NFSNet
Table 2 Largest hard number factored as a function of time.


year   method   digits

1970   CFRAC      39
1979   CFRAC      46
1982   CFRAC      54
1983   QS         67
1986   QS         87
1988   QS        102
1990   QS        116
1994   QS        129
1995   GNFS      130


22. A subjective measurement of progress in factorization can be gained by looking at
the “ten most wanted numbers” to be factored. The list is maintained by Sam Wagstaff
and can be found at http://www.cs.purdue.edu/homes/ssw/cun/index.html. As of
May 1999, “number one” on this list is $2^{617} - 1$.


4.6 ARITHMETIC FUNCTIONS 


Functions whose domains are the set of positive integers play an important role in 
number theory. Such functions are called arithmetic functions and are the subject of 
this section. The information presented here includes definitions and properties of many 
important arithmetic functions, asymptotic estimates on the growth of these functions, 
and algebraic properties of sets of certain arithmetic functions. For more information 
on the topics covered in this section see [Ap76]. 


4.6.1 MULTIPLICATIVE AND ADDITIVE FUNCTIONS 
Definitions: 

An arithmetic function is a function that is defined for all positive integers. 

An arithmetic function f is multiplicative if f(mn) = f(m)f(n) whenever m and n are
relatively prime positive integers.

An arithmetic function f is completely multiplicative if f(mn) = f(m)f(n) for all
positive integers m and n.

If f is an arithmetic function, then $\sum_{d|n} f(d)$, the value of the summatory function
of f at n, is the sum of f(d) over all positive integers d that divide n.

An arithmetic function f is additive if f(mn) = f(m) + f(n) whenever m and n are
relatively prime positive integers.

An arithmetic function f is completely additive if f(mn) = f(m) + f(n) whenever m
and n are positive integers.
Facts:

1. If f is a multiplicative function and $n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}$ is the prime-power
factorization of n, then $f(n) = f(p_1^{a_1}) f(p_2^{a_2}) \cdots f(p_s^{a_s})$.

2. If f is multiplicative, then f(1) = 1.

3. If f is a completely multiplicative function and $n = p_1^{a_1} p_2^{a_2} \cdots p_s^{a_s}$, then
$f(n) = f(p_1)^{a_1} f(p_2)^{a_2} \cdots f(p_s)^{a_s}$.

4. If f is multiplicative, then the arithmetic function $F(n) = \sum_{d|n} f(d)$ is
multiplicative.

5. If f is an additive function, then f(1) = 0.

6. If f is an additive function and a is a positive real number, then $F(n) = a^{f(n)}$ is
multiplicative.

7. If f is a completely additive function and a is a positive real number, then
$F(n) = a^{f(n)}$ is completely multiplicative.

Examples: 

1. The function $f(n) = n^2$ is multiplicative. Even more, it is completely multiplicative.

2. The function $I(n) = \lfloor 1/n \rfloor$ (so that I(1) = 1 and I(n) = 0 if n is a positive integer
greater than 1) is completely multiplicative.

3. The Euler phi-function, the number of divisors function, the sum of divisors function,
and the Möbius function are all multiplicative. None of these functions is completely
multiplicative.
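
The definitions can be tested by brute force on small arguments. The sketch below is only a finite check, not a proof: passing it for arguments below `limit` (an arbitrary cutoff chosen here) does not establish the property in general, although a failure does refute it. All names are illustrative.

```python
from math import gcd


def is_multiplicative(f, limit=30):
    """Check f(1) = 1 and f(mn) = f(m) f(n) for coprime m, n below limit."""
    return f(1) == 1 and all(
        f(m * n) == f(m) * f(n)
        for m in range(1, limit)
        for n in range(1, limit)
        if gcd(m, n) == 1
    )


def is_completely_multiplicative(f, limit=30):
    """Check f(mn) = f(m) f(n) for all m, n below limit."""
    return all(
        f(m * n) == f(m) * f(n)
        for m in range(1, limit)
        for n in range(1, limit)
    )


def square(n):
    return n * n


def identity_I(n):
    """I(n) = floor(1/n)."""
    return 1 // n


def tau(n):
    """Number of positive divisors of n, counted directly."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)
```

With these helpers, `square` and `identity_I` pass both checks, while `tau` passes only the first: for instance, τ(4) = 3 but τ(2)τ(2) = 4.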


4.6.2 EULER'S PHI-FUNCTION
Definition:

If n is a positive integer then $\phi(n)$, the value of the Euler phi-function at n, is the
number of positive integers not exceeding n that are relatively prime to n. The Euler
phi-function is also known as the totient function.

Facts:

1. The Euler $\phi$ function is multiplicative, but not completely multiplicative.

2. If p is a prime, then $\phi(p) = p - 1$.

3. If p is a positive integer with $\phi(p) = p - 1$, then p is prime.

4. If p is a prime and a is a positive integer, then $\phi(p^a) = p^a - p^{a-1}$.

5. If n is a positive integer with prime-power factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, then
$\phi(n) = n \prod_{j=1}^{k} \left(1 - \frac{1}{p_j}\right)$.

6. If n is a positive integer greater than 2, then $\phi(n)$ is even.

7. If n has r distinct odd prime factors, then $2^r$ divides $\phi(n)$.

8. If m and n are positive integers and gcd(m, n) = d, then $\phi(mn) = \phi(m)\phi(n)\,\frac{d}{\phi(d)}$.

9. If m and n are positive integers and m | n, then $\phi(m) \mid \phi(n)$.

10. If n is a positive integer, then $\sum_{d|n} \phi(d) = \sum_{d|n} \phi(n/d) = n$.

11. If n is a positive integer with n > 5, then $\phi(n) > \frac{n}{6 \log\log n}$.

12. $\sum_{k=1}^{n} \phi(k) = \frac{3n^2}{\pi^2} + O(n \log n)$.

13. $\sum_{k=1}^{n} \frac{\phi(k)}{k} = \frac{6n}{\pi^2} + O(\log n)$.
Examples:

1. Table 1 includes the value of $\phi(n)$ for $1 \le n \le 1000$.

2. To see that $\phi(10) = 4$, note that the positive integers not exceeding 10 relatively
prime to 10 are 1, 3, 7, and 9.

3. To find $\phi(720)$, note that $\phi(720) = \phi(2^4 \cdot 3^2 \cdot 5) = 720(1 - \frac{1}{2})(1 - \frac{1}{3})(1 - \frac{1}{5}) = 192$.
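
The product formula used in Example 3 translates directly into code. The following Python sketch (the function name `phi` and the use of trial division to find the prime factors are choices made here) multiplies n by (1 − 1/p) for each prime p dividing n, using integer arithmetic throughout.

```python
def phi(n):
    """Euler phi-function via phi(n) = n * prod(1 - 1/p) over the primes p dividing n."""
    result = n
    remaining = n
    p = 2
    while p * p <= remaining:
        if remaining % p == 0:
            while remaining % p == 0:
                remaining //= p
            result -= result // p  # multiply result by (1 - 1/p), exactly
        p += 1
    if remaining > 1:              # one prime factor larger than sqrt(n) may be left
        result -= result // remaining
    return result
```

The subtraction `result -= result // p` is exact because p still divides `result` at that point.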


4.6.3 SUM AND NUMBER OF DIVISORS FUNCTIONS 
Definitions: 

If n is a positive integer, then $\sigma(n)$, the value of the sum of divisors function at n,
is the sum of the positive integer divisors of n.

A positive integer n is perfect if and only if it equals the sum of its proper divisors (or
equivalently, if $\sigma(n) = 2n$).

A positive integer n is abundant if the sum of the proper divisors of n exceeds n (or
equivalently, if $\sigma(n) > 2n$).

A positive integer n is deficient if the sum of the proper divisors of n is less than n (or
equivalently, if $\sigma(n) < 2n$).

The positive integers m and n are amicable if $\sigma(m) = \sigma(n) = m + n$.

If n is a positive integer, then $\tau(n)$, the value of the number of divisors function
at n, is the number of positive integer divisors of n.

Facts: 

1. The number of divisors function is multiplicative, but not completely multiplicative.

2. The number of divisors function is the summatory function of f(n) = 1; that is,
$\tau(n) = \sum_{d|n} 1$.

3. The sum of divisors function is multiplicative, but not completely multiplicative.

4. The sum of divisors function is the summatory function of f(n) = n; that is,
$\sigma(n) = \sum_{d|n} d$.

5. If n is a positive integer with prime-power factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, then
$\sigma(n) = \prod_{j=1}^{k} (p_j^{a_j+1} - 1)/(p_j - 1)$.

6. If n is a positive integer with prime-power factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}$, then
$\tau(n) = \prod_{j=1}^{k} (a_j + 1)$.

7. If n is a positive integer, then $\tau(n)$ is odd if and only if n is a perfect square.

8. If k is an integer greater than 1, then the equation $\tau(n) = k$ has infinitely many
solutions.

9. If n is a positive integer, then $\left(\sum_{d|n} \tau(d)\right)^2 = \sum_{d|n} \tau(d)^3$.

10. A positive integer n is an even perfect number if and only if $n = 2^{m-1}(2^m - 1)$
where m is an integer, $m \ge 2$, and $2^m - 1$ is prime (so that it is a Mersenne prime
(§4.4.3)). Hence, the number of known even perfect numbers equals the number of
known Mersenne primes.

11. It is unknown whether there are any odd perfect numbers. However, it is known
that there are no odd perfect numbers less than $10^{300}$ and that any odd perfect number
must have at least eight different prime factors.
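
Facts 5 and 6 give direct formulas for $\sigma(n)$ and $\tau(n)$ once n is factored. The Python sketch below is one way to apply them; the helper names and the use of trial division for factoring are assumptions, and the approach is suitable only for small n.

```python
def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division ({} for n = 1)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def sigma(n):
    """Sum of divisors: product of (p^(a+1) - 1) / (p - 1) over prime powers p^a (Fact 5)."""
    result = 1
    for p, a in factorize(n).items():
        result *= (p ** (a + 1) - 1) // (p - 1)
    return result


def tau(n):
    """Number of divisors: product of (a + 1) over prime powers p^a (Fact 6)."""
    result = 1
    for a in factorize(n).values():
        result *= a + 1
    return result


def classify(n):
    """Label n as 'perfect', 'abundant', or 'deficient' by comparing sigma(n) with 2n."""
    s = sigma(n)
    if s == 2 * n:
        return "perfect"
    return "abundant" if s > 2 * n else "deficient"
```

For example, `sigma(220)` and `sigma(284)` both equal 504 = 220 + 284, so 220 and 284 are amicable.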


© 2000 by CRC Press LLC 


Table 1 Values of $\phi(n)$, $\sigma(n)$, $\tau(n)$, and $\mu(n)$ for $1 \le n \le 1000$.

Using Maple V, the numtheory package commands phi(n), sigma(n), tau(n), and
mobius(n) can be used to calculate these functions.
n φ σ τ μ

461 460 462 2-1 
466 232 702 4 1 
471 312 632 4 1 
476 192 1008 12 0 
481 432 532 4 1 
486 162 1092 12 0 
491 490 492 2-1 

496 240 992 10 0 
501 332 672 4 1 
506 220 864 8-1 
511 432 592 4 1 
516 168123212 0 
521 520 522 2-1 
526 262 792 4 1 

531 348 780 6 0 
536 264 1020 8 0 
541 540 542 2-1 
546 144 1344 16 1 

551 504 600 4 1 
556 276 980 6 0 
561 320 864 8-1 
566 282 852 4 1 
571 570 572 2-1 
576 192165121 0 
581 492 672 4 1 

586 292 882 4 1 
591 392 792 4 1 
596 296 1050 6 0 
601 600 602 2-1 
606 2001224 8-1 
611 552 672 4 1 
616 240 1440 16 0 

621 396 960 8 0 
626 312 942 4 1 

631 630 632 2-1 
636 208 151212 0 
641 640 642 2-1 

646 2881080 8-1 
651 3601024 8-1 
656 320 1302 10 0 
661 660 662 2-1 
666 216 1482 12 0 

671 600 744 4 1 
676 3121281 9 0 
681 452 912 4 1 

686 294 1200 8 0 
691 690 692 2-1 
696 224 1800 16 0 

462 120115216 1 

467 466 468 2-1 
472 232 900 8 0 
477 312 702 6 0 
482 240 726 4 1 

487 486 488 2-1 
492 160117612 0 

497 420 576 4 1 
502 250 756 4 1 
507 312 732 6 0 
512 256 1023 10 0 
517 460 576 4 1 
522 1681170 12 0 
527 480 576 4 1 
532 2161120 12 0 

537 356 720 4 1 
542 270 816 4 1 

547 546 548 2-1 
552 176 1440 16 0 
557 556 558 2-1 
562 280 846 4 1 
567 324 968 10 0 
572 240 1176 12 0 

577 576 578 2-1 
582 1921176 8-1 
587 586 588 2-1 
592 288 1178 10 0 
597 396 800 4 1 
602 2521056 8-1 
607 606 608 2-1 
612 1921638 18 0 
617 616 618 2-1 
622 310 936 4 1 
627 360 960 8-1 
632 312 1200 8 0 
637 504 798 6 0 
642 2121296 8-1 

647 646 648 2-1 
652 324 1148 6 0 
657 432 962 6 0 
662 330 996 4 1 
667 616 720 4 1 
672 192 2016 24 0 

677 676 678 2-1 
682 3001152 8-1 
687 456 920 4 1 
692 344 1218 6 0 
697 640 756 4 1 

463 462 464 2-1 
468 144 1274 18 0 
473 420 528 4 1 
478 238 720 4 1 
483 264 768 8-1 
488 240 930 8 0 
493 448 540 4 1 

498 1641008 8-1 
503 502 504 2-1 
508 252 896 6 0 
513 324 800 8 0 
518 216 912 8-1 
523 522 524 2-1 
528 160 1488 20 0 
533 480 588 4 1 
538 268 810 4 1 
543 360 728 4 1 
548 272 966 6 0 
553 468 640 4 1 
558 180 1248 12 0 
563 562 564 2-1 
568 280 1080 8 0 
573 380 768 4 1 
578 272 921 6 0 
583 520 648 4 1 
588 168 159618 0 
593 592 594 2-1 
598 2641008 8-1 
603 396 884 6 0 
608 288 126012 0 
613 612 614 2-1 
618 2041248 8-1 

623 528 720 4 1 
628 3121106 6 0 
633 420 848 4 1 
638 2801080 8-1 
643 642 644 2-1 

648 2161815 20 0 
653 652 654 2-1 
658 2761152 8-1 
663 3841008 8-1 
668 332 1176 6 0 
673 672 674 2-1 
678 2241368 8-1 
683 682 684 2-1 
688 336 136410 0 
693 360 1248 12 0 
698 348 1050 4 1 

464224 93010 0 
469 396 544 4 1 
474156 960 8-1 
479478 480 2-1 
484220 931 9 0 
489324 656 4 1 
494216 840 8-1 

499498 500 2-1 
504 1441560 24 0 

509508 510 2-1 
514256 774 4 1 

519344 696 4 1 
524260 924 6 0 
529506 553 3 0 
5341761080 8-1 
539420 684 6 0 
544256113412 0 

549360 806 6 0 
554276 834 4 1 
559504 616 4 1 
564 1841344 12 0 

569568 570 2-1 
5742401008 8-1 
579384 776 4 1 
5842881110 8 0 
589 540 640 4 1 
5941801440 16 0 
599598 600 2-1 
6043001064 6 0 
609336 960 8-1 
614306 924 4 1 

619618 620 2-1 
6241921736 20 0 
629576 684 4 1 
634316 954 4 1 
639420 936 6 0 
644264134412 0 

649580 720 4 1 
6542161320 8-1 
659658 660 2-1 
6643281260 8 0 
669444 896 4 1 
6743361014 4 1 
679576 784 4 1 
684216182018 0 
689624 756 4 1 
6943461044 4 1 

699464 936 4 1 

465 240 768 8-1 
470184 864 8-1 
475 360 620 6 0 
480128 1512 24 0 

485 384 588 4 1 
490168 102612 0 
495 240 93612 0 
500 200 1092 12 0 
505400 612 4 1 
510128129616 1 
515408 624 4 1 

5201921260 16 0 
525 240 99212 0 
530 208 972 8-1 
535424 648 4 1 
540 144 1680 24 0 

545432 660 4 1 
550 200111612 0 
555288 912 8-1 
560192 1488 20 0 
565448 684 4 1 
570 144 1440 16 1 

575440 744 6 0 
580 224126012 0 
585 288 1092 12 0 
590 2321080 8-1 
595 384 864 8-1 
600160 1860 24 0 
605440 798 6 0 
6102401116 8-1 
615 3201008 8-1 
620 240 134412 0 

625 500 781 5 0 
630 144 1872 24 0 

635 504 768 4 1 
640 256 1530 16 0 
645 3361056 8-1 
650 240 1302 12 0 
655 520 792 4 1 
660160 2016 24 0 
665432 960 8-1 
670 2641224 8-1 

675 360 124012 0 
680 2561620 16 0 
685 544 828 4 1 
690 176 1728 16 1 
695 552 840 4 1 
700 240 1736 18 0 
n φ σ τ μ

701 700 702 2-1 
706 3521062 4 1 
711 4681040 6 0 
716 3561260 6 0 
721 612 832 4 1 

726 220159612 0 
731 672 792 4 1 
736 352151212 0 
741 4321120 8-1 
746 3721122 4 1 

751 750 752 2-1 
756 216 2240 24 0 

761 760 762 2-1 
766 3821152 4 1 
771 5121032 4 1 

776 384 1470 8 0 
781 700 864 4 1 
786 2601584 8-1 
791 672 912 4 1 
796 396 1400 6 0 
801 5281170 6 0 
806 3601344 8-1 
811 810 812 2-1 

816 256 2232 20 0 
821 820 822 2-1 

826 3481440 8-1 
831 5521112 4 1 

836 360168012 0 
841 812 871 3 0 
846 276187212 0 
851 792 912 4 1 

856 4241620 8 0 
861 4801344 8-1 
866 4321302 4 1 
871 792 952 4 1 
876 288 207212 0 
881 880 882 2-1 
886 4421332 4 1 

891 540145210 0 
896 384204016 0 
901 832 972 4 1 
906 3001824 8-1 
911 910 912 2-1 

916 4561610 6 0 
921 6121232 4 1 
926 4621392 4 1 
931 7561140 6 0 
936 288 2730 24 0 
941 940 942 2-1 

946 4201584 8-1 

702 216168016 0 
707 600 816 4 1 
712 3521350 8 0 
717 476 960 4 1 
722 3421143 6 0 

727 726 728 2-1 
732 240 1736 12 0 
737 660 816 4 1 
742 3121296 8-1 

747 492 1092 6 0 
752 368 1488 10 0 
757 756 758 2-1 
762 2521536 8-1 
767 696 840 4 1 
772 384 1358 6 0 
777 4321216 8-1 
782 3521296 8-1 
787 786 788 2-1 
792 240 2340 24 0 
797 796 798 2-1 
802 400 1206 4 1 
807 536 1080 4 1 
812 336168012 0 
817 756 880 4 1 
822 2721656 8-1 
827 826 828 2-1 
832 3841778 14 0 
837 540 1280 8 0 
842 420 1266 4 1 

847 660 1064 6 0 
852 280 201612 0 
857 856 858 2-1 
862 430 1296 4 1 
867 544 1228 6 0 
872 4321650 8 0 
877 876 878 2-1 
882 252 222318 0 

887 886 888 2-1 
892 444 1568 6 0 
897 5281344 8-1 
902 4001512 8-1 
907 906 908 2-1 
912 288 2480 20 0 
917 7801056 4 1 
922 460 1386 4 1 
927 6121352 6 0 
932 464 1638 6 0 
937 936 938 2-1 
942 3121896 8-1 
947 946 948 2-1 

703 648 760 4 1 
708 232 1680 12 0 
713 660 768 4 1 
718 358 1080 4 1 
723 480 968 4 1 
728 288 1680 16 0 
733 732 734 2-1 
738 240 1638 12 0 
743 742 744 2-1 
748 320 1512 12 0 

753 500 1008 4 1 
758 3781140 4 1 
763 648 880 4 1 
768 256 2044 18 0 
773 772 774 2-1 
778 3881170 4 1 
783 504 1200 8 0 
788 392 1386 6 0 
793 720 868 4 1 
798 216 1920 16 1 
803 720 888 4 1 
808 400 1530 8 0 
813 540 1088 4 1 
818 408 1230 4 1 
823 822 824 2-1 
828 264 218418 0 

833 672 1026 6 0 
838 418 1260 4 1 
843 5601128 4 1 
848 416 1674 10 0 
853 852 854 2-1 
858 240 201616 1 
863 862 864 2-1 
868 360 1792 12 0 
873 576 1274 6 0 
878 438 1320 4 1 
883 882 884 2-1 
888 288 228016 0 
893 828 960 4 1 
898 448 1350 4 1 
903 5041408 8-1 
908 452 1596 6 0 
913 820 1008 4 1 
918 288 216016 0 
923 840 1008 4 1 
928 448 1890 12 0 
933 620 1248 4 1 
938 3961632 8-1 
943 880 1008 4 1 
948 312 2240 12 0 

704 3201524 14 0 

709 708 710 2-1 
714 192 1728 16 1 
719 718 720 2-1 
724 360 1274 6 0 
729 4861093 7 0 
734 3661104 4 1 
739 738 740 2-1 
744 240 1920 16 0 

749 636 864 4 1 
754 3361260 8-1 
759 4401152 8-1 
764 3801344 6 0 
769 768 770 2-1 
774 252171612 0 

779 720 840 4 1 
784 336176715 0 
789 524 1056 4 1 
794 3961194 4 1 
799 736 864 4 1 
804 264190412 0 

809 808 810 2-1 
814 3601368 8-1 
819 432 145612 0 
8244081560 8 0 
829 828 830 2-1 
834 2761680 8-1 
839 838 840 2-1 
8444201484 6 0 

849 5641136 4 1 
854 3601488 8-1 
859 858 860 2-1 
864 288 2520 24 0 
869 780 960 4 1 
874 3961440 8-1 
879 5841176 4 1 
884 384176412 0 
889 756 1024 4 1 
894 2961800 8-1 
899 840 960 4 1 
9044481710 8 0 
909 6001326 6 0 
9144561374 4 1 

919 918 920 2-1 
924 240 2688 24 0 

929 928 930 2-1 
9344661404 4 1 

939 624 1256 4 1 
944464186010 0 
949 864 1036 4 1 

705 3681152 8-1 
710 2801296 8-1 
715 4801008 8-1 
720192 2418 30 0 
725 560 930 6 0 
730 2881332 8-1 
735 336136812 0 
740 288 1596 12 0 
745 592 900 4 1 
750 200187216 0 
755 600 912 4 1 
760 288180016 0 
765 384 1404 12 0 
770 240 1728 16 1 
775 600 992 6 0 
780 192 2352 24 0 
785 624 948 4 1 
790 3121440 8-1 
795 4161296 8-1 
800 320195318 0 
805 5281152 8-1 
810 216 2178 20 0 
815 648 984 4 1 
820 320 1764 12 0 
825 400 148812 0 
830 3281512 8-1 
835 664 1008 4 1 
840 192 2880 32 0 
845 624 1098 6 0 
850 320 1674 12 0 
855 432156012 0 
860 336184812 0 
865 688 1044 4 1 
870 224 216016 1 

875 600 1248 8 0 
880 320 2232 20 0 
885 4641440 8-1 
890 3521620 8-1 
895 7121080 4 1 
900 240 2821 27 0 
905 720 1092 4 1 
910 288 201616 1 
915 4801488 8-1 
920 352 216016 0 
925 7201178 6 0 
930 240 230416 1 
935 6401296 8-1 
940 368 201612 0 
945 432192016 0 
950 360186012 0 
n   φ(n)   σ(n)   τ(n)   μ(n)

9516321272 41 

9564761680 6 0 
961930 993 3 0 
966 2642304161 
971970 972 2-1 
976480192210 0 
9816481430 6 0 
9864481620 8-1 
991990 992 2-1 
996 328 235212 0 

952 384216016 0 
957 5601440 8-1 
962 4321596 8-1 
967 966 968 2-1 
972 324254818 0 
977 976 978 2-1 
982 490 1476 4 1 
987 5521536 8-1 
992 480 201612 0 
997 996 998 2-1 

953 952 954 2-1 
958 478 1440 4 1 
963 636 1404 6 0 
968 440 1995 12 0 
973 8281120 4 1 
978 3241968 8-1 
983 982 984 2-1 
988 432196012 0 
993 6601328 4 1 
998 498 1500 4 1 

954312210612 0 

959 8161104 4 1 
9644801694 6 0 
969 5761440 8-1 
9744861464 4 1 

979 8801080 4 1 
984320 252016 0 
989 9241056 4 1 
9944201728 8-1 
999 6481520 8 0 

955 7601152 4 1 
960 256 3048 28 0 
965 7681164 4 1 
970 3841764 8-1 
975 480173612 0 
980 336 2394 18 0 
985 7841188 4 1 
990 240 2808 24 0 
995 792 1200 4 1 
1000 400 234016 0 


12. $\sum_{k=1}^{n} \sigma(k) = \frac{\pi^2}{12} n^2 + O(n \log n)$.

13. $\sum_{k=1}^{n} \tau(k) = n \log n + (2\gamma - 1)n + O(\sqrt{n})$, where $\gamma$ is Euler’s constant.

14. If m and n are amicable, then m is the sum of the proper divisors of n, and vice versa.

Examples: 

1. Table 1 lists the values of $\sigma(n)$ and $\tau(n)$ for $1 \le n \le 1000$.

2. To find $\tau(720)$, note that $\tau(720) = \tau(2^4 \cdot 3^2 \cdot 5) = (4 + 1)(2 + 1)(1 + 1) = 30$.

3. To find $\sigma(200)$, note that $\sigma(200) = \sigma(2^3 \cdot 5^2) = \frac{2^4 - 1}{2 - 1} \cdot \frac{5^3 - 1}{5 - 1} = 15 \cdot 31 = 465$.

4 . The integers 6 and 28 are perfect; the integers 9 and 16 are deficient; the integers 12 
and 945 are abundant. 

5 . The integers 220 and 284 form the smallest pair of amicable numbers. 
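
The computations in these examples can be checked mechanically. The following is a minimal Python sketch, not part of the handbook; the helper names `factorize`, `tau`, `sigma`, and `classify` are illustrative assumptions, and trial division is used only because it is the simplest method for small n.

```python
def factorize(n):
    """Return the prime-power factorization of n as {prime: exponent} (trial division)."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def tau(n):
    """Number of positive divisors: product of (a + 1) over the exponents a."""
    result = 1
    for a in factorize(n).values():
        result *= a + 1
    return result


def sigma(n):
    """Sum of positive divisors: product of (p^(a+1) - 1) / (p - 1)."""
    result = 1
    for p, a in factorize(n).items():
        result *= (p ** (a + 1) - 1) // (p - 1)
    return result


def classify(n):
    """Compare n with the sum of its proper divisors, sigma(n) - n."""
    proper = sigma(n) - n
    if proper == n:
        return "perfect"
    return "abundant" if proper > n else "deficient"


print(tau(720), sigma(200))                   # values from Examples 2 and 3
print([classify(n) for n in (6, 28, 9, 16, 12, 945)])
print(sigma(220) - 220, sigma(284) - 284)     # the amicable pair of Example 5
```

For larger arguments a sieve or a faster factoring method would be the more natural choice; the sketch is meant only to mirror the formulas used in the examples.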


4.6.4 THE MÖBIUS FUNCTION AND OTHER IMPORTANT ARITHMETIC FUNCTIONS
Definitions:

If n is a positive integer, $\mu(n)$, the value of the Möbius function, is defined by:

$$\mu(n) = \begin{cases} 1, & \text{if } n = 1 \\ 0, & \text{if } n \text{ has a square factor larger than } 1 \\ (-1)^s, & \text{if } n \text{ is squarefree and is the product of } s \text{ different primes.} \end{cases}$$

If n > 1 is a positive integer with prime-power factorization $p_1^{a_1} p_2^{a_2} \cdots p_m^{a_m}$, then $\lambda(n)$, the value of Liouville’s function at n, is given by $\lambda(n) = (-1)^{a_1 + a_2 + \cdots + a_m}$, with $\lambda(1) = 1$.

If n is a positive integer with prime-power factorization $n = p_1^{a_1} p_2^{a_2} \cdots p_m^{a_m}$, then the arithmetic functions $\Omega$ and $\omega$ are defined by $\Omega(1) = \omega(1) = 0$ and, for n > 1, $\Omega(n) = \sum_{i=1}^{m} a_i$ and $\omega(n) = m$. That is, $\Omega(n)$ is the sum of the exponents in the prime-power factorization of n and $\omega(n)$ is the number of distinct primes in the prime-power factorization of n.

Facts: 

1. The Möbius function is multiplicative, but not completely multiplicative.

2. Möbius inversion formula: If f is an arithmetic function and $F(n) = \sum_{d \mid n} f(d)$, then $f(n) = \sum_{d \mid n} \mu(d) F\left(\frac{n}{d}\right)$.

3. If n is a positive integer, then $\phi(n) = \sum_{d \mid n} \mu(d) \frac{n}{d}$.

4. If f is multiplicative, then $\sum_{d \mid n} \mu(d) f(d) = \prod_{p \mid n} (1 - f(p))$.

5. If f is multiplicative, then $\sum_{d \mid n} \mu(d)^2 f(d) = \prod_{p \mid n} (1 + f(p))$.

6. If n is a positive integer, then $\sum_{d \mid n} \mu(d) = \begin{cases} 1 & \text{if } n = 1; \\ 0 & \text{if } n > 1. \end{cases}$

7. If n is a positive integer, then $\sum_{d \mid n} \lambda(d) = \begin{cases} 1 & \text{if } n \text{ is a perfect square;} \\ 0 & \text{if } n \text{ is not a perfect square.} \end{cases}$


8. In 1897 Mertens showed that $\left| \sum_{k=1}^{n} \mu(k) \right| < \sqrt{n}$ for all positive integers n not exceeding 10,000 and conjectured that this inequality holds for all positive integers n. However, in 1985 Odlyzko and te Riele disproved this conjecture, known as Mertens’ conjecture, without giving an explicit integer n for which it fails. In 1987 Pintz showed that there is at least one counterexample n with $n \le \exp(3.21 \times 10^{64})$, again without giving an explicit counterexample. Finding such an integer n requires more computing power than is currently available.


9. Liouville’s function is completely multiplicative. 

10. The function $\omega$ is additive and the function $\Omega$ is completely additive.


Examples: 

1. $\mu(12) = 0$ since $2^2 \mid 12$, and $\mu(105) = \mu(3 \cdot 5 \cdot 7) = (-1)^3 = -1$.

2. $\lambda(720) = \lambda(2^4 \cdot 3^2 \cdot 5) = (-1)^{4+2+1} = (-1)^7 = -1$.

3. $\Omega(720) = \Omega(2^4 \cdot 3^2 \cdot 5) = 4 + 2 + 1 = 7$ and $\omega(720) = \omega(2^4 \cdot 3^2 \cdot 5) = 3$.
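
The definitions of this subsection translate directly into code. The sketch below is an illustration only; the function names (`mobius`, `liouville`, `big_omega`, `little_omega`) are assumptions made for this example, and all values are computed by naive trial division.

```python
def factorize(n):
    """Return the prime-power factorization of n as {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def big_omega(n):
    """Sum of the exponents in the factorization (0 for n = 1)."""
    return sum(factorize(n).values())


def little_omega(n):
    """Number of distinct prime factors (0 for n = 1)."""
    return len(factorize(n))


def mobius(n):
    """0 if n has a square factor > 1, otherwise (-1)^(number of prime factors)."""
    exponents = factorize(n).values()
    if any(a > 1 for a in exponents):
        return 0
    return (-1) ** len(exponents)


def liouville(n):
    """(-1) raised to the sum of the exponents."""
    return (-1) ** big_omega(n)


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


print(mobius(12), mobius(105), liouville(720), big_omega(720), little_omega(720))
# Divisor sums of the Mobius and Liouville functions for n = 1..10
print([sum(mobius(d) for d in divisors(n)) for n in range(1, 11)])
print([sum(liouville(d) for d in divisors(n)) for n in range(1, 11)])
```

The last two lines print the divisor sums of μ and λ, which can be compared against the corresponding facts above for small n.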


4.6.5 DIRICHLET PRODUCTS 

Definitions: 

If f and g are arithmetic functions, then the Dirichlet product of f and g is the function $f \star g$ defined by $(f \star g)(n) = \sum_{d \mid n} f(d)\, g\left(\frac{n}{d}\right)$.

If f and g are arithmetic functions such that $f \star g = g \star f = I$, where $I(n) = \left\lfloor \frac{1}{n} \right\rfloor$, then g is the Dirichlet inverse of f.

Facts: 

1. If f and g are arithmetic functions, then $f \star g = g \star f$.

2. If f, g, and h are arithmetic functions, then $(f \star g) \star h = f \star (g \star h)$.

3. If f, g, and h are arithmetic functions, then $f \star (g + h) = (f \star g) + (f \star h)$.

4. Because of Facts 1-3, the set of arithmetic functions with the operations of Dirichlet 
product and ordinary addition of functions forms a ring. (See Chapter 5.) 

5. If f is an arithmetic function with $f(1) \ne 0$, then there is a unique Dirichlet inverse of f, which is written as $f^{-1}$. Furthermore, $f^{-1}$ is given by the recursive formulas

$$f^{-1}(1) = \frac{1}{f(1)} \quad \text{and} \quad f^{-1}(n) = -\frac{1}{f(1)} \sum_{d \mid n,\ d < n} f\left(\frac{n}{d}\right) f^{-1}(d) \quad \text{for } n > 1.$$

6. The set of all arithmetic functions f with $f(1) \ne 0$ forms an abelian group with respect to the operation $\star$, where the identity element is the function I.

7. If f and g are arithmetic functions with $f(1) \ne 0$ and $g(1) \ne 0$, then $(f \star g)^{-1} = f^{-1} \star g^{-1}$.

8. If u is the arithmetic function with u(n) = 1 for all positive integers n, then $\mu \star u = I$, so $u = \mu^{-1}$ and $\mu = u^{-1}$.

9. If f is a multiplicative function, then f is completely multiplicative if and only if $f^{-1}(n) = \mu(n) f(n)$ for all positive integers n.

10. If f and g are multiplicative functions, then $f \star g$ is also multiplicative.

11. If f and g are arithmetic functions and both f and $f \star g$ are multiplicative, then g is also multiplicative.

12. If f is multiplicative, then $f^{-1}$ exists and is multiplicative.


Examples: 

1. The identity $\phi(n) = \sum_{d \mid n} \mu(d) \frac{n}{d}$ (§4.6.4 Fact 3) implies that $\phi = \mu \star N$, where N is the multiplicative function N(n) = n.

2. Since the function N is completely multiplicative, $N^{-1} = \mu N$ by Fact 9.

3. From Example 1 and Facts 7 and 8, it follows that $\phi^{-1} = \mu^{-1} \star \mu N = u \star \mu N$. Hence $\phi^{-1}(n) = \sum_{d \mid n} d\, \mu(d)$.
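
The Dirichlet product and the recursive formula for the inverse (Fact 5) can be tried out numerically. The following Python sketch is an illustration under stated assumptions: the names `dirichlet`, `dirichlet_inverse`, `u`, and `N` are chosen for this example, exact rational arithmetic from the standard `fractions` module is used so that functions with $f(1) \ne 1$ are handled, and divisors are found by brute force.

```python
from fractions import Fraction
from math import gcd


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(f, g):
    """Return the Dirichlet product f * g as a new function."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def dirichlet_inverse(f, limit):
    """Tabulate the Dirichlet inverse of f on 1..limit using the recursion of Fact 5."""
    inv = {1: Fraction(1, f(1))}
    for n in range(2, limit + 1):
        s = sum(f(n // d) * inv[d] for d in divisors(n) if d < n)
        inv[n] = -s / f(1)
    return inv


def mobius(n):
    """Mobius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # square factor found
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n):
    """Euler phi-function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


u = lambda n: 1                   # constant function u(n) = 1
N = lambda n: n                   # identity function N(n) = n

mu_star_u = dirichlet(mobius, u)
print([mu_star_u(n) for n in range(1, 9)])       # values of I(n) = floor(1/n)
phi_inv = dirichlet_inverse(phi, 12)
print([int(phi_inv[n]) for n in range(1, 13)])   # compare with sum of d*mu(d) over d | n
```

The recursion costs a divisor scan per argument, so it is suited to small tables rather than large-scale computation.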


4.7 PRIMITIVE ROOTS AND QUADRATIC RESIDUES 

A primitive root of an integer, when it exists, is an integer whose powers run through
a reduced system of residues modulo this integer. When a primitive root exists, it is
possible to use the theory of indices to solve certain congruences. This section provides 
the information needed to understand and employ primitive roots. 

The question of which integers are perfect squares modulo a prime is one that 
has been studied extensively. An integer that is a perfect square modulo n is called 
a quadratic residue of n. The law of quadratic reciprocity provides a surprising link 
between the answer to the question of whether a prime p is a perfect square modulo a 
prime q and the answer to the question of whether q is a perfect square modulo p. This 
section provides information that helps determine whether an integer is a quadratic 
residue modulo a given integer n. 

There are important applications of the topics covered in this section, including 
applications to public key cryptography and authentication schemes. (See Chapter 14.) 


4.7.1 PRIMITIVE ROOTS 
Definitions: 

If a and m are relatively prime positive integers, then the order of a modulo m, denoted $\mathrm{ord}_m a$, is the least positive integer x such that $a^x \equiv 1 \pmod{m}$.

If r and n are relatively prime integers and n is positive, then r is a primitive root modulo n if $\mathrm{ord}_n r = \phi(n)$. A primitive root modulo n is also said to be a primitive root of n, and n is said to have a primitive root.

If m is a positive integer, then the minimum universal exponent modulo m is the smallest positive integer $\lambda(m)$ for which $a^{\lambda(m)} \equiv 1 \pmod{m}$ for all integers a relatively prime to m.





Facts: 


1. The positive integer n, with n > 1, has a primitive root if and only if $n = 2$, $4$, $p^t$, or $2p^t$, where p is an odd prime and t is a positive integer.

2. There are $\phi(d)$ incongruent integers of order d modulo p if p is prime and d is a positive divisor of p − 1.

3. There are $\phi(p - 1)$ primitive roots of p if p is a prime.

4. If the positive integer m has a primitive root, then it has a total of $\phi(\phi(m))$ incongruent primitive roots.

5. If r is a primitive root of the odd prime p, then either r or r + p is a primitive root modulo $p^2$.

6. If r is a primitive root of $p^2$, where p is an odd prime, then r is a primitive root of $p^k$ for all positive integers k.

7. It is an unsettled conjecture (stated by E. Artin) that 2 is a primitive root of infinitely many primes. More generally, given any prime p it is unknown whether p is a primitive root of infinitely many primes.

8. It is known that given any three primes, at least one of these primes is a primitive root of infinitely many primes. [GuMu84]

9. Given a set of n primes, $p_1, p_2, \ldots, p_n$, there are $\prod_{k=1}^{n} \phi(p_k - 1)$ integers x with $1 < x \le \prod_{k=1}^{n} p_k$ such that x is a primitive root of $p_k$ for k = 1, 2, ..., n. Such an integer x is called a common primitive root of the primes $p_1, \ldots, p_n$.

10. Let $g_p$ denote the smallest positive integer that is a primitive root modulo p, where p is a prime. It is known that $g_p$ is not always small; in particular it has been shown by Fridlender and Salié ([Ri96]) that there is a positive constant C such that $g_p > C \log p$ for infinitely many primes p.

11. Burgess has shown that $g_p$ does not grow too rapidly; in particular he showed that $g_p \le C p^{\frac{1}{4} + \epsilon}$ for $\epsilon > 0$, C a constant, p sufficiently large. [Ri96]

12. The minimum universal exponents modulo the powers of 2 are: $\lambda(2) = 1$, $\lambda(2^2) = 2$, and $\lambda(2^k) = 2^{k-2}$ for k = 3, 4, ....

13. If m is a positive integer with prime-power factorization $2^k q_1^{a_1} \cdots q_r^{a_r}$, where k is a nonnegative integer, then the minimum universal exponent of m is given by $\lambda(m) = \mathrm{lcm}(\lambda(2^k), \phi(q_1^{a_1}), \ldots, \phi(q_r^{a_r}))$.

14. For every positive integer m, there is an integer a such that $\mathrm{ord}_m a = \lambda(m)$.

15. There are six positive integers m with $\lambda(m) = 2$: m = 3, 4, 6, 8, 12, 24.

16 . Table 1 displays the least primitive root of each prime less than 10,000. 

Examples: 

1. Since $2^1 \equiv 2$, $2^2 \equiv 4$, and $2^3 \equiv 1 \pmod{7}$, it follows that $\mathrm{ord}_7 2 = 3$.

2. The integers 2, 6, 7, and 8 form a complete set of incongruent primitive roots modulo 11.

3. The integer 10 is a primitive root of 487, but it is not a primitive root of $487^2$.

4. There are $\phi(6)\phi(10) = 2 \cdot 4 = 8$ common primitive roots of 7 and 11 between 1 and 7 · 11 = 77. They are the integers 17, 19, 24, 40, 52, 61, 68, and 73.

5. From Facts 12 and 13 it follows that the minimum universal exponent of 7200 is $\lambda(7200) = \lambda(2^5 \cdot 3^2 \cdot 5^2) = \mathrm{lcm}(2^3, \phi(3^2), \phi(5^2)) = \mathrm{lcm}(8, 6, 20) = 120$.
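
Orders, primitive roots, and minimum universal exponents for small moduli can be computed by brute force straight from the definitions. The Python sketch below is illustrative only; the names `order`, `units`, `primitive_roots`, and `carmichael` are assumptions made for this example, and no attempt is made at efficiency.

```python
from functools import reduce
from math import gcd


def order(a, m):
    """Least positive x with a**x ≡ 1 (mod m); requires m > 1 and gcd(a, m) == 1."""
    if m <= 1 or gcd(a, m) != 1:
        raise ValueError("need m > 1 and gcd(a, m) == 1")
    x, power = 1, a % m
    while power != 1:
        power = power * a % m
        x += 1
    return x


def units(m):
    """Reduced residue system modulo m (it has phi(m) elements)."""
    return [a for a in range(1, m) if gcd(a, m) == 1]


def primitive_roots(m):
    """All primitive roots of m in 1..m-1 (an empty list if m has none)."""
    u = units(m)
    return [r for r in u if order(r, m) == len(u)]


def carmichael(m):
    """Minimum universal exponent: the lcm of the orders of all units modulo m."""
    return reduce(lambda x, y: x * y // gcd(x, y), (order(a, m) for a in units(m)), 1)


print(order(2, 7))                          # Example 1
print(primitive_roots(11))                  # Example 2
print(order(10, 487), order(10, 487 ** 2))  # Example 3
print(carmichael(7200))                     # Example 5
```

Computing the exponent as the lcm of all orders follows the definition directly; Fact 13 gives a much faster route once the factorization of m is known.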
Table 1 Primes and primitive roots.

For each prime p < 10,000 the least primitive root ω is given.

   p   ω     p   ω     p   ω     p   ω     p   ω     p   ω     p   ω     p   ω     p   ω
   3   2     5   2     7   3    11   2    13   2    17   3    19   2    23   5    29   2
  31   3    37   2    41   6    43   3    47   5    53   2    59   2    61   2    67   2
  71   7    73   5    79   3    83   2    89   3    97   5   101   2   103   5   107   2
 109   6   113   3   127   3   131   2   137   3   139   2   149   2   151   6   157   5
 163   2   167   5   173   2   179   2   181   2   191  19   193   5   197   2   199   3
 211   2   223   3   227   2   229   6   233   3   239   7   241   7   251   6   257   3
 263   5   269   2   271
4.7.2 INDEX ARITHMETIC


Definition:

If m is a positive integer with primitive root r and a is an integer relatively prime to m, then the unique integer x with $0 \le x < \phi(m)$ and $r^x \equiv a \pmod{m}$ is the index of a to the base r modulo m, or the discrete logarithm of a to the base r modulo m.

The index is denoted $\mathrm{ind}_r a$ (where the modulus m is fixed).


Facts: 

1. Table 2 displays, for each prime less than 100, the indices of all numbers not exceeding the prime using the least primitive root of the prime as the base.


Table 2 Indices for primes less than 100.

For each prime p < 100 two tables are given. Let g be the least primitive element of the group $F_p^*$ and assume $g^x = y$.

The table on the left has a y in position x, while the one on the right has an x in position y.
2. If m is a positive integer with primitive root r and a is a positive integer relatively prime to m, then $a \equiv r^{\mathrm{ind}_r a} \pmod{m}$.

3. If m is a positive integer with primitive root r, then $\mathrm{ind}_r 1 = 0$ and $\mathrm{ind}_r r = 1$.

4. If m > 2 is an integer with primitive root r, then $\mathrm{ind}_r(-1) = \frac{\phi(m)}{2}$.

5. If m is a positive integer with primitive root r, and a and b are integers relatively prime to m, then:

• $\mathrm{ind}_r 1 \equiv 0 \pmod{\phi(m)}$;

• $\mathrm{ind}_r(ab) \equiv \mathrm{ind}_r a + \mathrm{ind}_r b \pmod{\phi(m)}$;

• $\mathrm{ind}_r a^k \equiv k \cdot \mathrm{ind}_r a \pmod{\phi(m)}$ if k is a positive integer.

6. If m is a positive integer and r and s are both primitive roots modulo m, then $\mathrm{ind}_r a \equiv \mathrm{ind}_s a \cdot \mathrm{ind}_r s \pmod{\phi(m)}$.

7. If m is a positive integer with primitive root r, and a and b are integers both relatively prime to m, then the exponential congruence $a^x \equiv b \pmod{m}$ has a solution if and only if $d \mid \mathrm{ind}_r b$, where $d = \gcd(\mathrm{ind}_r a, \phi(m))$. Furthermore, if there is a solution to this exponential congruence, then there are exactly d solutions that are incongruent modulo $\phi(m)$.

8. There is a wide variety of algorithms for computing discrete logarithms, including those known as the baby-step, giant-step algorithm, the Pollard rho algorithm, the Pohlig-Hellman algorithm, and the index-calculus algorithm. (See [MevaVa96] for details.)

9. The fastest algorithms known for computing discrete logarithms, relative to a fixed primitive root, of a given prime p are index-calculus algorithms, which have subexponential computational complexity. In particular, there is an algorithm based on the number field sieve that runs using $L_p\left(\frac{1}{3}, 1.923\right) = O\left(\exp\left((1.923 + o(1))(\log p)^{1/3}(\log\log p)^{2/3}\right)\right)$ bit operations. (See [MevaVa96].)

10. Many cryptographic methods rely on the intractability of finding discrete logarithms of integers relative to a fixed primitive root r of a fixed prime p.

Examples: 

1. To solve $3x^{30} \equiv 4 \pmod{37}$, take indices to the base 2 (2 is the smallest primitive root of 37) to obtain $\mathrm{ind}_2(3x^{30}) \equiv \mathrm{ind}_2 4 = 2 \pmod{36}$. Since $\mathrm{ind}_2(3x^{30}) \equiv \mathrm{ind}_2 3 + 30 \cdot \mathrm{ind}_2 x = 26 + 30 \cdot \mathrm{ind}_2 x \pmod{36}$, it follows that $30 \cdot \mathrm{ind}_2 x \equiv 12 \pmod{36}$. The solutions to this congruence are those x such that $\mathrm{ind}_2 x \equiv 4, 10, 16, 22, 28, 34 \pmod{36}$. From the Table of Indices (Table 2), the solutions are those x with $x \equiv 16, 25, 9, 21, 12, 28 \pmod{37}$.

2. To solve $7^x \equiv 6 \pmod{17}$, take indices to the base 3 (3 is the smallest primitive root of 17) to obtain $\mathrm{ind}_3(7^x) \equiv \mathrm{ind}_3 6 = 15 \pmod{16}$. Since $\mathrm{ind}_3(7^x) \equiv x \cdot \mathrm{ind}_3 7 = 11x \pmod{16}$, it follows that $11x \equiv 15 \pmod{16}$. Since all the steps in this computation are reversible, it follows that the solutions of the original congruence are the solutions of this linear congruence, namely those x with $x \equiv 13 \pmod{16}$.
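
For small moduli the indices used in these examples can be computed rather than looked up. The sketch below uses the baby-step, giant-step method mentioned in Fact 8; it is an illustrative Python version, not the handbook's algorithm statement. The function name `discrete_log` and its parameter `n` (the order of the base, which equals $\phi(m)$ for a primitive root) are assumptions of this example, and `pow(value, -1, m)` for the modular inverse requires Python 3.8 or later.

```python
from math import isqrt


def discrete_log(r, a, m, n):
    """Return x in [0, n) with r**x ≡ a (mod m), or None if there is none.

    n is the order of r modulo m (phi(m) when r is a primitive root of m).
    """
    step = isqrt(n) + 1
    # Baby steps: remember the smallest j with r**j ≡ value for j = 0..step-1.
    baby = {}
    value = 1
    for j in range(step):
        baby.setdefault(value, j)
        value = value * r % m
    # Here value == r**step (mod m); giant steps multiply a by r**(-step).
    factor = pow(value, -1, m)
    gamma = a % m
    for i in range(step):
        if gamma in baby:
            return i * step + baby[gamma]
        gamma = gamma * factor % m
    return None


# Indices used in Example 1 (base 2 modulo 37) and Example 2 (base 3 modulo 17)
print(discrete_log(2, 3, 37, 36), discrete_log(2, 4, 37, 36))
print(discrete_log(3, 7, 17, 16), discrete_log(3, 6, 17, 16))

# Direct search confirming the solution sets of both examples
print(sorted(x for x in range(1, 37) if 3 * pow(x, 30, 37) % 37 == 4))
print([x for x in range(16) if pow(7, x, 17) == 6])
```

The method uses on the order of $\sqrt{n}$ multiplications and table entries, which is practical only for moderate n.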


4.7.3 QUADRATIC RESIDUES 
Definitions: 

If m and k are positive integers and a is an integer relatively prime to m, then a is a kth power residue of m if the congruence $x^k \equiv a \pmod{m}$ has a solution.

If a and m are relatively prime integers and m is positive, then a is a quadratic residue of m if the congruence $x^2 \equiv a \pmod{m}$ has a solution. If $x^2 \equiv a \pmod{m}$ has no solution, then a is a quadratic nonresidue of m.

If p is an odd prime and p does not divide a, then the Legendre symbol $\left(\frac{a}{p}\right)$ is 1 if a is a quadratic residue of p and −1 if a is a quadratic nonresidue of p. This symbol is named after the French mathematician Adrien-Marie Legendre (1752-1833).

If n is an odd positive integer with prime-power factorization $n = p_1^{t_1} p_2^{t_2} \cdots p_m^{t_m}$ and a is an integer relatively prime to n, then the Jacobi symbol $\left(\frac{a}{n}\right)$ is defined by

$$\left(\frac{a}{n}\right) = \prod_{i=1}^{m} \left(\frac{a}{p_i}\right)^{t_i},$$

where the symbols on the right-hand side of the equality are Legendre symbols. This symbol is named after the German mathematician Karl Gustav Jacob Jacobi (1804-1851).

Let a be a positive integer that is not a perfect square and such that $a \equiv 0$ or $1 \pmod{4}$. The Kronecker symbol (named after the German mathematician Leopold Kronecker (1823-1891)), which is a generalization of the Legendre symbol, is defined as:

• $\left(\frac{a}{2}\right) = \begin{cases} 1 & \text{if } a \equiv 1 \pmod{8} \\ -1 & \text{if } a \equiv 5 \pmod{8} \end{cases}$

• $\left(\frac{a}{p}\right)$ = the Legendre symbol $\left(\frac{a}{p}\right)$ if p is an odd prime such that p does not divide a

• $\left(\frac{a}{n}\right) = \prod_{j=1}^{r} \left(\frac{a}{p_j}\right)^{t_j}$ if gcd(a, n) = 1 and $n = \prod_{j=1}^{r} p_j^{t_j}$ is the prime factorization of n.

Facts: 

1. If p is an odd prime, then there are an equal number of quadratic residues modulo p and quadratic nonresidues modulo p among the integers 1, 2, ..., p − 1. In particular, there are $\frac{p-1}{2}$ integers of each type in this set.

2. Euler’s criterion: If p is an odd prime and a is a positive integer not divisible by p, then $\left(\frac{a}{p}\right) \equiv a^{(p-1)/2} \pmod{p}$.

3. If p is an odd prime and a and b are integers not divisible by p with $a \equiv b \pmod{p}$, then $\left(\frac{a}{p}\right) = \left(\frac{b}{p}\right)$.

4. If p is an odd prime and a and b are integers not divisible by p, then $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$.

5. If p is an odd prime and a is an integer not divisible by p, then $\left(\frac{a^2}{p}\right) = 1$.

6. If p is an odd prime, then $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2} = \begin{cases} 1 & \text{if } p \equiv 1 \pmod{4} \\ -1 & \text{if } p \equiv -1 \pmod{4}. \end{cases}$

7. If p is an odd prime, then −1 is a quadratic residue of p if $p \equiv 1 \pmod{4}$ and a quadratic nonresidue of p if $p \equiv -1 \pmod{4}$. (This is a direct consequence of Fact 6.)

8. Gauss’ lemma: If p is an odd prime, a is an integer with gcd(a, p) = 1, and s is the number of least positive residues of $a, 2a, \ldots, \frac{p-1}{2}a$ greater than $\frac{p}{2}$, then $\left(\frac{a}{p}\right) = (-1)^s$.

9. If p is an odd prime, then $\left(\frac{2}{p}\right) = (-1)^{(p^2-1)/8}$.

10. The integer 2 is a quadratic residue of all primes p with $p \equiv \pm 1 \pmod{8}$ and a quadratic nonresidue of all primes $p \equiv \pm 3 \pmod{8}$. (This is a direct consequence of Fact 9.)

11. Law of quadratic reciprocity: If p and q are odd primes, then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$

This law was first proved by Carl Friedrich Gauss (1777-1855).

12. Many different proofs of the law of quadratic reciprocity have been discovered. By one count, there are more than 150 different proofs. Gauss published eight different proofs himself.

13. The law of quadratic reciprocity implies that if p and q are odd primes, then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ if either $p \equiv 1 \pmod{4}$ or $q \equiv 1 \pmod{4}$, and $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$ if $p \equiv q \equiv 3 \pmod{4}$.
14. If m is an odd positive integer and a and b are integers relatively prime to m with $a \equiv b \pmod{m}$, then $\left(\frac{a}{m}\right) = \left(\frac{b}{m}\right)$.

15. If m is an odd positive integer and a and b are integers relatively prime to m, then $\left(\frac{ab}{m}\right) = \left(\frac{a}{m}\right)\left(\frac{b}{m}\right)$.

16. If m is an odd positive integer and a is an integer relatively prime to m, then $\left(\frac{a^2}{m}\right) = 1$.

17. If m and n are relatively prime odd positive integers and a is an integer relatively prime to m and n, then $\left(\frac{a}{mn}\right) = \left(\frac{a}{m}\right)\left(\frac{a}{n}\right)$.

18. If m is an odd positive integer, then the value of the Jacobi symbol $\left(\frac{a}{m}\right)$ does not determine whether a is a perfect square modulo m.

19. If m is an odd positive integer, then $\left(\frac{-1}{m}\right) = (-1)^{(m-1)/2}$.

20. If m is an odd positive integer, then $\left(\frac{2}{m}\right) = (-1)^{(m^2-1)/8}$.

21. Reciprocity law for Jacobi symbols: If m and n are relatively prime odd positive integers, then

$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2} \cdot \frac{n-1}{2}}.$$

22. If n is an odd positive integer that is not a perfect square, then the number of integers x in a reduced set of residues modulo n with $\left(\frac{x}{n}\right) = 1$ equals the number with $\left(\frac{x}{n}\right) = -1$.

23. The Legendre symbol $\left(\frac{a}{p}\right)$, where p is prime and $0 \le a < p$, can be evaluated using $O((\log_2 p)^2)$ bit operations.

24. The Jacobi symbol $\left(\frac{a}{n}\right)$, where n is an odd positive integer and $0 \le a < n$, can be evaluated using $O((\log_2 n)^2)$ bit operations.

25. Let p be an odd prime. Even though half the integers x with $1 \le x < p$ are quadratic nonresidues of p, there is no known polynomial-time deterministic algorithm for finding such an integer. However, picking integers at random produces a probabilistic algorithm that has 2 as the expected number of iterations done before a nonresidue is found.

26. Let m be a positive integer with a primitive root. If k is a positive integer and a is an integer relatively prime to m, then a is a kth power residue of m if and only if $a^{\phi(m)/d} \equiv 1 \pmod{m}$, where $d = \gcd(k, \phi(m))$. Moreover, if a is a kth power residue of m, then there are exactly d incongruent solutions modulo m of the congruence $x^k \equiv a \pmod{m}$.

27. If p is a prime, k is a positive integer, and a is an integer with gcd(a, p) = 1, then a is a kth power residue of p if and only if $a^{(p-1)/d} \equiv 1 \pmod{p}$, where d = gcd(k, p − 1).

28. The kth roots of a kth power residue modulo p, where p is a prime, can be computed using a primitive root and indices to this primitive root. This is only practical for small primes p. (See §4.7.1.)


Examples: 

1. The integers 1, 3, 4, 5, and 9 are quadratic residues of 11; the integers 2, 6, 7, 8, and 10 are quadratic nonresidues of 11. Hence $\left(\frac{1}{11}\right) = \left(\frac{3}{11}\right) = \left(\frac{4}{11}\right) = \left(\frac{5}{11}\right) = \left(\frac{9}{11}\right) = 1$ and $\left(\frac{2}{11}\right) = \left(\frac{6}{11}\right) = \left(\frac{7}{11}\right) = \left(\frac{8}{11}\right) = \left(\frac{10}{11}\right) = -1$.

2. To determine whether 11 is a quadratic residue of 19, note that using the law of quadratic reciprocity (Fact 11) and Facts 3, 4, and 10 it follows that $\left(\frac{11}{19}\right) = -\left(\frac{19}{11}\right) = -\left(\frac{8}{11}\right) = -\left(\frac{2}{11}\right)^3 = -(-1)^3 = 1$.

3. To evaluate the Jacobi symbol $\left(\frac{2}{45}\right)$, note that $\left(\frac{2}{45}\right) = \left(\frac{2}{3^2 \cdot 5}\right) = \left(\frac{2}{3}\right)^2 \cdot \left(\frac{2}{5}\right) = (-1)^2(-1) = -1$.

4. The Jacobi symbol $\left(\frac{5}{21}\right) = 1$, but 5 is not a quadratic residue of 21.

5. The integer 6 is a fifth power residue of 101 since $6^{(101-1)/5} = 6^{20} \equiv 1 \pmod{101}$.

6. From Example 5 it follows that 6 is a fifth power residue of 101. The solutions of the congruence $x^5 \equiv 6 \pmod{101}$, the fifth roots of 6, can be found by taking indices to the primitive root 2 modulo 101. Since $\mathrm{ind}_2 6 = 70$, this gives $\mathrm{ind}_2 x^5 \equiv 5 \cdot \mathrm{ind}_2 x \equiv 70 \pmod{100}$. The solutions of this congruence are the integers x with $\mathrm{ind}_2 x \equiv 14 \pmod{20}$. This implies that the fifth roots of 6 are the integers with $\mathrm{ind}_2 x = 14, 34, 54, 74$, and 94. These are the integers x with $x \equiv 22, 70, 85, 96$, and $30 \pmod{101}$.

7. The integer 5 is not a sixth power residue of 17 since $5^{16/\gcd(6,16)} = 5^8 \equiv -1 \pmod{17}$.
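
The Jacobi symbol can be evaluated without factoring by repeatedly applying Facts 14, 15, 20, and 21. The Python sketch below is one common way to organize that computation; the function name `jacobi` and the convention of returning 0 when gcd(a, n) > 1 are assumptions of this example rather than definitions from this section.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for an odd positive integer n; 0 when gcd(a, n) > 1."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be an odd positive integer")
    a %= n                          # Fact 14: depends only on a modulo n
    result = 1
    while a != 0:
        while a % 2 == 0:           # pull out factors of 2 (Facts 15 and 20)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                 # reciprocity law for Jacobi symbols (Fact 21)
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


print(jacobi(11, 19), jacobi(2, 45), jacobi(5, 21))          # Examples 2, 3, 4
print([a for a in range(1, 11) if jacobi(a, 11) == 1])       # residues from Example 1
# Euler's criterion (Fact 2) for p = 11: a^5 mod 11 is 1 for residues, 10 otherwise
print([pow(a, 5, 11) for a in range(1, 11)])
print(any(x * x % 21 == 5 for x in range(21)))               # is 5 a square modulo 21?
```

For a prime modulus the function returns the Legendre symbol, so it can also serve as a residue test there; for composite n a value of 1 does not settle the question, as Example 4 shows.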


4.7.4 MODULAR SQUARE ROOTS 
Definition: 

If m is a positive integer and a is an integer, then r is a square root of a modulo m if $r^2 \equiv a \pmod{m}$.

Facts: 

1. If p is a prime of the form 4n + 3 and a is a perfect square modulo p, then the two square roots of a modulo p are $\pm a^{(p+1)/4}$.

2. If p is a prime of the form 8n + 5 and a is a perfect square modulo p, then the two square roots of a modulo p are $x \equiv \pm a^{(p+3)/8} \pmod{p}$ if $a^{(p-1)/4} \equiv 1 \pmod{p}$ and $x \equiv \pm 2^{(p-1)/4} a^{(p+3)/8} \pmod{p}$ if $a^{(p-1)/4} \equiv -1 \pmod{p}$.

3. If n is a positive integer that is the product of two distinct odd primes p and q and a is a perfect square modulo n that is relatively prime to n, then there are four distinct square roots of a modulo n. These square roots can be found by finding the two square roots of a modulo p and the two square roots of a modulo q and then using the Chinese remainder theorem to find the four square roots of a modulo n.

4. A square root of an integer a that is a square modulo p, where p is an odd prime, can be found by an algorithm that uses an average of $O((\log_2 p)^3)$ bit operations. (See [MevaVa96].)

5. If n is an odd integer with r distinct prime factors, a is a perfect square modulo n, and gcd(a, n) = 1, then a has exactly $2^r$ incongruent square roots modulo n.

Examples: 

1. Using Legendre symbols it can be shown that 11 is a perfect square modulo 19. Using Fact 1 it follows that the square roots of 11 modulo 19 are given by $x \equiv \pm 11^{(19+1)/4} = \pm 11^5 \equiv \pm 7 \pmod{19}$.

2. There are four incongruent square roots of 860 modulo 11021 = 103 · 107. To find these solutions, first note that $x^2 \equiv 860 \equiv 36 \pmod{103}$ so that $x \equiv \pm 6 \pmod{103}$ and $x^2 \equiv 860 \equiv 4 \pmod{107}$ so that $x \equiv \pm 2 \pmod{107}$. The Chinese remainder theorem can be used to find these square roots. They are $x \equiv -212, -109, 109, 212 \pmod{11021}$.

3 . The square roots of 121 modulo 315 are 11, 74, 101, 151, 164, 214, 241, and 304. 
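
Facts 1 and 3 combine into a short procedure when both primes have the form 4n + 3, as is the case for 19, 103, and 107 in the examples. The following Python sketch illustrates this; the helper names `sqrt_mod_prime_3mod4`, `crt`, and `sqrt_mod_pq` are assumptions of this example, and `pow(m1, -1, m2)` for the modular inverse requires Python 3.8 or later.

```python
def sqrt_mod_prime_3mod4(a, p):
    """Square roots of a modulo a prime p ≡ 3 (mod 4), using a^((p+1)/4) (Fact 1)."""
    r = pow(a, (p + 1) // 4, p)
    if r * r % p != a % p:
        return []                      # a is not a perfect square modulo p
    return sorted({r, (p - r) % p})


def crt(r1, m1, r2, m2):
    """Solve x ≡ r1 (mod m1), x ≡ r2 (mod m2) for relatively prime m1 and m2."""
    inv = pow(m1, -1, m2)
    return (r1 + m1 * ((r2 - r1) * inv % m2)) % (m1 * m2)


def sqrt_mod_pq(a, p, q):
    """Square roots of a modulo p*q for distinct primes p, q ≡ 3 (mod 4) (Fact 3)."""
    return sorted(
        crt(rp, p, rq, q)
        for rp in sqrt_mod_prime_3mod4(a, p)
        for rq in sqrt_mod_prime_3mod4(a, q)
    )


print(sqrt_mod_prime_3mod4(11, 19))                  # Example 1
print(sqrt_mod_pq(860, 103, 107))                    # Example 2, as least positive residues
print([x for x in range(315) if x * x % 315 == 121]) # Example 3 by direct search
```

The roots of Example 2 are printed as residues between 0 and 11020, so −212 and −109 appear as 10809 and 10912. Primes of the form 4n + 1 need a different method (Fact 2 for 8n + 5, or a general algorithm as in Fact 4).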
4.8 DIOPHANTINE EQUATIONS

An important area of number theory is devoted to finding solutions of equations where 
the solutions are restricted to belong to the set of integers, or some other specified 
set, such as the set of rational numbers. An equation with the added proviso that the 
solutions must be integers (or must belong to some other specified countable set, such 
as the set of rational numbers) is called a diophantine equation. This name comes from 
the ancient Greek mathematician Diophantus (ca. 250 A.D.), who wrote extensively on 
such equations. 

Diophantine equations have both practical and theoretical importance. Their practical importance arises when variables in an equation represent quantities of objects, for example. Fermat’s last theorem, which states that there are no nontrivial solutions in integers n > 2, x, y, and z to the diophantine equation $x^n + y^n = z^n$, has long interested mathematicians and non-mathematicians alike. This theorem was proved only in the mid-1990s, even though many brilliant scholars sought a proof during the last three centuries.

More information about diophantine equations can be found in [Di71], [Gu94], and 
[Mo69] . 


4.8.1 LINEAR DIOPHANTINE EQUATIONS 


Definition: 

A linear diophantine equation is an equation of the form $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = c$, where $c, a_1, \ldots, a_n$ are integers and where integer solutions are sought for the unknowns $x_1, x_2, \ldots, x_n$.

Facts: 

1. Let a and b be integers with gcd(a, b) = d. The linear diophantine equation ax + by = c has no solutions if $d \nmid c$. If $d \mid c$, then there are infinitely many solutions in integers. Moreover, if $x = x_0$, $y = y_0$ is a particular solution, then all solutions are given by $x = x_0 + \frac{b}{d} n$, $y = y_0 - \frac{a}{d} n$, where n is an integer.

2. A linear diophantine equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = c$ has solutions in integers if and only if $\gcd(a_1, a_2, \ldots, a_n) \mid c$. In that case, there are infinitely many solutions.

3. A solution $(x_0, y_0)$ of the linear diophantine equation ax + by = c where $\gcd(a, b) \mid c$ can be found by first expressing gcd(a, b) as a linear combination of a and b and then multiplying by c/gcd(a, b). (See §4.1.2.)

4. A linear diophantine equation $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = c$ in n variables can be solved by a reduction method. To find a particular solution, first let $b = \gcd(a_2, \ldots, a_n)$ and let $(x_1, y)$ be a solution of the diophantine equation $a_1 x_1 + b y = c$. Iterate this procedure on the diophantine equation in n − 1 variables, $a_2 x_2 + a_3 x_3 + \cdots + a_n x_n = by$, until an equation in two variables is obtained.

5. The solution to a system of r linear diophantine equations in n variables is obtained by using Gaussian elimination (§6.5.1) to reduce to a single diophantine equation in two or more variables.

6. If a and b are relatively prime positive integers and n is a positive integer, then the diophantine equation ax + by = n has a nonnegative integer solution if $n \ge (a - 1)(b - 1)$.

7. If a and b are relatively prime positive integers, then there are exactly (a − 1)(b − 1)/2 nonnegative integers n less than ab − a − b such that the equation ax + by = n has a nonnegative solution.

8. If a and b are relatively prime positive integers, then there are no nonnegative solutions of ax + by = ab − a − b.

Examples: 

1. To solve the linear diophantine equation 17x + 13y = 100, express gcd(17, 13) = 1 as a linear combination of 17 and 13. Using the steps of the Euclidean algorithm, it follows that 4 · 13 − 3 · 17 = 1. Multiplying by 100 yields 100 = 400 · 13 − 300 · 17, so x = −300, y = 400 is a particular solution. All solutions are given by x = −300 + 13t, y = 400 − 17t, where t ranges over the set of integers.

2. A traveler has exactly $510 in traveler’s checks where each check is either a $20 or a $50 check. How many checks of each denomination can there be?

The solution to this question is given by the set of solutions in nonnegative integers to the linear diophantine equation 20x + 50y = 510. There are infinitely many solutions in integers, which can be shown to be given by x = −102 + 5n, y = 51 − 2n. Since both x and y must be nonnegative, it follows that n = 21, 22, 23, 24, or 25. Therefore there are 3 $20 checks and 9 $50 checks, 8 $20 checks and 7 $50 checks, 13 $20 checks and 5 $50 checks, 18 $20 checks and 3 $50 checks, or 23 $20 checks and 1 $50 check.

3. To find a particular solution of the linear diophantine equation $12x_1 + 21x_2 + 9x_3 + 15x_4 = 9$, which has infinitely many solutions since gcd(12, 21, 9, 15) = 3, which divides 9, first divide both sides of the equation by 3 to get $4x_1 + 7x_2 + 3x_3 + 5x_4 = 3$. Now 1 = gcd(7, 3, 5), so solve $4x_1 + 1y = 3$, as in Example 1, to get $x_1 = 1$, $y = -1$. Next solve $7x_2 + 3x_3 + 5x_4 = -1$. Since 1 = gcd(3, 5), solve $7x_2 + 1z = -1$ to get $x_2 = 1$, $z = -8$. Finally, solve $3x_3 + 5x_4 = -8$ to get $x_3 = -1$, $x_4 = -1$.

4. To solve the following system of linear diophantine equations in integers:

x + y + z + w = 100
x + 2y + 3z + 4w = 300
x + 4y + 9z + 16w = 1000,

first reduce the system by elimination to:

x + y + z + w = 100
y + 2z + 3w = 200
2z + 6w = 300.

The solution to the last equation is z = 150 + 3t, w = −t, where t is an integer. Back-substitution gives

y = 200 − 2(150 + 3t) − 3(−t) = −100 − 3t
x = 100 − (−100 − 3t) − (150 + 3t) − (−t) = 50 + t.
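
The method of Fact 3 (write gcd(a, b) as a linear combination, then scale) is short to implement with the extended Euclidean algorithm. The Python sketch below is an illustration; the function names `extended_gcd`, `solve_linear`, and `nonnegative_solutions` are assumptions of this example, and positive coefficients a and b are assumed.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def solve_linear(a, b, c):
    """Solve a*x + b*y = c in integers.

    Returns (x0, y0, dx, dy) so that every solution is (x0 + dx*n, y0 - dy*n)
    for an integer n (Fact 1), or None when gcd(a, b) does not divide c.
    """
    g, s, t = extended_gcd(a, b)
    if c % g != 0:
        return None
    k = c // g
    return s * k, t * k, b // g, a // g


def nonnegative_solutions(a, b, c):
    """All solutions of a*x + b*y = c with x, y >= 0, found by scanning x."""
    return [(x, (c - a * x) // b) for x in range(c // a + 1) if (c - a * x) % b == 0]


print(solve_linear(17, 13, 100))            # particular solution and step sizes for Example 1
print(nonnegative_solutions(20, 50, 510))   # the check combinations of Example 2
print(solve_linear(20, 50, 505))            # None: gcd(20, 50) = 10 does not divide 505
```

Different runs of the Euclidean algorithm can produce different particular solutions; all of them describe the same solution set once the parameter n is allowed to vary.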


4.8.2 PYTHAGOREAN TRIPLES 
Definitions: 

A Pythagorean triple is a solution (x, y, z) of the equation $x^2 + y^2 = z^2$ where x, y, and z are positive integers.

A Pythagorean triple is primitive if gcd(x, y, z) = 1.
Facts:

1 . Pythagorean triples represent the lengths of sides of right triangles. 

2. All primitive Pythagorean triples are given by

$x = 2mn$, $y = m^2 - n^2$, $z = m^2 + n^2$

where m and n are relatively prime positive integers of opposite parity with m > n.


3. All Pythagorean triples can be found by taking

$x = 2mnt$, $y = (m^2 - n^2)t$, $z = (m^2 + n^2)t$

where t is a positive integer and m and n are as in Fact 2.


4. Given a primitive Pythagorean triple (x, y, z) with y odd, m and n from Fact 2 can be found by taking $m = \sqrt{\frac{z + y}{2}}$ and $n = \sqrt{\frac{z - y}{2}}$.


5. The following table lists all Pythagorean triples with z < 100. 
6. The solutions of the diophantine equation x^2 + y^2 = 2z^2 can be obtained by 
transforming this equation into ((x + y)/2)^2 + ((x - y)/2)^2 = z^2, which shows that 
((x + y)/2, (x - y)/2, z) is a Pythagorean triple. All solutions are given by 
x = (m^2 - n^2 + 2mn)t, y = (m^2 - n^2 - 2mn)t, z = (m^2 + n^2)t where m, n, and t are integers. 

7. The solutions of the diophantine equation x^2 + 2y^2 = z^2 are given by x = (m^2 - 2n^2)t, 
y = 2mnt, z = (m^2 + 2n^2)t where m, n, and t are positive integers. 

8. The solutions of the diophantine equation x^2 + y^2 + z^2 = w^2 where y and z are 
even are given by x = (m^2 + n^2 - r^2)/r, y = 2m, z = 2n, w = (m^2 + n^2 + r^2)/r, where m and n 
are positive integers and r runs through the divisors of m^2 + n^2 less than (m^2 + n^2)^(1/2). 

9. The solutions of the diophantine equation x^2 + y^2 = z^2 + w^2, with x > z, are given 
by x = (mr + ns)/2, y = (ms - nr)/2, z = (mr - ns)/2, w = (ms + nr)/2, where if m and n are 
both odd, then r and s are either both odd or both even. 
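The parametrization in Fact 2 translates directly into a generator of primitive triples. The sketch below is a minimal illustration under that parametrization (the function name and the bound argument are hypothetical choices, not Handbook notation); it lists the primitive triples with z < 100, with x the even leg.

```python
from math import gcd


def primitive_triples(z_bound):
    """Yield primitive triples (x, y, z) with z < z_bound.

    Uses x = 2mn, y = m^2 - n^2, z = m^2 + n^2 with gcd(m, n) = 1,
    m > n >= 1, and m, n of opposite parity.
    """
    m = 2
    while m * m + 1 < z_bound:          # smallest z for this m is m^2 + 1
        for n in range(1, m):
            if (m - n) % 2 == 1 and gcd(m, n) == 1:
                z = m * m + n * n
                if z < z_bound:
                    yield (2 * m * n, m * m - n * n, z)
        m += 1


triples = sorted(primitive_triples(100), key=lambda t: (t[2], t[0]))
```

Multiplying each primitive triple by t = 1, 2, 3, ... as in Fact 3 then gives the non-primitive triples below the same bound.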


4.8.3 FERMAT’S LAST THEOREM 
Definitions: 

The Fermat equation is the diophantine equation x^n + y^n = z^n where x, y, z are 
integers and n is a positive integer greater than 2. 

A nontrivial solution to the Fermat equation x^n + y^n = z^n is a solution in integers x, y, 
and z where none of x, y, and z are zero. 

Let p be an odd prime and let K = Q(θ) be the degree-p cyclotomic extension of the 
rational numbers (§5.6.2). If p does not divide the class number of K (see [Co93]), then p 
is said to be regular. Otherwise p is irregular. 

Facts: 

1. Fermat’s last theorem: The statement that the diophantine equation x^n + y^n = z^n 
has no nontrivial solutions in the positive integers for n ≥ 3 is called Fermat’s last 
theorem. The statement was made more than 300 years ago by Pierre de Fermat (1601-1665) 
and resisted proof until recently. 

2. Fermat wrote in the margin of his copy of the works of Diophantus, next to the 
discussion of the equation x^2 + y^2 = z^2, the following: “However, it is impossible to 
write a cube as the sum of two cubes, a fourth power as the sum of two fourth powers and 
in general any power the sum of two similar powers. For this I have discovered a truly 
wonderful proof, but the margin is too small to contain it.” In spite of this quotation, 
no proof was found of this statement until 1994, even though many mathematicians 
actively worked on finding such a proof. Most mathematicians would find it shocking if 
Fermat actually had found a proof. 

3. Fermat’s last theorem was finally proved in 1995 by Andrew Wiles [Wi95]. Wiles 
collected the Wolfskehl Prize, worth approximately $50,000, in 1997 for this proof. 

4. That there are no nontrivial solutions of the Fermat equation for n = 4 was demonstrated 
by Fermat with an elementary proof using the method of infinite descent. This 
method proceeds by showing that for every solution in positive integers, there is a 
solution in which each of the integers x, y, and z is smaller, contradicting 
the well-ordering property of the set of positive integers. 

5. The method of infinite descent invented by Fermat can be used to show that the more 
general diophantine equation x^4 + y^4 = z^2 has no nontrivial solutions in integers x, y, 
and z. 
6. The diophantine equation x^4 - y^4 = z^2 has no nontrivial solutions, as can be shown 
using the method of infinite descent. 

7. The sum of two cubes may equal the sum of two other cubes. That is, there are 
nontrivial solutions of the diophantine equation x^3 + y^3 = z^3 + w^3. The smallest solution 
is x = 1, y = 12, z = 9, w = 10. 

8. The sum of three cubes may also be a cube. In fact, solutions of x^3 + y^3 + z^3 = w^3 
are given by x = 3a^2 + 5b(a - b), y = 4a(a - b) + 6b^2, z = 5a(a - b) - 3b^2, w = 6a^2 - 4b(a - b) 
where a and b are integers. 

9. Euler conjectured that there were four fourth powers of positive integers whose sum 
is also the fourth power of an integer. In other words, he conjectured that there are 
nontrivial solutions to the diophantine equation v^4 + w^4 + x^4 + y^4 = z^4. The first such 
example was found in 1911 when it was discovered (by R. Norrie) that 30^4 + 120^4 + 
272^4 + 315^4 = 353^4. 

10. Euler also conjectured that the sum of the fourth powers of three positive integers 
can never be the fourth power of an integer and that the sum of fifth powers of four 
positive integers can never be the fifth power of an integer, and so on. In other words, 
he conjectured that there were no nontrivial solutions to the diophantine equations 
w^4 + x^4 + y^4 = z^4, v^5 + w^5 + x^5 + y^5 = z^5, and so on. He was mistaken. The 
smallest counterexamples known are 95,800^4 + 217,519^4 + 414,560^4 = 422,481^4 and 
27^5 + 84^5 + 110^5 + 133^5 = 144^5. 

11. If n = mp for some integer m and p is prime, then the Fermat equation can be 
rewritten as (x^m)^p + (y^m)^p = (z^m)^p. Since the only positive integers greater than 2 
without an odd prime factor are powers of 2 and x^4 + y^4 = z^4 has no nontrivial solutions 
in integers, Fermat’s last theorem can be demonstrated by showing that x^p + y^p = z^p 
has no nontrivial solutions in integers x, y, and z when p is an odd prime. 

12. An odd prime p is regular if and only if it does not divide the numerator of any of 
the numbers B_2, B_4, ..., B_{p-3}, where B_k is the kth Bernoulli number. (See §3.1.4.) 

13 . There is a relatively simple proof of Fermat’s last theorem for exponents that are 
regular primes. 

14. The smallest irregular primes are 37, 59, 67, 101, 103, 131, 149, and 157. 

15. Wiles’ proof of Fermat’s last theorem is based on the theory of elliptic curves. 
The proof is based on relating to integers a, b, c, and n that supposedly satisfy the 
Fermat equation a^n + b^n = c^n the elliptic curve y^2 = x(x + a^n)(x - b^n) (called the 
associated Frey curve) and deriving a contradiction using sophisticated results from 
the theory of elliptic curves. (See Wiles’ original proof [Wi95], the popular account 
[Si97], and http://www.best.com/cgd/home/flt/flt01.htm (The Mathematics of 
Fermat’s Last Theorem) and http://www.pbs.org/wgbh/nova/proof/ (NOVA Online 
| The Proof) for more details.) 
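The numerical identities quoted in Facts 7 to 10 are easy to confirm with exact integer arithmetic. The Python sketch below is illustrative only; the names are hypothetical, and the cubic family is coded with w = 6a^2 - 4b(a - b), which is the form of the fourth term for which the identity x^3 + y^3 + z^3 = w^3 checks out.

```python
def cubic_family(a, b):
    """Return (x, y, z, w) from the two-parameter family of Fact 8."""
    x = 3 * a * a + 5 * b * (a - b)
    y = 4 * a * (a - b) + 6 * b * b
    z = 5 * a * (a - b) - 3 * b * b
    w = 6 * a * a - 4 * b * (a - b)   # assumed sign: (a - b), not (a + b)
    return x, y, z, w


def is_cubic_solution(x, y, z, w):
    """True when x^3 + y^3 + z^3 == w^3."""
    return x ** 3 + y ** 3 + z ** 3 == w ** 3


# Check the family on a grid of parameters.
family_ok = all(
    is_cubic_solution(*cubic_family(a, b))
    for a in range(-6, 7)
    for b in range(-6, 7)
)

# The explicit examples from Facts 7, 9, and 10.
two_cubes_ok = 1 ** 3 + 12 ** 3 == 9 ** 3 + 10 ** 3
four_fourth_powers_ok = 30 ** 4 + 120 ** 4 + 272 ** 4 + 315 ** 4 == 353 ** 4
three_fourth_powers_ok = 95800 ** 4 + 217519 ** 4 + 414560 ** 4 == 422481 ** 4
four_fifth_powers_ok = 27 ** 5 + 84 ** 5 + 110 ** 5 + 133 ** 5 == 144 ** 5
```

For example, a = 1, b = 0 gives 3^3 + 4^3 + 5^3 = 6^3, and a = 2, b = 1 gives 17^3 + 14^3 + 7^3 = 20^3.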


4.8.4 PELL’S, BACHET’S, AND CATALAN’S EQUATIONS 
Definitions: 

Pell’s equation is a diophantine equation of the form x^2 - dy^2 = 1, where d is a 
squarefree positive integer. This diophantine equation is named after John Pell (1611-1685). 

Bachet’s equation is a diophantine equation of the form y^2 = x^3 + k. This diophantine 
equation is named after Claude Gaspar Bachet (1587-1638). 

Catalan’s equation is the diophantine equation x^m - y^n = 1, where a solution is 
sought with integers x > 0, y > 0, m > 1, and n > 1. This diophantine equation is 
named after Eugene Charles Catalan (1814-1894). 

Facts: 

1. If x, y is a solution to the diophantine equation x^2 - dy^2 = n with d squarefree and 
n^2 < d, then the rational number x/y is a convergent of the simple continued fraction for 
√d. (See §4.9.2.) 

2. An equation of the form ax'^2 + bx' + c = y'^2 can be transformed by means of the 
relations x = 2ax' + b and y = 2y' into an equation of the form x^2 - dy^2 = n, where 
n = b^2 - 4ac and d = a. 

3 . It is ironic that John Pell apparently had little to do with finding the solutions to 
the diophantine equation x 2 — dy 2 = 1. Euler gave this equation its name following a 
mistaken reference. Fermat conjectured an infinite number of solutions to this equation 
in 1657; this was eventually proved by Lagrange in 1768. 

4. Let x, y be the least positive solution to x^2 - dy^2 = 1, with d squarefree. Then every 
positive solution is given by 

x_k + y_k √d = (x + y√d)^k 
where k ranges over the positive integers. 

5. Table 1 gives the smallest positive solutions to Pell’s equation x^2 - dy^2 = 1 with d 
a squarefree positive integer less than 100. 

6. If k = 0, then the formulae x = t^2, y = t^3 give an infinite number of solutions to the 
Bachet equation y^2 = x^3 + k. 

7. There are no solutions to Bachet’s equation for the following values of k: -144, 
-105, -78, -69, -42, -34, -33, -31, -24, -14, -5, 7, 11, 23, 34, 45, 58, 70. 

8. The following table lists solutions to Bachet’s equation for various values of k: 

k      x 
0      t^2 (t any integer) 
1      0, -1, 2 
17     -1, -2, 2, 4, 8, 43, 52, 5234 
-2     3 
-4     2, 5 
-7     2, 32 
-15    4 


9. If k < 0, k is squarefree, k ≡ 2 or 3 (mod 4), and the class number of the field 
Q(√-k) is not a multiple of 3, then the only solution of the Bachet equation y^2 = x^3 + k 
for x is given by whichever of -(4k ± 1)/3 is an integer. The first few values of such k 
are 1, 2, 5, 6, 10, 13, 14, 17, 21, and 22. 

10. Solutions to the Catalan equation give consecutive integers that are powers of 
integers. 

11. The Catalan equation has the solution x = 3, y = 2, m = 2, n = 3, so 8 = 2^3 and 
9 = 3^2 are consecutive powers of integers. The Catalan conjecture is that this is the 
only solution. 

12. Levi ben Gerson showed in the 14th century that 8 and 9 are the only consecutive 
powers of 2 and 3, so that the only solution of 3^m - 2^n = ±1 in integers m > 1 and n > 1 is 
m = 2 and n = 3. 
Table 1 Smallest positive solutions to Pell’s equation x^2 - dy^2 = 1 with d 
squarefree, d < 100. 

d    x        y        d    x                y 
2    3        2        51   50               7 
3    2        1        53   66,249           9,100 
5    9        4        55   89               12 
6    5        2        57   151              20 
7    8        3        58   19,603           2,574 
10   19       6        59   530              69 
11   10       3        61   1,766,319,049    226,153,980 
13   649      180      62   63               8 
14   15       4        65   129              16 
15   4        1        66   65               8 
17   33       8        67   48,842           5,967 
19   170      39       69   7,775            936 
21   55       12       70   251              30 
22   197      42       71   3,480            413 
23   24       5        73   2,281,249        267,000 
26   51       10       74   3,699            430 
29   9,801    1,820    77   351              40 
30   11       2        78   53               6 
31   1,520    273      79   80               9 
33   23       4        82   163              18 
34   35       6        83   82               9 
35   6        1        85   285,769          30,996 
37   73       12       86   10,405           1,122 
38   37       6        87   28               3 
39   25       4        89   500,001          53,000 
41   2,049    320      91   1,574            165 
42   13       2        93   12,151           1,260 
43   3,482    531      94   2,143,295        221,064 
46   24,335   3,588    95   39               4 
47   48       7        97   62,809,633       6,377,352 



13. Euler proved that the only solution in positive integers of x^3 - y^2 = ±1 is x = 2 
and y = 3. 

14. Lebesgue showed in 1850 that x^m - y^2 = 1 has no solutions in positive integers 
when m is an integer greater than 3. 

15. The diophantine equations x^3 - y^n = 1 and x^m - y^3 = 1 with m > 2 were shown to 
have no solutions in positive integers in 1921, and in 1964 it was shown that x^2 - y^n = 1 
has no solutions in positive integers other than 3^2 - 2^3 = 1. 

16. R. Tijdeman showed in 1976 that there are only finitely many solutions in integers 
to the Catalan equation x^m - y^n = 1 by showing that there is a computable constant C 
such that for every solution, x^m < C and y^n < C. However, the enormous size of the 
constant C makes it infeasible to establish the Catalan conjecture using computers. 
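Facts 1 and 4 together suggest a way to compute Pell solutions: walk through the convergents of √d until one satisfies x^2 - dy^2 = 1, then generate further solutions by multiplying by x + y√d. The Python sketch below is one possible implementation under those assumptions (function names are hypothetical); it uses the standard integer recurrence for the complete quotients of √d, so no floating-point arithmetic is involved.

```python
from math import isqrt


def pell_fundamental(d):
    """Least positive (x, y) with x^2 - d*y^2 = 1, for a positive non-square d."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, den, a = 0, 1, a0            # complete quotient (sqrt(d) + m) / den
    h_prev, h = 1, a0               # convergent numerators p_{-1}, p_0
    k_prev, k = 0, 1                # convergent denominators q_{-1}, q_0
    while h * h - d * k * k != 1:
        m = a * den - m
        den = (d - m * m) // den
        a = (a0 + m) // den
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
    return h, k


def pell_solutions(d, count):
    """First `count` positive solutions, from powers of the fundamental one (Fact 4)."""
    x1, y1 = pell_fundamental(d)
    x, y = x1, y1
    out = []
    for _ in range(count):
        out.append((x, y))
        x, y = x1 * x + d * y1 * y, x1 * y + y1 * x
    return out
```

For d = 13, `pell_solutions(13, 2)` reproduces the two solutions worked out in Example 1 of this section, (649, 180) and (842401, 233640).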
Examples: 


1. To solve the diophantine equation x^2 - 13y^2 = 1, note that the simple continued 
fraction for √13 is [3; \overline{1, 1, 1, 1, 6}], with convergents 3, 4, 7/2, 11/3, 18/5, 
119/33, 137/38, 256/71, 393/109, 649/180, .... 

The smallest positive solution to the equation is x = 649, y = 180. A second solution is 
given by (649 + 180√13)^2 = 842,401 + 233,640√13, that is, x = 842,401, y = 233,640. 


2. Congruence considerations can be used to show that there are no solutions of 
Bachet’s equation for k = 7. Modulo 8, every square is congruent to 0, 1, or 4; therefore 
if x is even, then y^2 ≡ 7 (mod 8), a contradiction. Likewise if x ≡ 3 (mod 4), then 
y^2 ≡ 2 (mod 4), also impossible. So assume that x ≡ 1 (mod 4). Add one to both sides 
and factor to get y^2 + 1 = x^3 + 8 = (x + 2)(x^2 - 2x + 4). Now x^2 - 2x + 4 ≡ 3 (mod 4), 
so it must have a prime divisor p ≡ 3 (mod 4). Then y^2 ≡ -1 (mod p), which implies 
that -1 is a quadratic residue modulo p. (See §4.4.5.) But p ≡ 3 (mod 4), so -1 cannot 
be a quadratic residue modulo p. Therefore, there are no solutions when k = 7. 
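A bounded search gives a quick sanity check on statements about Bachet’s equation, although it can never replace an argument such as the one in Example 2. The following Python sketch (the function name and the search bound are hypothetical choices) lists the integers x up to a bound for which x^3 + k is a perfect square.

```python
from math import isqrt


def bachet_x_values(k, x_max):
    """Integers x <= x_max such that x^3 + k is a perfect square.

    The scan starts at -(|k| + 1); below that x^3 + k is negative.
    """
    found = []
    for x in range(-(abs(k) + 1), x_max + 1):
        n = x ** 3 + k
        if n >= 0 and isqrt(n) ** 2 == n:
            found.append(x)
    return found
```

Within the bounds tried, k = 7 yields no x at all, in agreement with Example 2, while k = 17 yields x = -2, -1, 2, 4, 8, 43, 52, 5234 and k = -15 yields x = 4.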


4.8.5 SUMS OF SQUARES AND WARING’S PROBLEM 
Definitions: 

If k is a positive integer, then g(k) is the smallest positive integer such that every 
positive integer can be written as a sum of g(k) kth powers. 

If k is a positive integer, then G(k) is the smallest positive integer such that every 
sufficiently large positive integer can be written as a sum of G(k) kth powers. 

The determination of g(k) is called Waring’s problem. (Edward Waring, 1741-1793) 

Facts: 

1. A positive integer n is the sum of two squares if and only if each prime factor of n 
of the form 4k + 3 appears to an even power in the prime factorization of n. 

2. If m = a^2 + b^2 and n = c^2 + d^2, then the number mn can be expressed as the sum 
of two squares as follows: mn = (ac + bd)^2 + (ad - bc)^2. 

3. If n is representable as the sum of two squares, then it is representable in 4(d_1 - d_3) 
ways (where the order of the squares and their signs matter), where d_1 is the number of 
divisors of n of the form 4k + 1 and d_3 is the number of divisors of n of the form 4k + 3. 

4. A positive integer n is the sum of three squares if and only if n is not of the form 
4^m (8k + 7), where m and k are nonnegative integers. 

5. The positive integers less than 100 that are not the sum of three squares are 
7, 15, 23, 28, 31, 39, 47, 55, 60, 63, 71, 79, 87, 92, and 95. 

6. Lagrange’s four-square theorem: Every positive integer is the sum of 4 squares, 
some of which may be zero. (Joseph Lagrange, 1736-1813) 

7. A useful lemma due to Lagrange is the following. If m = a^2 + b^2 + c^2 + d^2 and 
n = e^2 + f^2 + g^2 + h^2, then mn can be expressed as the sum of four squares as follows: 
mn = (ae + bf + cg + dh)^2 + (af - be + ch - dg)^2 + (ag - ce + df - bh)^2 + (ah - de + bg - cf)^2. 

8. The number of ways n can be written as the sum of four squares is 8(s - s_4), where s 
is the sum of the divisors of n and s_4 is the sum of the divisors of n that are divisible 
by 4. 

9. It is known that g(k) always exists. 
10. For 6 ≤ k ≤ 471,600,000 the following formula holds except possibly for a finite 
number of positive integers k: g(k) = ⌊(3/2)^k⌋ + 2^k - 2 where ⌊x⌋ represents the floor 
(greatest integer) function. 

11. The exact value of G(k) is known only for two values of k, G(2) = 4 and G(4) = 16. 

12. From Lagrange’s results above it follows that G(2) = g(2) = 4. 

13. If k is an integer with k ≥ 2, then G(k) ≤ g(k). 

14. If k is an integer with k ≥ 2, then G(k) ≥ k + 1. 

15. Hardy and Littlewood showed that G(k) ≤ (k - 2)2^(k-1) + 5 and conjectured that 
G(k) ≤ 2k + 1 when k is not a power of 2 and G(k) ≤ 4k when k is a power of 2. 

16. The best upper bound known for G(k) is G(k) ≤ ck ln k for some constant c. 

17. The known values and established estimates for g(k) and G(k) for 2 ≤ k ≤ 8 are 
given in the following table. 


g(2) = 4                    G(2) = 4 
g(3) = 9                    4 ≤ G(3) ≤ 7 
g(4) = 19                   G(4) = 16 
g(5) = 37                   6 ≤ G(5) ≤ 18 
g(6) = 73                   9 ≤ G(6) ≤ 27 
143 ≤ g(7) ≤ 3,806          8 ≤ G(7) ≤ 36 
279 ≤ g(8) ≤ 36,119         32 ≤ G(8) ≤ 42 


18. There are many related diophantine equations concerning sums and differences of 
powers. For instance x = 1, y = 12, z = 9, and w = 10 is the smallest solution to 
x^3 + y^3 = z^3 + w^3. 
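Several of the facts above lend themselves to direct numerical checks. The Python sketch below is an illustration only (all names are hypothetical): it recomputes the list in Fact 5 by brute force and compares it with the criterion of Fact 4, forms the four-square product of Fact 7, and evaluates the expression ⌊(3/2)^k⌋ + 2^k - 2 from Fact 10 with exact integer arithmetic.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute force: is n = a^2 + b^2 + c^2 with integers 0 <= a <= b and c >= 0?"""
    for a in range(isqrt(n) + 1):
        for b in range(a, isqrt(n - a * a) + 1):
            rest = n - a * a - b * b
            c = isqrt(rest)
            if c * c == rest:
                return True
    return False


def has_excluded_form(n):
    """True when n = 4^m (8k + 7) for nonnegative integers m, k (n >= 1)."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


def four_square_product(first, second):
    """The four terms whose squares sum to m*n in the lemma of Fact 7."""
    a, b, c, d = first
    e, f, g, h = second
    return (
        a * e + b * f + c * g + d * h,
        a * f - b * e + c * h - d * g,
        a * g - c * e + d * f - b * h,
        a * h - d * e + b * g - c * f,
    )


def g_formula(k):
    """floor((3/2)^k) + 2^k - 2, computed with integers only."""
    return 3 ** k // 2 ** k + 2 ** k - 2


not_three_squares = [n for n in range(1, 100) if not is_sum_of_three_squares(n)]
```

The formula values for k = 2, ..., 6 are 4, 9, 19, 37, 73, matching the g(k) column of the table in Fact 17, and for k = 7, 8 they equal the lower bounds 143 and 279 listed there.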


4.9 DIOPHANTINE APPROXIMATION 

Diophantine approximation is the study of how closely a number θ can be approximated 
by numbers of some particular kind. Usually θ is an irrational (real) number, and the 
goal is to approximate θ using rational numbers p/q. 


4.9.1 CONTINUED FRACTIONS 
Definitions: 

A continued fraction is a (finite or infinite) expression of the form 

a_0 + 1/(a_1 + 1/(a_2 + 1/(a_3 + ...))) 

The terms a_0, a_1, ... are called the partial quotients. If the partial quotients are all 
integers, and a_i ≥ 1 for i ≥ 1, then the continued fraction is said to be simple. For 
convenience, the above expression is usually abbreviated as [a_0, a_1, a_2, a_3, ...]. 
Algorithm 1: The continued fraction algorithm. 

procedure CFA(x: real number) 
i := 0 
x_0 := x 
a_0 := ⌊x_0⌋ 
output(a_0) 
while (x_i ≠ a_i) 
begin 
   x_{i+1} := 1/(x_i - a_i) 
   i := i + 1 
   a_i := ⌊x_i⌋ 
   output(a_i) 
end 
{returns finite or infinite sequence (a_0, a_1, ...)} 
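Algorithm 1 can be transcribed almost line for line into Python. The sketch below is one possible rendering, not the Handbook’s own code: the function name and the `max_terms` cutoff (added so that irrational inputs terminate) are assumptions. With `fractions.Fraction` input the arithmetic is exact and the loop stops by itself for rational numbers.

```python
from fractions import Fraction
from math import floor


def cfa(x, max_terms=50):
    """Partial quotients of x following Algorithm 1.

    Exact for Fraction input; for float input only the first few
    terms are reliable because rounding errors grow at every step.
    """
    a = floor(x)
    terms = [a]
    while x != a and len(terms) < max_terms:
        x = 1 / (x - a)      # x_{i+1} := 1 / (x_i - a_i)
        a = floor(x)         # a_{i+1} := floor(x_{i+1})
        terms.append(a)
    return terms
```

For example, `cfa(Fraction(62, 23))` returns `[2, 1, 2, 3, 2]`, the expansion of 62/23.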


A continued fraction that has an expansion with a block that repeats after some point 
is called ultimately periodic. The ultimately periodic continued fraction expansion 
[a_0, a_1, ..., a_N, a_{N+1}, ..., a_{N+k}, a_{N+1}, ..., a_{N+k}, a_{N+1}, ...] is often abbreviated as 
[a_0, a_1, ..., a_N, \overline{a_{N+1}, ..., a_{N+k}}]. The terms a_0, a_1, ..., a_N are called the pre-period 
and the terms a_{N+1}, a_{N+2}, ..., a_{N+k} are called the period. 

Facts: 

1. Every irrational number has a unique expansion as a simple continued fraction. 

2. Every rational number has exactly two simple continued fraction expansions, one 
with an odd number of terms and one with an even number of terms. Of these, the one 
with the larger number of terms ends with 1. 

3. The simple continued fraction for a real number r is finite if and only if r is rational. 

4. The simple continued fraction for a real number r is infinite and ultimately periodic 
if and only if r is a quadratic irrational. 

5. The simple continued fraction for √d, where d is a positive integer that is not a 
square, is as follows: √d = [a_0, \overline{a_1, a_2, ..., a_n, 2a_0}], where the sequence 
(a_1, a_2, ..., a_n) is a palindrome. 

6. The following table illustrates the three types of continued fractions. 


type                                     kind of number                               example 
finite                                   rational                                     355/113 = [3, 7, 16] 
ultimately periodic                      quadratic irrational                         √2 = [1, 2, 2, 2, ...] 
infinite, but not ultimately periodic    neither rational nor quadratic irrational    π = [3, 7, 15, 1, 292, ...] 


7 . The continued fraction for a real number can be computed by Algorithm 1. 

8. Continued fractions for √d, for 2 ≤ d ≤ 100, are given in Table 1. 

9 . Continued fraction expansions for certain quadratic irrationals are given in Table 2. 

10. Continued fraction expansions for some famous numbers are given in Table 3. 
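For √d the expansion can be computed with integers only, which avoids the rounding problems of applying Algorithm 1 to a floating-point value. The Python sketch below is an assumption-laden illustration (the function name is hypothetical): it uses the standard recurrence for the complete quotients (√d + m)/q and stops when the partial quotient 2a_0 appears, which by Fact 5 marks the end of the period.

```python
from math import isqrt


def sqrt_continued_fraction(d):
    """Return (a0, period) with sqrt(d) = [a0, overline(period)] for non-square d > 0."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, q, a = 0, 1, a0
    period = []
    while a != 2 * a0:               # the period ends with the term 2*a0
        m = a * q - m
        q = (d - m * m) // q
        a = (a0 + m) // q
        period.append(a)
    return a0, period
```

For instance, d = 6 gives `(2, [2, 4])` and d = 13 gives `(3, [1, 1, 1, 1, 6])`; dropping the last term of each period leaves a palindrome, as Fact 5 states.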
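Table 1 Continued fractions for √d, 2 ≤ d ≤ 100. 

d    √d                                         d    √d 
2    [1,\overline{2}]                           53   [7,\overline{3,1,1,3,14}] 
3    [1,\overline{1,2}]                         54   [7,\overline{2,1,6,1,2,14}] 
5    [2,\overline{4}]                           55   [7,\overline{2,2,2,14}] 
6    [2,\overline{2,4}]                         56   [7,\overline{2,14}] 
7    [2,\overline{1,1,1,4}]                     57   [7,\overline{1,1,4,1,1,14}] 
8    [2,\overline{1,4}]                         58   [7,\overline{1,1,1,1,1,1,14}] 
10   [3,\overline{6}]                           59   [7,\overline{1,2,7,2,1,14}] 
11   [3,\overline{3,6}]                         60   [7,\overline{1,2,1,14}] 
12   [3,\overline{2,6}]                         61   [7,\overline{1,4,3,1,2,2,1,3,4,1,14}] 
13   [3,\overline{1,1,1,1,6}]                   62   [7,\overline{1,6,1,14}] 
14   [3,\overline{1,2,1,6}]                     63   [7,\overline{1,14}] 
15   [3,\overline{1,6}]                         65   [8,\overline{16}] 
17   [4,\overline{8}]                           66   [8,\overline{8,16}] 
18   [4,\overline{4,8}]                         67   [8,\overline{5,2,1,1,7,1,1,2,5,16}] 
19   [4,\overline{2,1,3,1,2,8}]                 68   [8,\overline{4,16}] 
20   [4,\overline{2,8}]                         69   [8,\overline{3,3,1,4,1,3,3,16}] 
21   [4,\overline{1,1,2,1,1,8}]                 70   [8,\overline{2,1,2,1,2,16}] 
22   [4,\overline{1,2,4,2,1,8}]                 71   [8,\overline{2,2,1,7,1,2,2,16}] 
23   [4,\overline{1,3,1,8}]                     72   [8,\overline{2,16}] 
24   [4,\overline{1,8}]                         73   [8,\overline{1,1,5,5,1,1,16}] 
26   [5,\overline{10}]                          74   [8,\overline{1,1,1,1,16}] 
27   [5,\overline{5,10}]                        75   [8,\overline{1,1,1,16}] 
28   [5,\overline{3,2,3,10}]                    76   [8,\overline{1,2,1,1,5,4,5,1,1,2,1,16}] 
29   [5,\overline{2,1,1,2,10}]                  77   [8,\overline{1,3,2,3,1,16}] 
30   [5,\overline{2,10}]                        78   [8,\overline{1,4,1,16}] 
31   [5,\overline{1,1,3,5,3,1,1,10}]            79   [8,\overline{1,7,1,16}] 
32   [5,\overline{1,1,1,10}]                    80   [8,\overline{1,16}] 
33   [5,\overline{1,2,1,10}]                    82   [9,\overline{18}] 
34   [5,\overline{1,4,1,10}]                    83   [9,\overline{9,18}] 
35   [5,\overline{1,10}]                        84   [9,\overline{6,18}] 
37   [6,\overline{12}]                          85   [9,\overline{4,1,1,4,18}] 
38   [6,\overline{6,12}]                        86   [9,\overline{3,1,1,1,8,1,1,1,3,18}] 
39   [6,\overline{4,12}]                        87   [9,\overline{3,18}] 
40   [6,\overline{3,12}]                        88   [9,\overline{2,1,1,1,2,18}] 
41   [6,\overline{2,2,12}]                      89   [9,\overline{2,3,3,2,18}] 
42   [6,\overline{2,12}]                        90   [9,\overline{2,18}] 
43   [6,\overline{1,1,3,1,5,1,3,1,1,12}]        91   [9,\overline{1,1,5,1,5,1,1,18}] 
44   [6,\overline{1,1,1,2,1,1,1,12}]            92   [9,\overline{1,1,2,4,2,1,1,18}] 
45   [6,\overline{1,2,2,2,1,12}]                93   [9,\overline{1,1,1,4,6,4,1,1,1,18}] 
46   [6,\overline{1,3,1,1,2,6,2,1,1,3,1,12}]    94   [9,\overline{1,2,3,1,1,5,1,8,1,5,1,1,3,2,1,18}] 
47   [6,\overline{1,5,1,12}]                    95   [9,\overline{1,2,1,18}] 
48   [6,\overline{1,12}]                        96   [9,\overline{1,3,1,18}] 
50   [7,\overline{14}]                          97   [9,\overline{1,5,1,1,1,1,1,1,5,1,18}] 
51   [7,\overline{7,14}]                        98   [9,\overline{1,8,1,18}] 
52   [7,\overline{4,1,2,1,4,14}]                99   [9,\overline{1,18}] 
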
Table 2 Continued fractions for some special quadratic irrationals. 

d                 continued fraction expansion 
√(n^2 - 1)        [n - 1, \overline{1, 2n - 2}] 
√(n^2 - 2)        [n - 1, \overline{1, n - 2, 1, 2n - 2}] 
√(n^2 + 1)        [n, \overline{2n}] 
√(n^2 + 2)        [n, \overline{n, 2n}] 
√(n^2 - n)        [n - 1, \overline{2, 2n - 2}] 
√(n^2 + n)        [n, \overline{2, 2n}] 
√(4n^2 + 4)       [2n, \overline{n, 4n}] 
√(4n^2 - n)       [2n - 1, \overline{1, 2, 1, 4n - 2}] 
√(4n^2 + n)       [2n, \overline{4, 4n}] 
√(9n^2 + 2n)      [3n, \overline{3, 6n}] 



Table 3 Continued fractions for some famous numbers. (See [Pe54].) 

number          continued fraction expansion 
π               [3,7,15,1,292,1,1,1,2,1,3,1,14,2,1,1,2,2,2,2,1,84,2,1,1,15,3,...] 
γ               [0,1,1,2,1,2,1,4,3,13,5,1,1,8,1,2,4,1,1,40,1,11,3,7,1,7,1,1,5,...] 
2^(1/3)         [1,3,1,5,1,1,4,1,1,8,1,14,1,10,2,1,4,12,2,3,2,1,3,4,1,1,2,14,...] 
log 2           [0,1,2,3,1,6,3,1,1,2,1,1,1,1,3,10,1,1,1,2,1,1,1,1,3,2,3,1,13,7,...] 
e               [2,1,2,1,1,4,1,1,6,1,1,8,1,1,10,1,1,12,...] 
e^(1/n)         [1, n - 1, 1, 1, 3n - 1, 1, 1, 5n - 1, 1, 1, 7n - 1, ...] 
e^(2/(2n+1))    [1, \overline{(6n + 3)k + n, (24n + 12)k + 12n + 6, (6n + 3)k + 5n + 2, 1, 1}_{k≥0}] 
tanh(1/n)       [0, n, 3n, 5n, 7n, ...] 
tan(1/n)        [0, n - 1, 1, 3n - 2, 1, 5n - 2, 1, 7n - 2, 1, 9n - 2, ...] 
(1 + √5)/2      [1,1,1,1,...] 


Examples: 

1. To find the continued fraction representation of 62/23, apply Algorithm 1 to obtain 

62/23 = 2 + 16/23,   23/16 = 1 + 7/16,   16/7 = 2 + 2/7,   7/2 = 3 + 1/2. 

Combining these equations shows that 62/23 = [2, 1, 2, 3, 2]. Since 2 = 1 + 1/1, it also follows 
that 62/23 = [2, 1, 2, 3, 1, 1]. 

2. Applying Algorithm 1 to find the continued fraction of √6, it follows that 

a_0 = ⌊√6⌋ = 2, a_1 = ⌊(√6 + 2)/2⌋ = 2, a_2 = ⌊√6 + 2⌋ = 4, a_3 = a_1, a_4 = a_2, .... 

Hence √6 = [2, \overline{2, 4}]. 


3. The continued fraction expansion of e is e = [2, 1, 2, 1, 1, 4, 1, 1, 6, ...]. This expansion 
is often abbreviated as [2, \overline{1, 2k, 1}_{k≥1}]. (See [Pe54].) 
4.9.2 CONVERGENTS 


Definition: 

Define p_{-2} = 0, q_{-2} = 1, p_{-1} = 1, q_{-1} = 0, and p_n = a_n p_{n-1} + p_{n-2} and q_n = 
a_n q_{n-1} + q_{n-2} for n ≥ 0. Then p_n/q_n = [a_0, a_1, ..., a_n]. The fraction p_n/q_n is called the nth 
convergent. 
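The recurrence in the definition is straightforward to run. The Python sketch below (the function name is a hypothetical choice) starts from the initial values p_{-2}, q_{-2}, p_{-1}, q_{-1} given above and returns the pairs (p_n, q_n) for a finite list of partial quotients.

```python
def convergents(partial_quotients):
    """Return [(p_0, q_0), (p_1, q_1), ...] for the given partial quotients."""
    p_prev2, q_prev2 = 0, 1     # p_{-2}, q_{-2}
    p_prev1, q_prev1 = 1, 0     # p_{-1}, q_{-1}
    result = []
    for a in partial_quotients:
        p = a * p_prev1 + p_prev2
        q = a * q_prev1 + q_prev2
        result.append((p, q))
        p_prev2, q_prev2 = p_prev1, q_prev1
        p_prev1, q_prev1 = p, q
    return result


pi_convergents = convergents([3, 7, 15, 1, 292])
```

With the first five partial quotients of π this yields 3/1, 22/7, 333/106, 355/113, and 103993/33102.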

Facts: 

1. p_n q_{n-1} - p_{n-1} q_n = (-1)^(n+1) for n ≥ 0. 

2. Let θ = [a_0, a_1, a_2, ...] be an irrational number. Then |θ - p_n/q_n| < 1/(a_{n+1} q_n^2). 

3. If n > 1, 0 < q ≤ q_n, and p/q ≠ p_n/q_n, then |θ - p/q| > |θ - p_n/q_n|. 

4. [..., a, b, 0, c, d, ...] = [..., a, b + c, d, ...]. 

5. Almost all real numbers have unbounded partial quotients. 

6. For almost all real numbers, the frequency with which the partial quotient k occurs 
is log_2(1 + 1/(k(k + 2))). Hence, the partial quotient 1 occurs about 41.5% of the time, the 
partial quotient 2 occurs about 17.0% of the time, etc. 

7. For almost all real numbers, 

lim_{n→∞} (a_1 a_2 ... a_n)^(1/n) = K ≈ 2.68545. 

K is called Khintchine’s constant. 

8. Levy’s law: For almost all real numbers, 

lim_{n→∞} (p_n)^(1/n) = lim_{n→∞} (q_n)^(1/n) = e^(π^2/(12 log 2)). 


Examples: 

1. Compute the first eight convergents to π: 

n      -2        -1        0         1         2         3         4         5         6         7         8 
a_n                        3         7         15        1         292       1         1         1         2 
p_n    0         1         3         22        333       355       103,993   104,348   208,341   312,689   833,719 
q_n    1         0         1         7         106       113       33,102    33,215    66,317    99,532    265,381 


2. Find a rational fraction p/q in lowest terms that approximates e to within 10^(-6). 
Compute the convergents p_n/q_n until 1/(a_{n+1} q_n^2) < 10^(-6): 

n      -2     -1     0      1      2      3      4      5      6      7      8      9      10 
a_n                  2      1      2      1      1      4      1      1      6      1      1 
p_n    0      1      2      3      8      11     19     87     106    193    1,264  1,457  2,721 
q_n    1      0      1      1      3      4      7      32     39     71     465    536    1,001 

Hence, 2721/1001 ≈ 2.71828171 is the desired fraction. 
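The stopping rule used in Example 2 comes from the bound |θ - p_n/q_n| < 1/(a_{n+1} q_n^2). The Python sketch below is a hedged illustration of that procedure for e (the names and the `bound` parameter are hypothetical): it generates the partial quotients 2, 1, 2, 1, 1, 4, 1, 1, 6, ... and returns the first convergent p_n/q_n with a_{n+1} q_n^2 > bound.

```python
from itertools import count


def e_partial_quotients():
    """Yield the partial quotients of e: 2, then the blocks 1, 2k, 1 for k = 1, 2, ..."""
    yield 2
    for k in count(1):
        yield 1
        yield 2 * k
        yield 1


def approximate_e(bound):
    """First convergent (p, q) of e with a_{n+1} * q^2 > bound."""
    quotients = e_partial_quotients()
    p_prev, q_prev = 1, 0               # p_{-1}, q_{-1}
    p, q = next(quotients), 1           # p_0 = a_0, q_0 = 1
    for a_next in quotients:
        if a_next * q * q > bound:      # error is then below 1 / bound
            return p, q
        p_prev, p = p, a_next * p + p_prev
        q_prev, q = q, a_next * q + q_prev
```

Calling `approximate_e(10**6)` returns `(2721, 1001)`, the fraction found in the example; the guaranteed error there is below 1/(8 · 1001^2).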
4.9.3 APPROXIMATION THEOREMS 


Facts: 

1. Dirichlet’s theorem: If θ is irrational, then 

|θ - p/q| < 1/q^2 

for infinitely many p, q. 

2. Dirichlet’s theorem in d dimensions: If θ_1, θ_2, ..., θ_d are real numbers with at least 
one θ_j irrational, then 

|θ_j - p_j/q| < 1/q^(1+1/d) for j = 1, 2, ..., d 

for infinitely many p_1, p_2, ..., p_d, q. 

3. Hurwitz’s theorem: If θ is an irrational number, then 

|θ - p/q| < 1/(√5 q^2) 

for infinitely many p, q. The constant √5 is best possible. 

4. Liouville’s theorem: Let θ be an irrational algebraic number of degree n. Then 
there exists a constant c > 0 (depending on θ) such that 

|θ - p/q| > c/q^n 

for all rationals p/q with q > 0. The number θ is called a Liouville number if |θ - p/q| < q^(-n) 
has a solution for all n ≥ 0. An example of a Liouville number is Σ_{k≥1} 2^(-k!). 

5. Roth’s theorem: Let θ be an irrational algebraic number, and let ε be any positive 
number. Then 

|θ - p/q| > 1/q^(2+ε) 

for all but finitely many rationals p/q with q > 0. 


4.9.4 IRRATIONALITY MEASURES 
Definition: 

Let θ be a real irrational number. Then the real number μ is said to be an irrationality 
measure for θ if for every ε > 0 there exists a positive real q_0 = q_0(ε) such that 
|θ - p/q| > q^(-(μ+ε)) for all integers p, q with q > q_0. 

Fact: 

1. Here are the best irrationality measures known for some important numbers. 


number θ    measure μ    discoverer 
π           8.0161       Hata (1993) 
π^2         5.4413       Rhin and Viola (1995) 
ζ(3)        8.8303       Hata (1990) 
ln 2        3.8914       Rukhadze (1987); Hata (1990) 
π/√3        4.6016       Hata (1993) 
4.10 QUADRATIC FIELDS 


4.10.1 BASICS 

Definitions: 

A complex number α is an algebraic number if it is a root of a polynomial with integer 
coefficients. 

An algebraic number α is an algebraic integer if it is a root of a monic polynomial 
with integer coefficients. (A monic polynomial is a polynomial with leading coefficient 
equal to 1.) 

An algebraic number α is of degree n if it is a root of a polynomial with integer 
coefficients of degree n but is not a root of any polynomial with integer coefficients of 
degree less than n. 

An algebraic number field is a subfield of the field of algebraic numbers. 

If α is an algebraic number with minimal polynomial f(x) of degree n, then the n - 1 
other roots of f(x) are called the conjugates of α. 

The integers of an algebraic number field are the algebraic integers that belong to this 
field. 

If d is a squarefree integer, then Q(√d) = {a + b√d | a and b are rational numbers} 
is called a quadratic field. If d > 0, then Q(√d) is called a real quadratic field; if 
d < 0, then Q(√d) is called an imaginary quadratic field. 

A number α in Q(√d) is a quadratic integer (or an integer when the context is 
clear) if α is an algebraic integer. 

If α and β are quadratic integers in Q(√d) and there is a quadratic integer γ in Q(√d) 
such that αγ = β, then α divides β, written α | β. 

The integers of Q(√-1) are called the Gaussian integers. (These are the numbers in 
Z[i] = {a + bi | a, b are integers}. See §5.4.2.) 

If α = a + b√d belongs to Q(√d), then its conjugate, denoted by \overline{α}, is the number 
a - b√d. 

If α belongs to Q(√d), then the norm of α is the number N(α) = α\overline{α}. 

An algebraic integer ε in Q(√d) is a unit if ε | 1. 

Facts: 

1. The integers of the field Q(√d), where d is a squarefree integer, are the numbers 
a + b√d, where a and b are integers, when d ≡ 2 or 3 (mod 4), and the numbers 
(a + b√d)/2, where a and b are integers which are either both even or both odd, 
when d ≡ 1 (mod 4). 

2. If d < 0, d ≠ -1, d ≠ -3, then there are exactly two units, ±1, in Q(√d). There 
are exactly four units in Q(√-1), namely ±1 and ±i. There are exactly six units in 
Q(√-3): ±1, (±1 ± √-3)/2. 

3. If d > 0, there are infinitely many units in Q(√d). Furthermore, there is a unit ε_0, 
called the fundamental unit of Q(√d), such that all units are of the form ±ε_0^n where n 
is an integer. 
Examples: 

1. The conjugate of -2 + 3i in the ring of Gaussian integers is -2 - 3i. Consequently, 
N(-2 + 3i) = (-2 - 3i)(-2 + 3i) = 13. 

2. The number 1 + √2 is a fundamental unit of Q(√2). Therefore, all units are of the 
form ±(1 + √2)^n where n = 0, ±1, ±2, .... 
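Norms and units in Q(√d) can be explored with pairs of integers. In the Python sketch below, which is an illustration with hypothetical names rather than Handbook notation, the pair (a, b) stands for a + b√d; the norm is a^2 - db^2, and a quadratic integer of this shape is a unit exactly when its norm is ±1.

```python
def norm(a, b, d):
    """Norm of a + b*sqrt(d): (a + b*sqrt(d)) * (a - b*sqrt(d)) = a^2 - d*b^2."""
    return a * a - d * b * b


def multiply(u, v, d):
    """Product of u = (a, b) and v = (c, e), each read as a + b*sqrt(d)."""
    a, b = u
    c, e = v
    return (a * c + d * b * e, a * e + b * c)


def powers(u, d, how_many):
    """Return [u, u^2, ..., u^how_many] as integer pairs."""
    result, current = [], (1, 0)
    for _ in range(how_many):
        current = multiply(current, u, d)
        result.append(current)
    return result


unit_powers = powers((1, 1), 2, 4)      # powers of 1 + sqrt(2)
```

The first four powers of 1 + √2 are 1 + √2, 3 + 2√2, 7 + 5√2, and 17 + 12√2, with norms alternating between -1 and 1.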


4.10.2 PRIMES AND UNIQUE FACTORIZATION 

Definitions: 

An integer π in Q(√d), not zero or a unit, is prime in Q(√d) if whenever π = αβ 
where α and β are integers in Q(√d), either α or β is a unit. 

If α and β are nonzero integers in Q(√d) and α = βε where ε is a unit, then β is called 
an associate of α. 

A quadratic field Q(√d) is a Euclidean field if, given integers α and β in Q(√d) 
where β is not zero, there are integers δ and γ in Q(√d) such that α = γβ + δ and 
|N(δ)| < |N(β)|. 

A quadratic field Q(√d) has the unique factorization property if whenever α is 
a nonzero, non-unit, integer in Q(√d) with two factorizations α = ε π_1 π_2 ... π_r = 
ε' π'_1 π'_2 ... π'_s where ε and ε' are units, then r = s and the primes π_i and π'_j can be 
paired off into pairs of associates. 

Facts: 

1. If α is an integer in Q(√d) and N(α) is an integer that is prime, then α is a prime. 

2. The integers of Q(√d) are a unique factorization domain if and only if whenever a 
prime π | αβ where α and β are integers of Q(√d), then π | α or π | β. 

3. A Euclidean quadratic field has the unique factorization property. 

4. The quadratic field Q(√d) is Euclidean if and only if d is one of the following 
integers: -11, -7, -3, -2, -1, 2, 3, 5, 6, 7, 11, 13, 17, 19, 21, 29, 33, 37, 41, 57, 73. 

5. If d < 0, then the imaginary quadratic field Q(√d) has the unique factorization 
property if and only if d = -1, -2, -3, -7, -11, -19, -43, -67, or -163. This theorem 
was stated as a conjecture by Gauss in the 19th century and proved in the 1960s by 
Harold Stark and Alan Baker independently. 

6. It is unknown whether infinitely many real quadratic fields Q(√d) have the unique 
factorization property. 

7. Of the 60 real quadratic fields Q(√d) with 2 ≤ d ≤ 100, exactly 38 have the unique 
factorization property, namely those with d = 2, 3, 5, 6, 7, 11, 13, 14, 17, 19, 21, 22, 23, 
29, 31, 33, 37, 38, 41, 43, 46, 47, 53, 57, 59, 61, 62, 67, 69, 71, 73, 77, 83, 86, 89, 93, 94, 
and 97. 

Examples: 

1. The number 2 + i is a prime Gaussian integer. This follows since its norm N(2 + i) = 
(2 + i)(2 - i) = 5 is a prime integer. Its associates are itself and the three Gaussian 
integers (-1)(2 + i) = -2 - i, i(2 + i) = -1 + 2i, and -i(2 + i) = 1 - 2i. 

2. The integers of Q(√-5) are the numbers of the form a + b√-5 where a and b are 
integers. The field Q(√-5) is not a unique factorization domain. To see this, note 
that 6 = 2 · 3 = (1 + √-5)(1 - √-5) and each of 2, 3, 1 + √-5, and 1 - √-5 are 
primes in this quadratic field. For example, to see that 1 + √-5 is prime, suppose that 
1 + √-5 = (a + b√-5)(c + d√-5). This implies that 6 = (a^2 + 5b^2)(c^2 + 5d^2), which 
is impossible unless a = ±1, b = 0 or c = ±1, d = 0. Consequently, one of the factors 
must be a unit. 
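The norm argument in Example 2 rests on the fact that a^2 + 5b^2 never equals 2 or 3, so a factorization of 6 = (a^2 + 5b^2)(c^2 + 5d^2) must have a factor of norm 1. The Python sketch below (a hypothetical helper, not Handbook code) enumerates the integers a + b√-5 of a given norm to make this concrete.

```python
from math import isqrt


def elements_of_norm(n, d=5):
    """All integer pairs (a, b) with a^2 + d*b^2 == n, i.e. N(a + b*sqrt(-d)) == n."""
    bound = isqrt(n)
    found = []
    for a in range(-bound, bound + 1):
        for b in range(-bound, bound + 1):
            if a * a + d * b * b == n:
                found.append((a, b))
    return found
```

There are no elements of norm 2 or 3, the only elements of norm 1 are ±1, and the elements of norm 6 are ±1 ± √-5, so none of 2, 3, 1 + √-5, 1 - √-5 factors further.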
INTRODUCTION 


Many of the most common mathematical systems, including the integers, the rational 
numbers, and the real numbers, have an underlying algebraic structure. This chapter 
examines the structure and properties of various types of algebraic objects. These 
objects arise in a variety of settings and occur in many different applications, including 
counting techniques, coding theory, information theory, engineering, and circuit design. 


GLOSSARY 

abelian group: a group in which a * b = b * a for all a, b in the group.

absorption laws: in a lattice, a ∨ (a ∧ b) = a and a ∧ (a ∨ b) = a.

algebraic element (over a field): given a field F, an element α ∈ K (extension of F)
such that there exists p(x) ∈ F[x] (p(x) ≠ 0) such that p(α) = 0. Otherwise α is
transcendental over F.

algebraic extension (of a field): given a field F, a field K such that F is a subfield
of K and all elements of K are algebraic over F.

algebraic integer: an algebraic number that is a zero of a monic polynomial with
coefficients in Z.

algebraic number: a complex number that is algebraic over Q.

algebraic structure: (S, *_1, *_2, ..., *_n) where S is a nonempty set and *_1, ..., *_n
are binary or monadic operations defined on S.

alternating group (on n elements): the subgroup A_n of all even permutations in S_n.

associative property: the property of a binary operator * that (a * b) * c = a * (b * c).

atom: an element a in a bounded lattice such that 0 < a and there is no element b
such that 0 < b < a.

automorphism: an isomorphism of an algebraic structure onto itself.

automorphism φ fixes set S elementwise: φ(a) = a for all a ∈ S.

binary operation (on a set S): a function *: S × S → S.

Boolean algebra: a bounded, distributive, complemented lattice. Equivalent definition:
(B, +, ·, ', 0, 1) where B is a set with two binary operations, + (addition) and ·
(multiplication), one monadic operation, ' (complement), and two distinct elements,
0 and 1, that satisfy the commutative laws (a + b = b + a, ab = ba), distributive laws
(a(b + c) = (ab) + (ac), a + (bc) = (a + b)(a + c)), identity laws (a + 0 = a, a1 = a),
and complement laws (a + a' = 1, aa' = 0).

Boolean function of degree n: a function f: {0, 1}^n = {0, 1} × ⋯ × {0, 1} → {0, 1}.

bounded lattice: a lattice having elements 0 (lower bound) and 1 (upper bound)
such that 0 ≤ a and a ≤ 1 for all a.

cancellation properties: if ab = ac and a ≠ 0, then b = c (left cancellation
property); if ba = ca and a ≠ 0, then b = c (right cancellation property).

characteristic (of a field): the smallest positive integer n such that 1 + 1 + ⋯ + 1 = 0
(n summands). If no such n exists, the field has characteristic 0 (or characteristic ∞).

closure property: a set S is closed under an operation * if the range of * is a subset
of S.

commutative property: the property of an operation * that a * b = b * a.

commutative ring: a ring in which multiplication is commutative.

complemented lattice: a bounded lattice such that for each element a there is an
element b such that a ∨ b = 1 and a ∧ b = 0.

conjunctive normal form (CNF) (of a Boolean function): a Boolean function written
as a product of maxterms.

coset: for subgroup H of group G and a ∈ G, a left coset is aH = {ah | h ∈ H}; a
right coset is Ha = {ha | h ∈ H}.

cycle of length n: a permutation on a set S that moves elements only in a single
orbit of size n.

cyclic group: a group G with an element a ∈ G such that G = {a^n | n ∈ Z}.

cyclic subgroup (generated by a): {a^n | n ∈ Z} = {..., a^-2, a^-1, e, a, a^2, ...}, often
written ⟨a⟩, (a), or [a]. The element a is a generator of the subgroup.

degree (of field K over field F): [K : F] = the dimension of K as a vector space over F.

degree (of a permutation group): the size of the set on which the permutations are
defined.

dihedral group: the group D_n of symmetries (rotations and reflections) of a regular
n-gon.

disjunctive normal form (DNF) (of a Boolean function): a Boolean function written
as a sum of minterms.

distributive lattice: a lattice that satisfies a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c) and
a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c) for all a, b, c in the lattice.

division ring: a nontrivial ring in which every nonzero element is a unit. 

dual (of an expression in a Boolean algebra): the expression obtained by interchang- 
ing the operations + and • and interchanging the elements 0 and 1 in the original 
expression. 

duality principle: the principle stating that an identity between Boolean expressions 
remains valid when the duals of the expressions are taken. 

Euclidean domain: an integral domain with a Euclidean norm defined on it.

Euclidean norm (on an integral domain): given an integral domain I, a function
δ: I - {0} → N such that for all nonzero a, b ∈ I, δ(a) ≤ δ(ab); and for all a, d ∈ I
(d ≠ 0) there are q, r ∈ I such that a = dq + r, where either r = 0 or δ(r) < δ(d).

even permutation: a permutation that can be written as a product of an even number
of transpositions.

extension field (of field F): field K such that F is a subfield of K.

field: an algebraic structure (F, +, ·) where F is a set closed under two binary
operations + and ·, (F, +) is an abelian group, the nonzero elements form an abelian
group under multiplication, and the distributive law a · (b + c) = a · b + a · c holds.

finite field: a field with a finite number of elements.

finitely generated group: a group with a finite set of generators.

fixed field (of a set of automorphisms of a field): given a set Φ of automorphisms of
a field F, the set { a ∈ F | aφ = a for all φ ∈ Φ }.

free monoid (generated by a set): given a set S, the monoid consisting of all words
on S under concatenation.

functionally complete: property of a set of operators in a Boolean algebra that every
Boolean function can be written using only these operators.

Galois extension (of a field F): a field K that is a normal, separable extension of F.

Galois field: GF(p^n) = the algebraic extension Z_p[x]/(f(x)) of the finite field Z_p
where p is a prime and f(x) is an irreducible polynomial over Z_p of degree n.

Galois group (of K over F): the group of automorphisms G(K/F) of field K that fix
field F elementwise.

group: an algebraic structure (G, *), where G is a set closed under the binary opera- 
tion *, the operation * is associative, G has an identity element, and every element 
of G has an inverse in G. 

homomorphism of groups: a function φ: S → T, where (S, *_1) and (T, *_2) are
groups, such that φ(a *_1 b) = φ(a) *_2 φ(b) for all a, b ∈ S.

homomorphism of rings: a function φ: S → T, where (S, +_1, ·_1) and (T, +_2, ·_2) are
rings, such that φ(a +_1 b) = φ(a) +_2 φ(b) and φ(a ·_1 b) = φ(a) ·_2 φ(b) for all a, b ∈ S.

ideal: a subring of a ring that is closed under left and right multiplication by elements
of the ring.

identity: an element e in an algebraic structure S such that e * a = a * e = a for all
a ∈ S.

improper subgroups (of G): the subgroups G and {e}.

index of H in G: the number of left (or right) cosets of H in G.

integral domain: a commutative ring with unity that has no zero divisors.

inverse of an element a: an element a' such that a * a' = a' * a = e.

involution: a function that is the identity when it is composed with itself.

irreducible element in a ring: a noninvertible element that cannot be written as
the product of noninvertible elements.

irreducible polynomial: a polynomial p(x) of degree n > 0 over a field that cannot
be written as p_1(x) · p_2(x) where p_1(x) and p_2(x) are polynomials of smaller degrees.
Otherwise p(x) is reducible.

isomorphic: property of algebraic structures of the same type, G and H, that there
is an isomorphism from G onto H, written G ≅ H.

isomorphism: a one-to-one and onto function between two algebraic structures that
preserves the operations on the structures.

isomorphism of groups: for groups (G_1, *_1) and (G_2, *_2), a function φ: G_1 → G_2
that is one-to-one, onto G_2, and satisfies the property φ(a *_1 b) = φ(a) *_2 φ(b).

isomorphism of permutation groups: for permutation groups (G, X) and (H, Y),
a pair of functions (α: G → H, f: X → Y) such that α is a group isomorphism and f is
a bijection.

isomorphism of rings: for rings (R_1, +_1, ·_1) and (R_2, +_2, ·_2), a function φ: R_1 → R_2
that is one-to-one, onto R_2, and satisfies the properties φ(a +_1 b) = φ(a) +_2 φ(b) and
φ(a ·_1 b) = φ(a) ·_2 φ(b).

kernel (of a group homomorphism): given a group homomorphism φ, the set φ^-1(e) =
{ x | φ(x) = e }, where e is the identity of the codomain group.

kernel (of a ring homomorphism): given a ring homomorphism φ, the set φ^-1(0) =
{ x | φ(x) = 0 }.

Klein four-group: the group under composition of the four rigid motions of a
rectangle that leave the rectangle in its original location.

lattice: a nonempty partially ordered set in which inf{a, b} and sup{a, b} exist for all
a, b. (a ∨ b = sup{a, b}, a ∧ b = inf{a, b}.) Equivalently, a nonempty set closed under
two binary operations ∨ and ∧ that satisfy the associative laws, the commutative
laws, and the absorption laws (a ∨ (a ∧ b) = a, a ∧ (a ∨ b) = a).

left divisor of zero: an element a ≠ 0 for which there is b ≠ 0 such that ab = 0.

literal: a Boolean variable or its complement.

maximal ideal: an ideal I ≠ R in a ring R that is not properly contained in any ideal
of R except R itself.

maxterm of the Boolean variables x_1, ..., x_n: a sum of the form y_1 + ⋯ + y_n
where for each i, y_i is equal to x_i or x_i'.

minimal polynomial (of an element with respect to a field): given a field F and
an element α of an extension field of F that is algebraic over F, the monic
polynomial f(x) ∈ F[x] of smallest degree such that f(α) = 0; this polynomial is
irreducible over F.

minterm of the Boolean variables x_1, ..., x_n: a product of the form y_1 ⋯ y_n
where for each i, y_i is equal to x_i or x_i'.

monadic operation: a function from a set into itself.

monoid: an algebraic structure (S, *) such that * is associative and S has an identity.

normal extension of F: a field K such that K/F is algebraic and every irreducible
polynomial in F[x] with a root in K has all its roots in K (splits in K).

normal subgroup (of a group): given a group G, a subgroup H ⊆ G such that
aH = Ha for all a ∈ G.

octic group: See dihedral group. 

odd permutation: a permutation that can be written as a product of an odd number
of transpositions.

orbit (of an object a ∈ S under permutation σ): {..., aσ^-2, aσ^-1, a, aσ, aσ^2, ...}.

order (of an algebraic structure): the number of elements in the underlying set.

order (of a group element): for an element a ∈ G, the smallest positive integer n such
that a^n = e (na = 0 if G is written additively). If there is no such integer, then a
has infinite order.

p-group: for prime p, a group such that every element has a power of p as its order.

permutation: a one-to-one and onto function σ: S → S, where S is any nonempty set.

permutation group: a collection of permutations on a set of objects that form a
group under composition.

polynomial (in the variable x over a ring): an expression of the form p(x) = a_n x^n +
a_{n-1} x^{n-1} + ⋯ + a_1 x^1 + a_0 x^0 where a_n, ..., a_0 are elements of the ring. For a
polynomial p(x), the largest integer k such that a_k ≠ 0 is the degree of p(x).
The constant polynomial p(x) = a_0 has degree 0, if a_0 ≠ 0. If p(x) = 0 (zero
polynomial), the degree of p(x) is undefined (or -∞).

polynomial ring (over a ring R): R[x] = {p(x) | p(x) is a polynomial in x over R}
with the usual definitions of addition and multiplication.

prime ideal (of a ring R): an ideal I ≠ R with the property that ab ∈ I implies that
a ∈ I or b ∈ I.

proper subgroup (of a group G): any subgroup of G except G and {e}.

quotient group (factor group): for normal subgroup H of G, the group G/H =
{ aH | a ∈ G }, where aH · bH = (ab)H.

quotient ring: for I an ideal in a ring R, the ring R/I = { a + I | a ∈ R }, where
(a + I) + (b + I) = (a + b) + I and (a + I) · (b + I) = (ab) + I.

reducible (polynomial): a polynomial that is not irreducible.

right divisor of zero: an element b ≠ 0 for which there is a ≠ 0 such that ab = 0.

ring: an algebraic structure (R, +, ·) where R is a set closed under two binary
operations + and ·, (R, +) is an abelian group, R satisfies the associative law for
multiplication, and R satisfies the left and right distributive laws for multiplication
over addition.

ring with unity: a ring with an identity for multiplication.

root field: a splitting field.

semigroup: an algebraic structure (S, *) where S is a nonempty set that is closed
under the associative binary operation *.

separable extension (of field F): a field K such that every element of K is the root
of a separable polynomial in F[x].

separable polynomial: a polynomial p(x) ∈ F[x] of degree n that has n distinct
roots in its splitting field.

sign (of a permutation): the value +1 if the permutation has an even number of
transpositions when the permutation is written as a product of transpositions, and -1
otherwise.

simple group: a group whose only normal subgroups are {e} and G.

skew field: a division ring.

splitting field (for nonconstant p(x) ∈ F[x]): the field K = F(α_1, ..., α_n) where
p(x) = a(x - α_1) ⋯ (x - α_n), a ∈ F.

subfield (of a field K): a subset F ⊆ K that is a field using the same operations used
in K.

subgroup (of a group G): a subset H ⊆ G such that H is a group using the same
group operation used in G.

subgroup generated by {a_i | i ∈ S}: for a given group G where a_i ∈ G for all i
in S, the smallest subgroup of G containing { a_i | i ∈ S }.

subring (of a ring R): a subset S ⊆ R that is a ring using the same operations used
in R.

Sylow p-subgroup (of G): a subgroup of G that is a p-group and is not properly
contained in any p-group of G.

symmetric group: the group of all permutations on {1, 2, ..., n} under the operation
of composition.

transcendental element (over a field F): given a field F and an extension field K,
an element of K that is not a root of any nonzero polynomial in F[x].

transposition: a cycle of length 2.

unary operation: See monadic operation.

unit (in a ring): an element with a multiplicative inverse in the ring.

unity (in a ring): a multiplicative identity not equal to 0.

word (on a set): a finite sequence of elements of the set.

zero (of a polynomial f): an element a such that f(a) = 0.


5.1 ALGEBRAIC MODELS 


5.1.1 DOMAINS AND OPERATIONS
Definitions:

An n-ary operation on a set S is a function *: S × S × ⋯ × S → S, where the domain
is the product of n factors.

A binary operation on a set S is a function *: S × S → S.

A monadic operation (or unary operation) on a set S is a function *: S → S.

An algebraic structure (S, *_1, *_2, ..., *_n) consists of a nonempty set S (the domain)
with one or more n-ary operations defined on S.

A binary operation can have some of the following properties:

• associative property: a * (b * c) = (a * b) * c for all a, b, c ∈ S;

• existence of an identity element: there is an element e ∈ S such that
e * a = a * e = a for all a ∈ S (e is an identity for S);

• existence of inverses: for each element a ∈ S there is an element a' ∈ S such
that a' * a = a * a' = e (a' is an inverse of a);

• commutative property: a * b = b * a for all a, b ∈ S.


Examples: 

1. The most important types of algebraic structures with one binary operation are 
listed in the following table. A checkmark means that the property holds. 



                 closed   associative   commutative   existence     existence
                                                      of identity   of inverses
semigroup          ✓          ✓
monoid             ✓          ✓                           ✓
group              ✓          ✓                           ✓             ✓
abelian group      ✓          ✓             ✓             ✓             ✓
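The properties listed in §5.1.1 can be tested mechanically for a finite structure. The sketch below is illustrative only: the function `classify` and its return strings are assumptions made for this example, not notation from the handbook. It checks closure, associativity, identity, inverses, and commutativity by brute force and reports the most specific of the four structure types.

```python
from itertools import permutations, product


def classify(elements, op):
    """Return "semigroup", "monoid", "group", or "abelian group" for the
    finite structure (elements, op), or None if op is not closed or not
    associative on the given elements."""
    elems = list(elements)
    domain = set(elems)
    if not all(op(a, b) in domain for a, b in product(elems, repeat=2)):
        return None
    if not all(op(a, op(b, c)) == op(op(a, b), c)
               for a, b, c in product(elems, repeat=3)):
        return None
    identities = [e for e in elems
                  if all(op(e, a) == a and op(a, e) == a for a in elems)]
    if not identities:
        return "semigroup"
    e = identities[0]
    if not all(any(op(a, b) == e and op(b, a) == e for b in elems)
               for a in elems):
        return "monoid"
    if all(op(a, b) == op(b, a) for a, b in product(elems, repeat=2)):
        return "abelian group"
    return "group"


# Illustrative inputs (chosen for this sketch, not taken from the text).
z4_add = classify(range(4), lambda a, b: (a + b) % 4)
z4_mul = classify(range(4), lambda a, b: (a * b) % 4)
left_zero = classify(range(2), lambda a, b: a)
s3 = classify(permutations(range(3)),
              lambda f, g: tuple(f[g[x]] for x in range(3)))
```

The brute-force associativity test examines every triple, so this approach is only practical for small sets.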


5.1.2 SEMIGROUPS AND MONOIDS
Definitions:

A semigroup (S, *) consists of a nonempty set S and an associative binary operation *
on S.

A monoid (S, *) consists of a nonempty set S and an associative binary operation *
on S such that S has an identity.

A nonempty subset T of a semigroup (S, *) is a subsemigroup of S if T is closed
under *.

A subset T of a monoid (S, *) with identity e is a submonoid of S if T is closed under *
and e ∈ T.

Two semigroups [monoids] (S_1, *_1) and (S_2, *_2) are isomorphic if there is a function
φ: S_1 → S_2 that is one-to-one, onto S_2, and such that φ(a *_1 b) = φ(a) *_2 φ(b) for
all a, b ∈ S_1.

A word on a set S (the alphabet) is a finite sequence of elements of S.

The free monoid [free semigroup] generated by S is the monoid [semigroup] (S*, *)
where S* is the set of all words on a set S and the operation * is defined on S* by
concatenation: x_1 x_2 ⋯ x_m * y_1 y_2 ⋯ y_n = x_1 x_2 ⋯ x_m y_1 y_2 ⋯ y_n. (S*, *) is also
called the free monoid [free semigroup] on S.

Facts: 

1. Every monoid is a semigroup. 

2. Every semigroup (S, *) is isomorphic to a subsemigroup of some semigroup of trans- 
formations on some set. Hence, every semigroup can be regarded as a semigroup of 
transformations. An analogous result is true for monoids. 

Examples: 

1. Free semigroups and monoids: The free monoid generated by S is a monoid with
the empty word e = λ (the sequence consisting of zero elements) as the identity.

2. The possible input tapes to a computer form a free monoid on the set of symbols
(such as the ASCII symbols) in the computer alphabet.

3. Semigroup and monoid of transformations on a set S: Let S be a nonempty set and
let T be the set of all functions f: S → S. With the operation * defined by composition,
(f * g)(x) = f(g(x)), (T, *) is the semigroup [monoid] of transformations on S. The
identity of T is the identity transformation e: S → S where e(x) = x for all x ∈ S.

4. The set of closed walks based at a fixed vertex v in a graph forms a monoid under
the operation of concatenation. The null walk is the identity. (§8.2.1.)

5. For a fixed positive integer n, the set of all n × n matrices with elements in any ring
with unity (§5.4.1) where * is matrix multiplication (using the operations in the ring)
is a semigroup and a monoid. The identity is the identity matrix.

6. The sets

N = {0, 1, 2, 3, ...} (natural numbers),

Z = {..., -2, -1, 0, 1, 2, ...} (integers),

Q (the set of rational numbers),

R (the set of real numbers),

C (the set of complex numbers),

where * is either addition or multiplication, are all semigroups and monoids. Using
either addition or multiplication, each semigroup is a subsemigroup of each of those
following it in this list. Likewise, using either addition or multiplication, each monoid
is a submonoid of each of those following it in this list. For example, (Q, +) is a
subsemigroup and submonoid of (R, +) and (C, +). Under addition, e = 0; under
multiplication, e = 1.
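The monoid of transformations in Example 3 is small enough to enumerate when S is finite. The following sketch assumes S = {0, 1, 2} and represents a transformation f as the tuple (f(0), f(1), f(2)); this encoding and the names used are choices made for illustration.

```python
from itertools import product


def compose(f, g):
    """(f * g)(x) = f(g(x)), with each transformation stored as a tuple of images."""
    return tuple(f[g[x]] for x in range(len(g)))


n = 3
transformations = list(product(range(n), repeat=n))  # all n**n maps S -> S
identity = tuple(range(n))                           # e(x) = x
constant_zero = (0,) * n                             # sends every x to 0

# The constant map absorbs anything composed on its right, so it cannot
# have an inverse: the transformations form a monoid but not a group.
absorbed = all(compose(constant_zero, f) == constant_zero
               for f in transformations)
```

Restricting to the tuples that are permutations of (0, 1, 2) gives exactly the transformations that do have inverses.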
5.2 GROUPS


5.2.1 BASIC CONCEPTS
Definitions:

A group (G, *) consists of a set G with a binary operator * defined on G such that *
has the following properties:

• associative property: a * (b * c) = (a * b) * c for all a, b, c ∈ G;

• identity property: G has an element e (identity of G) that satisfies e * a =
a * e = a for all a ∈ G;

• inverse property: for each element a ∈ G there is an element a^-1 ∈ G (inverse
of a) such that a^-1 * a = a * a^-1 = e.

If a * b = b * a for all a, b ∈ G, the group G is commutative or abelian. (Niels H. Abel,
1802-1829)

The order of a finite group G, denoted |G|, is the number of elements in the group.

The (external) direct product of groups (G_1, *_1) and (G_2, *_2) is the group G_1 ×
G_2 = {(a_1, a_2) | a_1 ∈ G_1, a_2 ∈ G_2} where multiplication * is defined by the rule
(a_1, a_2) * (b_1, b_2) = (a_1 *_1 b_1, a_2 *_2 b_2). The direct product can be extended to n groups:
G_1 × G_2 × ⋯ × G_n. The direct product is also called the direct sum and written
G_1 ⊕ G_2 ⊕ ⋯ ⊕ G_n, especially if the groups are abelian. If G_i = G for all i, the direct
product can be written G^n.

The group G is finitely generated if there are a_1, a_2, ..., a_n ∈ G such that every
element of G can be written as a_{k_1}^{ε_1} a_{k_2}^{ε_2} ⋯ a_{k_j}^{ε_j} where k_i ∈ {1, ..., n} and
ε_i ∈ {1, -1}, for some j ≥ 0; where the empty product is defined to be e.

Note: Frequently the operation * is multiplication or addition. If the operation is
addition, the group (G, +) is an additive group. If the operation is multiplication, the
group (G, ·) is a multiplicative group.



                       operation *    identity e    inverse a^-1
additive group         a + b          0             -a
multiplicative group   a · b or ab    1 or e        a^-1


Facts: 

1. Every group has exactly one identity element.

2. In every group every element has exactly one inverse.

3. Cancellation laws: In all groups,

• if ab = ac, then b = c (left cancellation law);

• if ba = ca, then b = c (right cancellation law).

4. (a^-1)^-1 = a.

5. (ab)^-1 = b^-1 a^-1. More generally, (a_1 a_2 ⋯ a_k)^-1 = a_k^-1 a_{k-1}^-1 ⋯ a_1^-1.

6. If a and b are elements of a group G, the equations ax = b and xa = b have unique
solutions in G. The solutions are x = a^-1 b and x = b a^-1, respectively.

7. The direct product G_1 × ⋯ × G_n is abelian when each group G_i is abelian.

8. |G_1 × ⋯ × G_n| = |G_1| ⋯ |G_n|.

9. The identity for G_1 × ⋯ × G_n is (e_1, ..., e_n) where e_i is the identity of G_i. The
inverse of (a_1, ..., a_n) is (a_1, ..., a_n)^-1 = (a_1^-1, ..., a_n^-1).

10. The structure of a group can be determined by a single rule (see Example 2) or by
a group table listing all products (see Examples 2 and 3).

Examples: 

1. Table 1 displays information on several common groups. All groups listed have
infinite order, except for the following: the group of complex nth roots of unity has
order n, the group of all bijections f: S → S where |S| = n has order n!, Z_n has
order n, Z_n* has order φ(n) (Euler phi-function), S_n has order n!, A_n has order n!/2,
D_n has order 2n, and the quaternion group has order 8. All groups listed in the table
are abelian except for: the group of bijections, GL(n, R), S_n, A_n, D_n, and Q.

2. The groups Z_n and Z_n* (see Table 1): In the groups Z_n and Z_n* an element a can be
viewed as the equivalence class { b ∈ Z | b mod n = a mod n }, which can be written ā
or [a]. To find the inverse a^-1 of a ∈ Z_n*, use the extended Euclidean algorithm to find
integers a^-1 and k such that a a^-1 + nk = gcd(a, n) = 1. The group tables for
Z_2 = {0, 1} and Z_3 = {0, 1, 2} appear after Example 3.
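The inverse computation described in Example 2 can be sketched as follows. This is one standard iterative form of the extended Euclidean algorithm; the function names `extended_gcd` and `inverse_mod` are chosen here for illustration.

```python
from math import gcd


def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def inverse_mod(a, n):
    """Inverse of a in Z_n* (requires gcd(a, n) = 1 and n > 1)."""
    g, x, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError("a is not relatively prime to n, so it has no inverse")
    return x % n  # reduce the Bezout coefficient into {0, ..., n-1}


n = 40
units = [a for a in range(1, n) if gcd(a, n) == 1]   # the elements of Z_40*
inverses = {a: inverse_mod(a, n) for a in units}
```

For instance, 7 · 23 = 161 = 4 · 40 + 1, so the inverse of 7 in Z_40* is 23.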


3. Quaternion group: Q = {1, -1, i, -i, j, -j, k, -k} where multiplication is defined
by the following relations:

i^2 = j^2 = k^2 = -1, ij = -ji = k, jk = -kj = i, ki = -ik = j

where 1 is the identity. These relations yield the following multiplication table
(the entry in row x and column y is the product xy):



     |  1   -1    i   -i    j   -j    k   -k
-----+---------------------------------------
  1  |  1   -1    i   -i    j   -j    k   -k
 -1  | -1    1   -i    i   -j    j   -k    k
  i  |  i   -i   -1    1    k   -k   -j    j
 -i  | -i    i    1   -1   -k    k    j   -j
  j  |  j   -j   -k    k   -1    1    i   -i
 -j  | -j    j    k   -k    1   -1   -i    i
  k  |  k   -k    j   -j   -i    i   -1    1
 -k  | -k    k   -j    j    i   -i    1   -1


Inverses: 1^-1 = 1, (-1)^-1 = -1, x and -x are inverses for x = i, j, k. The group is
nonabelian.

The quaternion group Q can also be defined as the following group of 8 matrices:

( 1  0 )   ( -1   0 )   ( -i  0 )   ( i   0 )
( 0  1 ),  (  0  -1 ),  (  0  i ),  ( 0  -i ),

(  0  1 )   ( 0  -1 )   ( 0  i )   (  0  -i )
( -1  0 ),  ( 1   0 ),  ( i  0 ),  ( -i   0 )

where i is the complex number such that i^2 = -1 and the group operation is matrix
multiplication.
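The defining relations of Example 3 can be verified directly on 2 × 2 complex matrices. The sketch below makes one assumption that should be stated plainly: it identifies the quaternion i with the matrix having diagonal entries i and -i, and j with the real matrix having rows (0, 1) and (-1, 0); other identifications are possible. Matrices are stored as nested tuples so they can be compared with ==.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as ((a11, a12), (a21, a22))."""
    return tuple(
        tuple(sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def neg(m):
    """Entrywise negation of a matrix."""
    return tuple(tuple(-x for x in row) for row in m)


ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))   # assumed matrix for the quaternion i
J = ((0, 1), (-1, 0))     # assumed matrix for the quaternion j
K = matmul(I, J)          # k = ij

group = [ONE, neg(ONE), I, neg(I), J, neg(J), K, neg(K)]

# Closure: every product of two of the eight matrices is again one of them.
closed = all(matmul(x, y) in group for x in group for y in group)
```

All entries are 0, ±1, or ±i, so the complex arithmetic here is exact and equality tests are reliable.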


Group tables for Z_2 and Z_3 under addition (Example 2):

 + | 0  1
---+------
 0 | 0  1
 1 | 1  0

 + | 0  1  2
---+---------
 0 | 0  1  2
 1 | 1  2  0
 2 | 2  0  1
Table 1 Examples of groups.

set | operation | identity | inverses
Z, Q, R, C | addition | 0 | -a
Z^n, n a positive integer (also Q^n, R^n, C^n) | coordinatewise addition | (0, ..., 0) | -(a_1, ..., a_n) = (-a_1, ..., -a_n)
the set of all complex numbers of modulus 1 = {e^{iθ} = cos θ + i sin θ | 0 ≤ θ < 2π} | multiplication | e^{i0} = 1 | (e^{iθ})^-1 = e^{-iθ}
the complex nth roots of unity (solutions to z^n = 1) {e^{2πik/n} | k = 0, 1, ..., n-1} | multiplication | 1 | (e^{2πik/n})^-1 = e^{2πi(n-k)/n}
R-{0}, Q-{0}, C-{0} | multiplication | 1 | 1/a
R^+ (positive real numbers) | multiplication | 1 | 1/a
all rotations of the plane around the origin; r_α = counterclockwise rotation through an angle of α°: r_α(x, y) = (x cos α - y sin α, x sin α + y cos α) | composition: r_{α_2} ∘ r_{α_1} = r_{α_1+α_2} | r_0 (the 0° rotation) | r_α^-1 = r_{-α}
all 1-1, onto functions (bijections) f: S → S where S is any nonempty set | composition of functions | i: S → S where i(x) = x for all x ∈ S | f^-1(y) = x if and only if f(x) = y
M_{m×n} = all m × n matrices with entries in R | matrix addition | O_{m×n} (zero matrix) | -A
GL(n, R) = all n × n invertible, or nonsingular, matrices with entries in R (the general linear group) | matrix multiplication | I_n (identity matrix) | A^-1
Z_n = {0, 1, ..., n-1} | (a + b) mod n | 0 | n - a (a ≠ 0); -0 = 0
Z_n* = {k | k ∈ Z_n, k relatively prime to n}, n > 1 | ab mod n | 1 | see Example 2
S_n = all permutations of {1, 2, ..., n} (symmetric group) (See §5.3.) | composition of permutations | identity permutation | inverse permutation
A_n = all even permutations of {1, 2, ..., n} (alternating group) (See §5.3.) | composition of permutations | identity permutation | inverse permutation
D_n = symmetries (rotations and reflections) of a regular n-gon (dihedral group) | composition of functions | rotation through 0° | r_α^-1 = r_{-α}; reflections are their own inverses
Q = quaternion group (see Example 3) | | |
4. The set {a, b, c, d} with either of the following multiplication tables is not a group. In
the first case there is an identity, a, and each element has an inverse, but the associative
law fails: (bc)d ≠ b(cd). In the second case there is no identity (hence inverses are not
defined) and the associative law fails. In each table the entry in row x and column y is
the product xy.

First table:

   | a  b  c  d
---+------------
 a | a  b  c  d
 b | b  d  a  c
 c | c  a  b  d
 d | d  c  b  a

Second table:

   | a  b  c  d
---+------------
 a | a  c  b  d
 b | d  b  a  c
 c | b  d  c  a
 d | c  a  d  b
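The failure of associativity claimed in Example 4 can be found by exhaustive search. The sketch below encodes the table of Example 4 whose first row and first column both read a, b, c, d (the one with identity a) as a nested dictionary; the names `TABLE`, `op`, and `associativity_failures` are illustrative.

```python
from itertools import product

# TABLE[x][y] is the product xy in the table that has a as its identity.
TABLE = {
    "a": {"a": "a", "b": "b", "c": "c", "d": "d"},
    "b": {"a": "b", "b": "d", "c": "a", "d": "c"},
    "c": {"a": "c", "b": "a", "c": "b", "d": "d"},
    "d": {"a": "d", "b": "c", "c": "b", "d": "a"},
}


def op(x, y):
    """Look up the product xy in the multiplication table."""
    return TABLE[x][y]


def associativity_failures(elements):
    """All triples (x, y, z) with (xy)z != x(yz)."""
    return [(x, y, z) for x, y, z in product(elements, repeat=3)
            if op(op(x, y), z) != op(x, op(y, z))]


failures = associativity_failures("abcd")
```

The triple (b, c, d) is among the failures: (bc)d = ad = d, while b(cd) = bd = c.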


5.2.2 GROUP ISOMORPHISM AND HOMOMORPHISM 

Definitions: 

For groups G and H, a function φ: G → H such that φ(ab) = φ(a)φ(b) for all a, b ∈ G
is a homomorphism. The notation aφ is sometimes used instead of φ(a).

For groups G and H, a function φ: G → H is an isomorphism from G to H if φ
is a homomorphism that is 1-1 and onto H. In this case G is isomorphic to H,
written G ≅ H.

An isomorphism φ: G → G is an automorphism.

The kernel of φ is the set { g ∈ G | φ(g) = e }, where e is the identity of the group H.

Facts: 

1. If φ is an isomorphism, φ^-1 is an isomorphism.

2. Isomorphism is an equivalence relation: G ≅ G (reflexive); if G ≅ H, then H ≅ G
(symmetric); if G ≅ H and H ≅ K, then G ≅ K (transitive).

3. If φ: G → H is a homomorphism, then φ(G) is a group (a subgroup of H).

4. If φ: G → H is a homomorphism, then the kernel of φ is a group (a subgroup of G).

5. If p is prime there is only one group of order p (up to isomorphism), the group (Z_p, +).

6. Cayley's theorem: If G is a finite group of order n, then G is isomorphic to a
subgroup of the group S_n of permutations on n objects. (Arthur Cayley, 1821-1895)
The isomorphism is obtained by associating with each a ∈ G the map π_a: G → G with
the rule π_a(g) = ga for all g ∈ G.

7. Z_m × Z_n is isomorphic to Z_mn if and only if m and n are relatively prime.

8. If n = n_1 n_2 ⋯ n_k where the n_i are powers of distinct primes, then Z_n is isomorphic
to Z_{n_1} × Z_{n_2} × ⋯ × Z_{n_k}.

9. Fundamental theorem of finite abelian groups: Every finite abelian group G (order
≥ 2) is isomorphic to a direct product of cyclic groups where each cyclic group has order
a power of a prime. That is, G is isomorphic to Z_{n_1} × Z_{n_2} × ⋯ × Z_{n_k} where each cyclic
order n_i is a power of some prime. In addition, the multiset {n_1, ..., n_k} is unique.

10. Every finite abelian group is isomorphic to a subgroup of Z_n* for some n.

11. Fundamental theorem of finitely generated abelian groups: If G is a finitely
generated abelian group, then there are unique integers n ≥ 0, n_1, n_2, ..., n_k ≥ 2 where
n_{i+1} | n_i for i = 1, 2, ..., k - 1 such that G is isomorphic to Z^n × Z_{n_1} × Z_{n_2} × ⋯ × Z_{n_k}.
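Fact 7 can be checked by brute force for small m and n. The sketch below relies on two standard facts that are not stated in the list above and are assumed here: the order of a in (Z_m, +) is m / gcd(a, m), and the order of a pair in a direct product is the least common multiple of the orders of its coordinates. A group of order mn is cyclic exactly when some element has order mn.

```python
from math import gcd


def element_order(a, b, m, n):
    """Order of (a, b) in Z_m x Z_n under coordinatewise addition."""
    order_a = m // gcd(a, m)   # order of a in Z_m (gcd(0, m) = m gives order 1)
    order_b = n // gcd(b, n)   # order of b in Z_n
    return order_a * order_b // gcd(order_a, order_b)  # least common multiple


def is_cyclic_product(m, n):
    """True if Z_m x Z_n has an element of order m*n, i.e. is isomorphic to Z_mn."""
    return any(element_order(a, b, m, n) == m * n
               for a in range(m) for b in range(n))


# Compare with the criterion of Fact 7 for all 1 <= m, n <= 12.
agrees_with_fact_7 = all(
    is_cyclic_product(m, n) == (gcd(m, n) == 1)
    for m in range(1, 13) for n in range(1, 13)
)
```

For example, Z_2 × Z_3 contains (1, 1) of order 6, whereas every element of Z_2 × Z_4 has order at most 4.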
Table 2 Numbers of groups and abelian groups.

order      1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
groups     1   1   1   2   1   2   1   5   2   2   1   5   1   2   1  14   1   5   1   5
abelian    1   1   1   2   1   1   1   3   2   1   1   2   1   1   1   5   1   2   1   2

order     21  22  23  24  25  26  27  28  29  30  31  32  33  34  35  36  37  38  39  40
groups     2   2   1  15   2   2   5   4   1   4   1  51   1   2   1  14   1   2   2  14
abelian    1   1   1   3   2   1   3   2   1   1   1   7   1   1   1   4   1   1   1   3

order     41  42  43  44  45  46  47  48  49  50  51  52  53  54  55  56  57  58  59  60
groups     1   6   1   4   2   2   1  52   2   5   1   5   1  15   2  13   2   2   1  13
abelian    1   1   1   2   2   1   1   5   2   2   1   2   1   3   1   3   1   1   1   2

Examples: 

1. Table 2 lists the number of nonisomorphic groups and abelian groups of all orders 
from 1 to 60. 

2. All groups of order 12 or less are listed by order in Table 3. 
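The "abelian" row of Table 2 can be recomputed from the fundamental theorem of finite abelian groups (Fact 9). The sketch below uses a standard consequence that the handbook does not spell out here and which is therefore an assumption of this example: the abelian groups of order p^e correspond to the partitions of e, so the number of abelian groups of order n is the product of the partition numbers of the exponents in the prime factorization of n. All function names are illustrative.

```python
from functools import lru_cache


def prime_exponents(n):
    """Exponents in the prime factorization of n, found by trial division."""
    exponents = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exponents.append(e)
        p += 1
    if n > 1:               # a single prime factor larger than sqrt(n) remains
        exponents.append(1)
    return exponents


@lru_cache(maxsize=None)
def partition_count(total, largest):
    """Number of partitions of total into parts of size at most largest."""
    if total == 0:
        return 1
    return sum(partition_count(total - part, part)
               for part in range(1, min(total, largest) + 1))


def abelian_group_count(n):
    """Number of nonisomorphic abelian groups of order n."""
    count = 1
    for e in prime_exponents(n):
        count *= partition_count(e, e)
    return count


first_twenty = [abelian_group_count(n) for n in range(1, 21)]
```

The resulting values agree with the abelian row of Table 2, for example 3 for order 8, 5 for order 16, and 7 for order 32. Counting all groups of a given order has no comparably simple formula.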


5.2.3 SUBGROUPS 

Definitions: 

A subgroup of a group (G, *) is a subset H ⊆ G such that (H, *) is a group (with the
same group operation as in G). Write H ≤ G if H is a subgroup of G.

If a ∈ G, the set ⟨a⟩ = {..., a^-2 = (a^-1)^2, a^-1, a^0 = e, a, a^2, ...} = { a^n | n ∈ Z } is the
cyclic subgroup generated by a. The element a is a generator of ⟨a⟩.

G and {e} are improper subgroups of G. All other subgroups of G are proper
subgroups of G.

Facts: 

1. If G is a group, then {e} and G are subgroups of G.

2. If G is a group and a ∈ G, the set ⟨a⟩ is a subgroup of G.

3. Every subgroup of an abelian group is abelian.

4. If H is a subgroup of a group G, then the identity element of H is the identity
element of G; the inverse (in the subgroup H) of an element a in H is the inverse (in
the group G) of a.

5. Lagrange's theorem: Let G be a finite group. If H is any subgroup of G, then the
order of H is a divisor of the order of G. (Joseph-Louis Lagrange, 1736-1813)

6. If d is a divisor of the order of a group G, there may be no subgroup of order d.
(The group A_4, of order 12, has no subgroup of order 6. See §5.3.3.)

7. If G is a finite abelian group, then the converse of Lagrange's theorem is true for G.

8. If G is finite (not necessarily abelian) and p is a prime that divides the order of G,
then G has a subgroup of order p.

9. If G has order p^m n where p is prime and p does not divide n, then G has a subgroup
of order p^m, called a Sylow subgroup or Sylow p-subgroup. See §5.2.6.
Table 3 All groups of order 12 or less.

order   groups

1       {e}

2       Z_2

3       Z_3

4       Z_4, if there is an element of order 4 (group is cyclic)
        Z_2 × Z_2 = Klein four-group, if no element has order 4 (§5.3.2)

5       Z_5

6       Z_6, if there is an element of order 6 (group is cyclic)
        S_3 ≅ D_3, if there is no element of order 6 (§5.3.1, 5.3.2)

7       Z_7

8       Z_8, if there is an element of order 8 (group is cyclic)
        Z_2 × Z_4, if there is an element a of order 4, but none of order 8, and
        if there is an element b ∉ ⟨a⟩ such that ab = ba and b^2 = e
        Z_2 × Z_2 × Z_2, if every element has order 1 or 2
        D_4, if there is an element a of order 4, but none of order 8, and if
        there is an element b ∉ ⟨a⟩ such that ba = a^3 b and b^2 = e
        Quaternion group, if there is an element a of order 4, none of order 8,
        and an element b ∉ ⟨a⟩ such that ba = a^3 b and b^2 = a^2 (§5.2.1, Example 3)

9       Z_9, if there is an element of order 9 (group is cyclic)
        Z_3 × Z_3, if there is no element of order 9

10      Z_10, if there is an element of order 10 (group is cyclic)
        D_5, if there is no element of order 10

11      Z_11

12      Z_12 ≅ Z_3 × Z_4, if there is an element of order 12 (group is cyclic)
        Z_2 × Z_6 ≅ Z_2 × Z_2 × Z_3, if group is abelian but noncyclic
        D_6, if group is nonabelian and has an element of order 6 but none of
        order 4
        A_4, if group is nonabelian and has no element of order 6
        The group generated by a and b, where a has order 4, b has order 3,
        and ab = b^2 a


10. A subset H of a group G is a subgroup of G if and only if the following are all
true: H ≠ ∅; a, b ∈ H implies ab ∈ H; and a ∈ H implies a^{-1} ∈ H.

11. A subset H of a group G is a subgroup of G if and only if H ≠ ∅ and a, b ∈ H
implies that ab^{-1} ∈ H.

12. If H is a nonempty finite subset of a group G with the property that a, b ∈ H
implies that ab ∈ H, then H is a subgroup of G.

13. The intersection of any collection of subgroups of a group G is a subgroup of G. 

14. The union of subgroups is not necessarily a subgroup. See Example 12. 

Examples:

1. Additive subgroups: Each of the following can be viewed as a subgroup of all the
groups listed after it: (Z, +), (Q, +), (R, +), (C, +).

2. For n any positive integer, the set nZ = { nz | z ∈ Z } is a subgroup of Z.

3. Z_2 is not a subgroup of Z_4 (the group operations are not the same).

4. The set of odd integers under addition is not a subgroup of (Z, +) (the set of odd
integers is not closed under addition).

5. (N, +) is not a subgroup of (Z, +) (N does not contain the additive inverses of its
nonzero elements).

6. The group Z_6 has the following four subgroups: {0}, {0, 3}, {0, 2, 4}, Z_6.

7. Multiplicative subgroups: Each of the following can be viewed as a subgroup of all
the groups listed after it: (Q - {0}, ·), (R - {0}, ·), (C - {0}, ·).

8. The set of n complex nth roots of unity can be viewed as a subgroup of the set of all
complex numbers of modulus 1 under multiplication, which is a subgroup of (C - {0}, ·).

9. If nd = 360 (n and d positive integers) and r_k is the counterclockwise rotation of the
plane about the origin through an angle of k°, then { r_k | k = 0, d, 2d, 3d, ..., (n - 1)d }
is a subgroup of the group of all rotations of the plane around the origin.

10. The set of all n × n nonsingular diagonal matrices is a subgroup of the set of all n × n
nonsingular matrices under multiplication.

11. If n = mk, then {0, m, 2m, ..., (k - 1)m} is a subgroup of (Z_n, +) isomorphic
to Z_k.

12. The union of subgroups need not be a subgroup: { 2n | n ∈ Z } and { 3n | n ∈ Z }
are subgroups of Z, but their union is not a subgroup of Z since 2 + 3 = 5 ∉ { 2n | n ∈
Z } ∪ { 3n | n ∈ Z }.
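
The subgroup list for Z_6 and Lagrange's theorem can be checked by brute force. The following is a minimal sketch, not part of the Handbook: the function names are illustrative, and it relies on the fact that every subgroup of a cyclic group is cyclic, so it is enough to collect the subgroups generated by single elements.

```python
def cyclic_subgroup(a, n):
    """Return the subgroup of (Z_n, +) generated by a, as a frozenset."""
    elems, x = {0}, a % n
    while x != 0:
        elems.add(x)
        x = (x + a) % n
    return frozenset(elems)

def subgroups_of_Zn(n):
    """All subgroups of (Z_n, +), smallest first (each one is cyclic)."""
    found = {cyclic_subgroup(a, n) for a in range(n)}
    return sorted(found, key=lambda s: (len(s), sorted(s)))

subs = subgroups_of_Zn(6)
print([sorted(s) for s in subs])             # [[0], [0, 3], [0, 2, 4], [0, 1, 2, 3, 4, 5]]
print(all(6 % len(s) == 0 for s in subs))    # True: every order divides 6
```

The same call with another modulus, for example subgroups_of_Zn(12), lists one subgroup for each divisor of the modulus.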


5.2.4 COSETS AND QUOTIENT GROUPS 


Definitions: 

If H is a subgroup of a group G and a ∈ G, then the set aH = { ah | h ∈ H } is a left
coset of H in G. The set Ha = { ha | h ∈ H } is a right coset of H in G. (If G is
written additively, the cosets are written a + H and H + a.)

The index of a subgroup H in a group G, written (G:H) or [G:H], is the number of
left (or right) cosets of H in G.

A normal subgroup of a group G is a subgroup H of G such that aH = Ha for all
a ∈ G. The notation H ◁ G means that H is a normal subgroup of G.

If H is a normal subgroup of G, the quotient group (or factor group of G modulo H)
is the group G/H = { aH | a ∈ G }, where aH · bH = (ab)H.

If G is a group and a ∈ G, an element b ∈ G is a conjugate of a if b = gag^{-1} for
some g ∈ G.

If G is a group and a ∈ G, the set { x | x ∈ G, ax = xa } is the centralizer (or
normalizer) of a.

If G is a group, the set { x | x ∈ G, gx = xg for all g ∈ G } is the center of G.

If H is a subgroup of group G, the set { x | x ∈ G, xHx^{-1} = H } is the normalizer
of H.

Facts:

1. If H is a subgroup of a group G, then the following are equivalent:

• H is a normal subgroup of G;

• aHa^{-1} = a^{-1}Ha = H for all a ∈ G;

• a^{-1}ha ∈ H for all a ∈ G, h ∈ H;

• for all a ∈ G and h_1 ∈ H, there is h_2 ∈ H such that ah_1 = h_2 a.

2. If group G is abelian, then every subgroup H of G is normal. If G is not abelian, it
may happen that H is not normal.

3. If group G is finite, then (G:H) = |G|/|H|.

4. {e} and G are normal subgroups of group G.

5. In the group G/H, the identity is eH = H and the inverse of aH is a^{-1}H.

6. Fundamental homomorphism theorem: If φ: G → H is a homomorphism and has
kernel K, then K is a normal subgroup of G and G/K is isomorphic to φ(G).

7. If H is a normal subgroup of a group G and φ: G → G/H is defined by φ(g) = gH,
then φ is a homomorphism onto G/H with kernel H.

8. If H is a normal subgroup of a finite group G, then G/H has |G|/|H| cosets.

9. If H and K are normal subgroups of a group G, then H ∩ K is a normal subgroup
of G.

10. For all a ∈ G, the centralizer of a is a subgroup of G.

11. The center of a group is a subgroup of the group.

12. The normalizer of a subgroup of group G is a subgroup of G.

13. The index of the centralizer of a ∈ G is equal to the number of distinct conjugates
of a in G.

14. If a group G contains normal subgroups H and K such that H ∩ K = {e} and
{ hk | h ∈ H, k ∈ K } = G, then G is isomorphic to H × K.

15. If G is a group such that |G| = ab where a and b are relatively prime, and if G
contains normal subgroups H of order a and K of order b, then G is isomorphic to
H × K.

Examples: 

1. Z/nZ is isomorphic to Z_n, since φ: Z → Z_n defined by φ(g) = g mod n has kernel
nZ.

2. The left cosets of the subgroup H = {0, 4} in Z_8 are H + 0 = {0, 4}, H + 1 = {1, 5},
H + 2 = {2, 6}, H + 3 = {3, 7}. The index of H in Z_8 is (Z_8:H) = 4.

3. {(1), (1 2)} is not a normal subgroup of the symmetric group S_3 (§5.3.1).
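
The coset computation for H = {0, 4} in Z_8 can be reproduced with a few lines of code. This is an illustrative sketch only; the helper name cosets is an assumption, and the subgroup is passed as a plain list of residues.

```python
def cosets(H, n):
    """Distinct cosets a + H of a subgroup H of (Z_n, +), in order of discovery."""
    seen = []
    for a in range(n):
        c = sorted((a + h) % n for h in H)
        if c not in seen:
            seen.append(c)
    return seen

cs = cosets([0, 4], 8)
print(cs)                  # [[0, 4], [1, 5], [2, 6], [3, 7]]
print(len(cs) == 8 // 2)   # True: the index equals |G|/|H|
```

Because Z_8 is abelian, the left and right cosets coincide, so one function covers both.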


5.2.5 CYCLIC GROUPS AND ORDER

Definitions:

A group (G, ·) is cyclic if there is a ∈ G such that G = { a^n | n ∈ Z }, where a^0 = e and
a^{-n} = (a^{-1})^n for all positive integers n. If G is written additively, G = { na | n ∈ Z },
where 0a = 0 and if n > 0, na = a + a + ··· + a (n terms) and -na = (-a) + (-a) +
··· + (-a) (n terms).

The element a is called a generator of G and the group (G, ·) is written (⟨a⟩, ·), ⟨a⟩,
or (a).

The order of an element a ∈ G, written |⟨a⟩| or ord(a), is the smallest positive
integer n such that a^n = e (na = 0 if G is written additively). If there is no such
integer, then a has infinite order.

A subgroup H of a group (G, ·) is a cyclic subgroup if there is a ∈ H such that
H = { a^n | n ∈ Z }.


Facts: 

1. The order of an element a is equal to the number of elements in (a). 

2. Every group of prime order is cyclic.

3. Every cyclic group is abelian. However, not every abelian group is cyclic; for example,
(R, +) and the Klein four-group are abelian but not cyclic.

4. If G is an infinite cyclic group, then G ≅ (Z, +).

5. If G is a finite cyclic group of order n, then G ≅ (Z_n, +).

6. If G is a group of order n, then the order of every element of G is a divisor of n.

7. Cauchy's theorem: If G is a group of order n and p is a prime that divides n, then G
contains an element of order p. (Augustin-Louis Cauchy, 1789-1857)

8. If G is a cyclic group of order n generated by a, then G = {a, a^2, a^3, ..., a^n} and
a^n = e. The element a^k is also a generator of G if and only if k and n are relatively prime.

9. If G is a group and a ∈ G, then ⟨a⟩ is a cyclic subgroup of G.

10. Every subgroup of a cyclic group is cyclic.

11. If G is a group of order n and there is an element a ∈ G of order n, then G is cyclic
and G = ⟨a⟩.

Examples:

1. (Z, +) is cyclic and is generated by each of 1 and -1.

2. (Z_n, +) is cyclic and is generated by each element of Z_n that is relatively prime
to n. If a ∈ Z_n, then a has order n/gcd(a, n).

3. (Z_p, +), p prime, is a cyclic group generated by each of the elements 1, 2, ..., p - 1.
If a ≠ 0, a has order p.

4. (Z_n^*, ·) is cyclic if and only if n = 2, 4, p^k, or 2p^k, where k ≥ 1 and p is an odd
prime.
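
The order formula n/gcd(a, n) for elements of (Z_n, +) can be compared against the definition of order directly. The sketch below is illustrative; the function name order and the choice n = 12 are assumptions made for the example.

```python
from math import gcd

def order(a, n):
    """Order of a in (Z_n, +): the least k >= 1 with k*a = 0 (mod n)."""
    k, x = 1, a % n
    while x != 0:
        x = (x + a) % n
        k += 1
    return k

n = 12
# The directly computed order agrees with n / gcd(a, n) for every residue.
print(all(order(a, n) == n // gcd(a, n) for a in range(n)))   # True
# Generators are exactly the residues relatively prime to n.
print([a for a in range(n) if order(a, n) == n])              # [1, 5, 7, 11]
```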


5.2.6 SYLOW THEORY 

The Sylow theorems are used to help classify the nonisomorphic groups of a given order 
by guaranteeing the existence of subgroups of certain orders. (Peter Ludvig Mejdell 
Sylow, 1832-1918) 

Definitions: 

For prime p, a group G is a p-group if every element of G has order p^n for some nonnegative
integer n.

For prime p, a Sylow p-subgroup (Sylow subgroup) of G is a subgroup of G that
is a p-group and is not properly contained in any p-group in G.

Facts:

1. Sylow theorem: If G is a group of order p^m · q where p is a prime, m ≥ 1, and p ∤ q,
then:

• G contains subgroups of orders p, p^2, ..., p^m (hence, if prime p divides the order
of a finite group G, then G contains an element of order p);

• if H and K are Sylow p-subgroups of G, there is g ∈ G such that K = gHg^{-1}
(K is conjugate to H);

• the number of Sylow p-subgroups of G is kp + 1 for some integer k such that
(kp + 1) | q.

2. If G is a group of order pq where p and q are primes and p < q, then G contains a
normal subgroup of order q.

3. If G is a group of order pq where p and q are primes, p < q, and p ∤ (q - 1), then G
is cyclic.

Examples: 

1. Every group of order 15 is cyclic (by Fact 3). 

2 . Every group of order 21 contains a normal subgroup of order 7 (by Fact 2). 
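
The counting condition in the Sylow theorem (the number of Sylow p-subgroups has the form kp + 1 and divides q) narrows the possibilities enough to settle small cases by hand or by machine. The following sketch lists the admissible counts; the function name sylow_counts is hypothetical, and it assumes p is a prime dividing the given order.

```python
def sylow_counts(order, p):
    """Candidate numbers of Sylow p-subgroups for a group of the given order.

    Writes order = p**m * q with p not dividing q, then returns every
    divisor d of q with d = 1 (mod p). Assumes p is a prime dividing order.
    """
    q = order
    while q % p == 0:
        q //= p
    return [d for d in range(1, q + 1) if q % d == 0 and d % p == 1]

print(sylow_counts(15, 3), sylow_counts(15, 5))   # [1] [1]
print(sylow_counts(21, 7))                        # [1]
print(sylow_counts(21, 3))                        # [1, 7]
```

When the only candidate is 1, the Sylow p-subgroup is unique and therefore normal, since every conjugate of it is again a Sylow p-subgroup. For order 15 both Sylow subgroups are forced to be normal, which is consistent with Example 1; for order 21 only the subgroup of order 7 is forced, as in Example 2.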


5.2.7 SIMPLE GROUPS 

Simple groups arise as a fundamental part of the study of finite groups and the structure 
of their subgroups. An extensive, lengthy search by many mathematicians for all finite 
simple groups ended in 1980 when, as the result of hundreds of articles written by over 
one hundred mathematicians, the classification of all finite simple groups was completed. 
See [As86] and [Go82] for details. 

Definitions: 

A group G ≠ {e} is simple if its only normal subgroups are {e} and G.

A composition series for a group G is a finite sequence of subgroups H_1 = G, H_2, ...,
H_{n-1}, H_n = {e} such that H_{i+1} is a normal subgroup of H_i and H_i/H_{i+1} is simple, for
i = 1, ..., n - 1.

A finite group G is solvable if it has a sequence of subgroups H_1 = G, H_2, ..., H_{n-1},
H_n = {e} such that H_{i+1} is a normal subgroup of H_i and H_i/H_{i+1} is abelian, for
i = 1, ..., n - 1.

A sporadic group is one of 26 nonabelian finite simple groups that is not an alternating
group or a group of Lie type [Go82].

Facts: 

1. Every finite group has a composition series. Thus, simple groups (the quotient 
groups in the series) can be regarded as the building blocks of finite groups. 

2. Some infinite groups, such as (Z,+), do not have composition series. 

3. Every abelian group is solvable. 

4. An abelian group G is simple if and only if G ≅ Z_p where p is prime.

5. If G is a nonabelian solvable group, then G is not simple.

6. Every group of prime order is simple.

7. Every group of order p^n (p prime) is solvable.

8. Every group of order p^n q^m (p, q primes) is solvable.

9. If G is a solvable, simple finite group, then G is either {e} or Z_p (p prime).

10. If G is a simple group of odd order, then G ≅ Z_p for some prime p.

11 . There is no infinite simple, solvable group. 

12 . Burnside conjecture/Feit-Thompson theorem: In 1911 William Burnside conjec- 
tured that all groups of odd order are solvable. This conjecture was proved in 1963 by 
Walter Feit and John Thompson. (See Fact 13.) 

13 . Every nonabelian simple group has even order. (This follows from the Feit- 
Thompson theorem.) 

14 . The proof of the Burnside conjecture provided the impetus for a massive program 
to classify all finite simple groups. This program, organized by Daniel Gorenstein, led 
to hundreds of journal articles and concluded in 1980 when the classification problem 
was finally solved (Fact 15). [GoLySo94] 

15. Classification theorem for finite simple groups: Every finite simple group is of one
of the following types:

• abelian: Z_p where p is prime (§5.2.1);

• nonabelian:

    o alternating groups A_n (n ≥ 5) (§5.3.2);

    o groups of Lie type, which fall into 6 classes of classical groups and 10 classes
      of exceptional simple groups [Ca72];

    o sporadic groups. There are 26 sporadic groups, listed here from smallest to
      largest order. The letters in the names of the groups reflect the names of
      some of the people who conjectured the existence of the groups or proved
      the groups simple. M_11 (order 7,920), M_12, M_22, M_23, M_24, J_1, J_2, J_3,
      J_4, HS, Mc, Suz, Ru, He, Ly, ON, .1, .2, .3, M(22), M(23), M(24)', F_5,
      F_3, F_2, F_1 (the monster or Fischer-Griess group of order ≈ 10^54).


5.2.8 GROUP PRESENTATIONS 

Definitions: 

The balanced alphabet on the set X = {x_1, ..., x_n} is the set {x_1, x_1^{-1}, ..., x_n, x_n^{-1}},
whose elements are often called symbols.

Symbols x_j and x_j^{-1} of a balanced alphabet are inverses of each other. A double
inverse (x_j^{-1})^{-1} is understood as the identity operator, so that (x_j^{-1})^{-1} = x_j.

A word in X is a string s_1 s_2 ... s_n of symbols from the balanced alphabet on X.

The inverse of a word s = s_1 s_2 ... s_n is the word s^{-1} = s_n^{-1} ... s_2^{-1} s_1^{-1}.

The free semigroup W(X) has the set of words in X as its domain and string con-
catenation as its product operation.

A trivial relator in the set X = {x_1, ..., x_n} is a word of the form x_j x_j^{-1} or x_j^{-1} x_j.

A word u is freely equivalent to a word v, denoted u ~ v, if v can be obtained from u
by iteratively inserting and deleting trivial relators, in the usual sense of those string
operations. This is an equivalence relation, whose classes are called free equivalence
classes.

A reduced word is a word containing no instances of a trivial relator as a substring.

The free group F[X] has the set of free equivalence classes of words in X as its domain
and class concatenation as its product operation.

A group presentation is a pair (X: R ), where X is an alphabet and R is a set of words 
in X called relators. A group presentation is finite if X and R are both finite. 

A word u is R-equivalent to a word v under the group presentation (X:R), denoted
u ~_R v, if v can be obtained from u by iteratively inserting and deleting relators
from R or trivial relators. This is an equivalence relation, whose classes are called
R-equivalence classes.

The group Q(X:R) presented by the group presentation (X:R) has the set of R-
equivalence classes as its domain and class concatenation as its product operation.
Moreover, any group G isomorphic to Q(X:R) is said to be presented by the group
presentation (X:R).

The group G is finitely presentable if it has a presentation whose alphabet and relator
set are both finite.

The commutator of the words u and v is the word u^{-1} v^{-1} u v. Any word of this form
is called a commutator.

A conjugate of the word v is any word of the form u^{-1} v u.

Facts: 

1. Max Dehn (1911) formulated three fundamental decision problems for finite presen-
tations:

• word problem: Given an arbitrary presentation (X:R) and an arbitrary word
w, decide whether w is equivalent to the empty word (i.e., the group identity).

• conjugacy problem: Given an arbitrary presentation (X:R) and two arbitrary
words w_1 and w_2, decide whether w_1 is equivalent to a conjugate of w_2.

• isomorphism problem: Given two arbitrary presentations (X:R) and (Y:S),
decide whether they present isomorphic groups.

2. W. W. Boone (1955) and P. S. Novikov (1955) constructed presentations in which
the word problem is recursively unsolvable. This implies that there is no single finite
procedure that works for all finite presentations, thereby negatively solving Dehn's word
problem and conjugacy problem.

3. M. O. Rabin (1958) proved that it is impossible to decide even whether a presentation
presents the trivial group, which immediately implies that Dehn's isomorphism problem
is recursively unsolvable.

4. The word problem is recursively solvable in various special classes of group pre-
sentations, including the following: presentations with no relators (i.e., free groups),
presentations with only one relator, presentations in which the relator set includes the
commutator of each pair of generators (i.e., abelian groups).

5. The group Q(X:R) is the quotient of the free group F[X] by the normal closure
of the relator set R (the smallest normal subgroup of F[X] that contains R).

6. More information on group presentations can be found in [CoMo72], [CrFo63], and 
[MaKaSo65] . 

Examples: 

1. The cyclic group Z_k has the presentation (x: x^k).

2. The direct sum Z_r ⊕ Z_s has the presentation (x, y: x^r, y^s, x^{-1}y^{-1}xy).

3. The dihedral group D_q has the presentation (x, y: x^q, y^2, y^{-1}xyx).
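
For a presentation with no relators (a free group), the word problem is solved by free reduction: delete trivial relators until none remain, and compare the results. The sketch below illustrates this under an assumed encoding that is not from the Handbook: a lower-case letter stands for a generator and the matching upper-case letter for its inverse.

```python
def inverse(sym):
    """Inverse symbol under the assumed encoding: 'x' <-> 'X'."""
    return sym.swapcase()

def reduce_word(word):
    """Delete trivial relators x x^{-1} and x^{-1} x until none remain."""
    stack = []
    for sym in word:
        if stack and stack[-1] == inverse(sym):
            stack.pop()          # the last symbol and sym cancel
        else:
            stack.append(sym)
    return "".join(stack)

def freely_equivalent(u, v):
    """Two words are freely equivalent exactly when they reduce to the same word."""
    return reduce_word(u) == reduce_word(v)

print(reduce_word("xyYXxx"))          # xx
print(freely_equivalent("xyYX", ""))  # True: the word represents the identity
print(reduce_word("XYxy"))            # XYxy: a commutator is already reduced
```

The stack makes a single left-to-right pass sufficient, because a cancellation can only expose a new trivial relator at the top of the stack.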


5.3 PERMUTATION GROUPS


Permutations, as arrangements, are important tools used extensively in combinatorics 
(§2.3 and §2.7). The set of permutations on a given set forms a group, and it is this 
algebraic structure that is examined in this section. 


5.3.1 BASIC CONCEPTS 
Definitions: 

A permutation is a one-to-one and onto function σ: S → S, where S is any nonempty
set. If S = {a_1, a_2, ..., a_n}, a permutation σ is sometimes written as the 2 × n matrix

    σ = ( a_1    a_2    ...  a_n  )
        ( a_1σ   a_2σ   ...  a_nσ )

where a_iσ means σ(a_i).

A permutation σ: S → S is a cycle of length n if there is a subset of S of size n,
{a_1, a_2, ..., a_n}, such that a_1σ = a_2, a_2σ = a_3, ..., a_nσ = a_1, and aσ = a for all other
elements of S. Write σ = (a_1 a_2 ... a_n). A transposition is a cycle of length 2.

A permutation group (G, X) is a collection G of permutations on a nonempty set X
(whose elements are called objects) such that these permutations form a group under
composition. That is, if σ and τ are permutations in G, στ is the permutation in G
defined by the rule a(στ) = (aσ)τ. The order of the permutation group is |G|. The
degree of the permutation group is |X|.

The symmetric group on n elements is the group S_n of all permutations on the set
{1, 2, ..., n} under composition. (See Fact 1.)

An isomorphism from a permutation group (G, X) to a permutation group (H, Y) is
a pair of functions (α: G → H, f: X → Y) such that α is a group isomorphism and f is
one-to-one and onto Y.

If σ_1 = (a_{i_1} a_{i_2} ... a_{i_m}) and σ_2 = (a_{j_1} a_{j_2} ... a_{j_n}) are cycles on S, then σ_1 and σ_2 are
disjoint cycles if the sets {a_{i_1}, a_{i_2}, ..., a_{i_m}} and {a_{j_1}, a_{j_2}, ..., a_{j_n}} are disjoint.

An even permutation [odd permutation] is a permutation that can be written as
a product of an even [odd] number of transpositions.

The sign of a permutation (where the permutation is written as a product of transpo-
sitions) is +1 if it has an even number of transpositions and -1 if it has an odd number
of transpositions.

The identity permutation on S is the permutation ι: S → S such that xι = x for all
x ∈ S.

An involution is a permutation σ such that σ^2 = ι (the identity permutation).

The orbit of a ∈ S under σ is the set {..., aσ^{-2}, aσ^{-1}, a, aσ, aσ^2, ...}.

Facts: 

1. Symmetric group of degree n: The set of permutations on a nonempty set X is a
group, where the group operation is composition of permutations: σ_1σ_2 is defined by
x(σ_1σ_2) = (xσ_1)σ_2. The identity is the identity permutation ι. The inverse of σ is the
permutation σ^{-1}, where xσ^{-1} = y if and only if yσ = x. If |X| = n, the group of
permutations is written S_n, the symmetric group of degree n.

2. Multiplication of permutations is not commutative. (See Examples 1 and 4.)

3. A permutation π is an involution if and only if π = π^{-1}.

4. The number of involutions in S_n, denoted inv(n), is equal to the number of Young
tableaux that can be formed from the set {1, 2, ..., n}. (See §2.8.)

5. Permutations can be used to find determinants of matrices. (See §6.3.)

6. Every permutation on a finite set can be written as a product of disjoint cycles.

7. Cycle notation is not unique: for example, (1 4 7 5) = (4 7 5 1) = (7 5 1 4) = (5 1 4 7).

8. Every permutation is either even or odd, and no permutation is both even and odd.
Hence, every permutation has a unique sign.

9. Each cycle of length k can be written as a product of k - 1 transpositions:

    (x_1 x_2 x_3 ... x_k) = (x_1 x_2)(x_1 x_3)(x_1 x_4) ... (x_1 x_k).

10. S_n has order n!.

11. S_n is not abelian for n ≥ 3. For example, (1 2)(1 3) ≠ (1 3)(1 2).

12. The order of a permutation that is a single cycle is the length of the cycle. For
example, (1 5 4) has order 3.

13. The order of a permutation that is written as a product of disjoint cycles is equal
to the least common multiple of the lengths of the cycles.

14. Cayley's theorem: If G is a finite group of order n, then G is isomorphic to a
subgroup of S_n. (See §5.2.2.)

15. Let G be a group of permutations on a set X (such a group is said to act on X).
Then G induces an equivalence relation R on the set X by the following rule: for
a, b ∈ X, aRb if and only if there is a permutation σ ∈ G such that aσ = b.
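
Facts 6, 9, and 13 translate directly into a short computation: split a permutation into disjoint cycles, take the least common multiple of the cycle lengths for the order, and count k - 1 transpositions per cycle of length k for the sign. The sketch below is illustrative only; the dictionary representation (x mapped to xσ) and the function names are assumptions, and math.lcm requires Python 3.9 or later.

```python
from math import lcm  # available from Python 3.9

def cycles(perm):
    """Disjoint cycles of a permutation given as a dict {x: image of x}."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cyc, x = [], start
        while x not in seen:
            seen.add(x)
            cyc.append(x)
            x = perm[x]
        result.append(tuple(cyc))
    return result

def order(perm):
    """Order of the permutation: lcm of its cycle lengths (Fact 13)."""
    return lcm(*(len(c) for c in cycles(perm)))

def sign(perm):
    """+1 or -1; a cycle of length k contributes k - 1 transpositions (Fact 9)."""
    return (-1) ** sum(len(c) - 1 for c in cycles(perm))

sigma = {1: 4, 2: 6, 3: 3, 4: 7, 5: 1, 6: 2, 7: 5}
print(cycles(sigma))   # [(1, 4, 7, 5), (2, 6), (3,)]
print(order(sigma))    # 4
print(sign(sigma))     # 1
```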


Examples: 

1. If

    σ = ( 1 2 3 4 5 )   and   τ = ( 1 2 3 4 5 ),
        ( 5 1 2 4 3 )             ( 4 5 1 3 2 )

then

    στ = ( 1 2 3 4 5 )   and   τσ = ( 1 2 3 4 5 ).
         ( 2 4 5 3 1 )              ( 4 3 5 2 1 )

Note that στ ≠ τσ.

2. All elements of S_n can be written in cycle notation. For example,

    σ = ( 1 2 3 4 5 6 7 ) = (1 4 7 5)(2 6)(3).
        ( 4 6 3 7 1 2 5 )

Each cycle describes the orbit of the elements in that cycle. For example, (1 4 7 5)
is a cycle of length 4, and indicates that 1σ = 4, 4σ = 7, 7σ = 5, and 5σ = 1. The
cycle (3) indicates that 3σ = 3. If a cycle has length 1, that cycle can be omitted when
a permutation is written as a product of cycles: (1 4 7 5)(2 6)(3) = (1 4 7 5)(2 6).

3. Multiplication of permutations written in cycle notation can be performed easily.
For example: if σ = (1 5 3 2) and τ = (1 4 3)(2 5), then στ = (1 5 3 2)(1 4 3)(2 5) =
(1 2 4 3 5). (Moving from left to right through the product of cycles, trace the orbit of
each element. For example, 3σ = 2 and 2τ = 5; therefore 3στ = 5.)

4. Multiplication of cycles need not be commutative. For example, (1 2)(1 3) = (1 2 3),
(1 3)(1 2) = (1 3 2), but (1 2 3) ≠ (1 3 2). However, disjoint cycles commute.

5. If the group of permutations G = {ι, (1 2), (3 5), (1 2)(3 5)} acts on the set
S = {1, 2, 3, 4, 5}, then the partition of S resulting from the equivalence relation induced
by G is {{1, 2}, {3, 5}, {4}}. (See Fact 15.)

6. Let group G = {ι, (1 2)} act on X = {1, 2} and group H = {ι, (1 2)(3)} act on
Y = {1, 2, 3}. The permutation groups (G, X) and (H, Y) are not isomorphic since
there is no bijection between X and Y (even though G and H are isomorphic groups).
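
The left-to-right product rule and the partition into orbits can both be checked mechanically. The following sketch is illustrative rather than definitive: permutations are stored as dictionaries on {1, ..., n}, the helper names are assumptions, and the orbit routine only follows forward images, which suffices on a finite set because each inverse is a power of the permutation.

```python
def from_cycles(cycle_list, n):
    """Permutation of {1, ..., n} as a dict, built from disjoint cycles."""
    perm = {x: x for x in range(1, n + 1)}
    for cyc in cycle_list:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return perm

def compose(sigma, tau):
    """Left-to-right product: x(sigma tau) = (x sigma) tau."""
    return {x: tau[sigma[x]] for x in sigma}

def orbits(perms, points):
    """Partition of points induced by the given permutations."""
    remaining, parts = set(points), []
    while remaining:
        orbit, frontier = set(), [min(remaining)]
        while frontier:
            x = frontier.pop()
            if x not in orbit:
                orbit.add(x)
                frontier.extend(p[x] for p in perms)
        parts.append(sorted(orbit))
        remaining -= orbit
    return parts

sigma = from_cycles([(1, 5, 3, 2)], 5)
tau = from_cycles([(1, 4, 3), (2, 5)], 5)
print(compose(sigma, tau))   # {1: 2, 2: 4, 3: 5, 4: 3, 5: 1}, i.e. (1 2 4 3 5)

gens = [from_cycles([(1, 2)], 5), from_cycles([(3, 5)], 5)]
print(orbits(gens, range(1, 6)))   # [[1, 2], [3, 5], [4]]
```

Swapping the arguments of compose gives the other product, which in general differs, in line with the remark that multiplication of permutations is not commutative.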


5.3.2 EXAMPLES OF PERMUTATION GROUPS


Definitions:

The alternating group on n elements (n ≥ 2) is the subgroup A_n of S_n consisting of
all even permutations.

The dihedral group (octic group) D_n is the group of rigid motions (rotations and
reflections) of a regular polygon with n sides under composition.

The Klein four-group (or Viergruppe or the group of the rectangle) is the group
under composition of the four rigid motions of a rectangle that leave the rectangle in
its original location. (Felix Klein, 1849-1925)

Given a permutation σ: S → S, the induced pair permutation is the permutation σ^(2)
on unordered pairs of elements of S given by the rule σ^(2)({x, y}) = {σ(x), σ(y)}.

Given a permutation group G acting on a set S, the induced pair-action group G^(2)
is the group of induced pair-permutations { σ^(2) | σ ∈ G } under composition.

Given a permutation σ: S → S, the ordered pair-permutation is the permutation σ^[2]
on the set S × S given by the rule σ^[2]((x, y)) = (σ(x), σ(y)).

Given a permutation group G acting on a set S, the ordered pair-action group G^[2]
is the group of ordered pair-permutations { σ^[2] | σ ∈ G } under composition.


Facts: 

1. Some common subgroups of S_n are listed in the following table.

subgroup                order   description
---------------------   -----   ------------------------------------------
symmetric group S_n     n!      all permutations of {1, 2, ..., n}
alternating group A_n   n!/2    all even permutations of {1, 2, ..., n}
dihedral group D_n      2n      rigid motions of regular n-gon in
                                3-dimensional space (Example 2)
Klein 4-group           4       rigid motions of rectangle in
(subgroup of S_4)               3-dimensional space (Example 3)
identity                1       consists only of identity permutation


2. The group A_n is abelian if n = 2 or 3, and is nonabelian if n ≥ 4.

3. The group D_n has order 2n. The elements consist of the n rotations and n re-
flections of a regular polygon with n sides. The n rotations are the counterclockwise
rotations about the center through angles of 360k/n degrees (k = 0, 1, ..., n - 1). (Clock-
wise rotations can be written in terms of counterclockwise rotations.) If n is odd, the
n reflections are reflections in lines through a vertex and the center; if n is even, the
reflections are reflections in lines joining opposite vertices and in lines joining midpoints
of opposite sides.

4. The elements of D_n can be written as permutations of {1, 2, ..., n}. See the following
figure for the rigid motions in D_4 (the rigid motions of the square) and the following
table for the group multiplication table for D_4.


[Figure: the eight rigid motions of the square, with vertices labeled 1, 2, 3, 4.]

    e = 0° CCW rotation             (1)
    90° CCW rotation                (1 2 3 4)
    180° CCW rotation               (1 3)(2 4)
    270° CCW rotation               (1 4 3 2)
    reflection in vertical line     (1 2)(3 4)
    reflection in horizontal line   (1 4)(2 3)
    reflection in 1-3 diagonal      (2 4)
    reflection in 2-4 diagonal      (1 3)




         | (1)      (1234)   (13)(24) (1432)   (12)(34) (14)(23) (24)     (13)
---------+------------------------------------------------------------------------
(1)      | (1)      (1234)   (13)(24) (1432)   (12)(34) (14)(23) (24)     (13)
(1234)   | (1234)   (13)(24) (1432)   (1)      (24)     (13)     (14)(23) (12)(34)
(13)(24) | (13)(24) (1432)   (1)      (1234)   (14)(23) (12)(34) (13)     (24)
(1432)   | (1432)   (1)      (1234)   (13)(24) (13)     (24)     (12)(34) (14)(23)
(12)(34) | (12)(34) (13)     (14)(23) (24)     (1)      (13)(24) (1432)   (1234)
(14)(23) | (14)(23) (24)     (12)(34) (13)     (13)(24) (1)      (1234)   (1432)
(24)     | (24)     (12)(34) (13)     (14)(23) (1234)   (1432)   (1)      (13)(24)
(13)     | (13)     (14)(23) (24)     (12)(34) (1432)   (1234)   (13)(24) (1)


5. The Klein four-group consists of the following four rigid motions of a rectangle: the 
rotations about the center through 0° or 180°, and reflections through the horizontal 
or vertical lines through its center, as illustrated in the following figure. The following 
table is the multiplication table for the Klein four-group. 


[Figure: the four rigid motions of the rectangle, with vertices labeled 1, 2, 3, 4.]

    e = 0° CCW rotation             (1)
    180° CCW rotation               (1 3)(2 4)
    reflection in vertical line     (1 2)(3 4)
    reflection in horizontal line   (1 4)(2 3)


         | (1)      (13)(24) (12)(34) (14)(23)
---------+------------------------------------
(1)      | (1)      (13)(24) (12)(34) (14)(23)
(13)(24) | (13)(24) (1)      (14)(23) (12)(34)
(12)(34) | (12)(34) (14)(23) (1)      (13)(24)
(14)(23) | (14)(23) (12)(34) (13)(24) (1)

6. The Klein four-group is isomorphic to Z_2 × Z_2.

7. The induced permutation group S_n^(2) and the ordered-pair-action group S_n^[2] are used
in enumerative graph theory. (See §8.9.1.)

8. The induced permutation group S_n^(2) has C(n, 2) = n(n - 1)/2 objects and n! permutations.

9. The ordered-pair-action permutation group S_n^[2] has n^2 objects and n! permutations.
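
The orders stated for D_n and the Klein four-group can be confirmed by generating each group from a small set of permutations and closing under composition. This is a sketch under stated assumptions, not the Handbook's construction: permutations are tuples whose position i - 1 holds the image of i, products are taken left to right, and the generators chosen for D_n (the rotation (1 2 ... n) and the reflection fixing vertex 1) are one convenient choice among several.

```python
def compose(s, t):
    """Left-to-right product of permutations stored as tuples of images."""
    return tuple(t[s[i] - 1] for i in range(len(s)))

def generate(gens):
    """Close a list of permutations of a finite set under composition."""
    n = len(gens[0])
    identity = tuple(range(1, n + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for h in gens:
            gh = compose(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group

def dihedral(n):
    """D_n as permutations of the vertices 1, ..., n (n >= 3)."""
    rotation = tuple(list(range(2, n + 1)) + [1])      # (1 2 ... n)
    reflection = tuple([1] + list(range(n, 1, -1)))    # fixes vertex 1
    return generate([rotation, reflection])

print(len(dihedral(4)), len(dihedral(5)))    # 8 10

klein = generate([(3, 4, 1, 2), (2, 1, 4, 3)])   # (1 3)(2 4) and (1 2)(3 4)
print(len(klein))                                 # 4
print(all(compose(g, g) == (1, 2, 3, 4) for g in klein))   # True: every element is an involution
```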


5.4 RINGS 


5.4.1 BASIC CONCEPTS 
Definitions: 

A ring (R, +, ·) consists of a set R closed under binary operations + and · such that:

• (R, +) is an abelian group; i.e., (R, +) satisfies:

    o associative property: a + (b + c) = (a + b) + c for all a, b, c ∈ R;
    o identity property: R has an identity element, 0, that satisfies 0 + a =
      a + 0 = a for all a ∈ R;
    o inverse property: for each a ∈ R there is an additive inverse element -a ∈ R
      (the negative of a) such that -a + a = a + (-a) = 0;
    o commutative law: a + b = b + a for all a, b ∈ R;

• the operation · is associative: a · (b · c) = (a · b) · c for all a, b, c ∈ R;

• the distributive properties for multiplication over addition hold for all a, b, c ∈ R:

    o left distributive property: a · (b + c) = a · b + a · c;
    o right distributive property: (a + b) · c = a · c + b · c.

A ring R is commutative if the multiplication operation is commutative: a · b = b · a
for all a, b ∈ R.

A ring R is a ring with unity if there is an identity, 1 (≠ 0), for multiplication; i.e.,
1 · a = a · 1 = a for all a ∈ R. The multiplicative identity is the unity of R.

An element x in a ring R with unity is a unit if x has a multiplicative inverse; i.e., there
is x^{-1} ∈ R such that x · x^{-1} = x^{-1} · x = 1.

Subtraction in a ring is defined by the rule a - b = a + (-b).

Facts: 

1. Multiplication, a · b, is often written ab or a × b.

2. The order of precedence of operations in a ring follows that for real numbers: multi-
plication is to be done before addition. That is, a + bc means a + (bc) rather than (a + b)c.

3. In all rings, a0 = 0a = 0.

4. Properties of subtraction:

    -(-a) = a        (-a)(-b) = ab        a(b - c) = ab - ac        (a - b)c = ac - bc
    a(-b) = (-a)b = -(ab)        (-1)a = -a (if the ring has unity).

5. The set of all units of a ring is a group under the multiplication defined on the ring. 
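
For the modular ring Z_n, the units are the residues relatively prime to n, and Fact 5 can be checked by testing the group axioms directly. The following is a minimal sketch with illustrative function names; it assumes n ≥ 2 and uses the standard fact that a has an inverse modulo n exactly when gcd(a, n) = 1.

```python
from math import gcd

def units(n):
    """Units of the ring Z_n (n >= 2): residues with a multiplicative inverse."""
    return [a for a in range(1, n) if gcd(a, n) == 1]

def is_group_under_mult(elems, n):
    """Check identity, closure, and inverses modulo n (associativity is inherited)."""
    s = set(elems)
    closed = all((a * b) % n in s for a in s for b in s)
    has_inverses = all(any((a * b) % n == 1 for b in s) for a in s)
    return 1 in s and closed and has_inverses

print(units(12))                            # [1, 5, 7, 11]
print(is_group_under_mult(units(12), 12))   # True
print(is_group_under_mult(range(1, 12), 12))  # False: 2 * 6 = 0 is not in the set
```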

Examples:

1. Table 1 gives several examples of rings.

2. Polynomial rings: For a ring R, the set

    R[x] = { a_n x^n + ··· + a_1 x + a_0 | a_0, a_1, ..., a_n ∈ R }

forms a ring, where the elements are added and multiplied using the "usual" rules for
addition and multiplication of polynomials. The additive identity, 0, is the constant
polynomial p(x) = 0; the unity is the constant polynomial p(x) = 1 if R has a unity 1.
(See §5.5.)

3. Product rings: For rings R and S, the set R × S = { (r, s) | r ∈ R, s ∈ S } forms a
ring, where

    (r_1, s_1) + (r_2, s_2) = (r_1 + r_2, s_1 + s_2);

    (r_1, s_1) · (r_2, s_2) = (r_1 r_2, s_1 s_2).

The additive identity is (0, 0). Unity is (1, 1) if R and S each have unity 1. Product
rings can have more than two factors: R_1 × R_2 × ··· × R_k or R^n = R × ··· × R.
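
The "usual" rule for multiplying polynomials can be made concrete over the coefficient ring Z_n. The sketch below is an illustration under assumptions that are not in the Handbook: a polynomial is a nonempty list of coefficients [a_0, a_1, ...], and the function name poly_mul is hypothetical.

```python
def poly_mul(p, q, n):
    """Product in Z_n[x]; polynomials are nonempty coefficient lists [a_0, a_1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] = (out[i + j] + a * b) % n
    while len(out) > 1 and out[-1] == 0:   # drop leading zero coefficients
        out.pop()
    return out

print(poly_mul([1, 1], [1, 1], 5))   # [1, 2, 1]: (1 + x)^2 = 1 + 2x + x^2
print(poly_mul([1, 2], [1, 2], 4))   # [1]: (1 + 2x)^2 = 1 + 4x + 4x^2 = 1 in Z_4[x]
```

The second line shows that 1 + 2x is its own inverse in Z_4[x], so a polynomial ring over a ring with zero divisors can contain nonconstant units.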


5.4.2 SUBRINGS AND IDEALS 

Definitions: 

A subset S of a ring (R, +, ·) is a subring of R if (S, +, ·) is a ring using the same
operations + and · that are used in R.

A subset I of a ring (R, +, ·) is an ideal of R if:

• (I, +, ·) is a subring of (R, +, ·);

• I is closed under left and right multiplication by elements of R: if x ∈ I and
r ∈ R, then rx ∈ I and xr ∈ I.

In a commutative ring R, an ideal I is principal if there is r ∈ R such that I = Rr =
{ xr | x ∈ R }. I is the principal ideal generated by r, written I = (r).

In a commutative ring R, an ideal I ≠ R is maximal if the only ideal properly con-
taining I is R.

In a commutative ring R, an ideal I ≠ R is prime if ab ∈ I implies that a ∈ I or b ∈ I.

Facts: 

1. If $S$ is a nonempty subset of a ring $(R, +, \cdot)$, then $S$ is a subring of $R$ if and only
if $S$ is closed under subtraction and multiplication.

2. An ideal in a ring $(R, +, \cdot)$ is a subgroup of the group $(R, +)$, but not necessarily
conversely.

3. The intersection of ideals in a ring is an ideal.

4. If $R$ is any ring, $R$ and $\{0\}$ are ideals, called trivial ideals.

5. In a commutative ring with unity, every maximal ideal is a prime ideal.

6. Every ideal $I$ in the ring $\mathbb{Z}$ is a principal ideal. If $I \neq \{0\}$, then $I = (r)$ where $r$ is
the smallest positive integer in $I$.

7. If $R$ is a commutative ring with unity, then $R$ is a field (see §5.6) if and only if the
only ideals of $R$ are $R$ and $\{0\}$.

Table 1 Examples of rings.

| set and addition and multiplication operations | 0 | 1 |
|---|---|---|
| $\{0\}$, usual addition and multiplication; (trivial ring) | 0 | none |
| $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, with usual $+$ and $\cdot$ | 0 | 1 |
| $\mathbb{Z}_n = \{0, 1, \ldots, n-1\}$ ($n$ a positive integer), $a + b = (a + b) \bmod n$, $a \cdot b = (ab) \bmod n$; (modular ring) | 0 | 1 |
| $\mathbb{Z}[\sqrt{2}] = \{\, a + b\sqrt{2} \mid a, b \in \mathbb{Z} \,\}$, $(a + b\sqrt{2}) + (c + d\sqrt{2}) = (a + c) + (b + d)\sqrt{2}$, $(a + b\sqrt{2}) \cdot (c + d\sqrt{2}) = (ac + 2bd) + (ad + bc)\sqrt{2}$ [Similar rings can be constructed using $\sqrt{n}$ ($n$ an integer) if $\sqrt{n}$ is not an integer.] | $0 + 0\sqrt{2}$ | $1 + 0\sqrt{2}$ |
| $\mathbb{Z}[i] = \{\, a + bi \mid a, b \in \mathbb{Z} \,\}$; (Gaussian integers; see §5.4.2, Example 2.) | $0 + 0i$ | $1 + 0i$ |
| $\mathcal{M}_{n \times n}(R)$ = all $n \times n$ matrices with entries in a ring $R$ with unity, matrix addition and multiplication; (matrix ring) | $O_n$ (zero matrix) | $I_n$ (identity matrix) |
| $\mathcal{F} = \{\, f \mid f\colon A \to B \,\}$ ($A$ any nonempty set and $B$ any ring), $(f + g)(x) = f(x) + g(x)$, $(f \cdot g)(x) = f(x) \cdot g(x)$; (ring of functions) | $f$ such that $f(x) = 0$ for all $x \in A$ | $f$ such that $f(x) = 1$ for all $x \in A$ (if $B$ has unity) |
| $\mathcal{P}(S)$ = all subsets of a set $S$, $A + B = A \,\Delta\, B = (A \cup B) - (A \cap B)$ (symmetric difference), $A \cdot B = A \cap B$; (Boolean ring) | $\emptyset$ | $S$ |
| $\{\, a + bi + cj + dk \mid a, b, c, d \in \mathbb{R} \,\}$, $i, j, k$ in quaternion group, elements are added and multiplied like polynomials using $ij = k$, etc.; (ring of real quaternions, §5.2.2) | $0 + 0i + 0j + 0k$ | $1 + 0i + 0j + 0k$ |


8. An ideal in a ring is the analogue of a normal subgroup in a group. 

9. The second condition in the definition of ideal can be stated as $rI \subseteq I$ ($I$ is a left
ideal) and $Ir \subseteq I$ ($I$ is a right ideal). (If $A$ is a subset of a ring $R$ and $r \in R$, then
$rA = \{\, ra \mid a \in A \,\}$ and $Ar = \{\, ar \mid a \in A \,\}$.)
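
Fact 6 can be checked numerically for ideals of Z generated by two integers. The sketch below is an illustration, not a proof: it enumerates only a bounded window of the ideal (the function names and the bound are assumptions chosen for the example) and compares the smallest positive element found with the gcd of the generators.

```python
from math import gcd

def ideal_window(a, b, bound=50):
    """Elements x*a + y*b of the ideal generated by a and b in Z, with |x|, |y| <= bound."""
    return {x * a + y * b
            for x in range(-bound, bound + 1)
            for y in range(-bound, bound + 1)}

def smallest_positive(a, b, bound=50):
    """Smallest positive element found in the bounded window."""
    return min(v for v in ideal_window(a, b, bound) if v > 0)

# The smallest positive element generates the ideal, and it equals gcd(a, b).
r = smallest_positive(12, 18)
assert r == gcd(12, 18) == 6
assert all(v % r == 0 for v in ideal_window(12, 18))
```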

Examples: 

1. With the usual definitions of $+$ and $\cdot$, each of the following rings can be viewed as
a subring of all the rings listed after it: $\mathbb{Z}$, $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$.

2. Gaussian integers: $\mathbb{Z}[i] = \{\, a + bi \mid a, b \in \mathbb{Z} \,\}$ using the addition and multiplication
of $\mathbb{C}$ is a subring of the ring of complex numbers.

3. The ring $\mathbb{Z}$ is a subring of $\mathbb{Z}[\sqrt{2}]$ and $\mathbb{Z}[\sqrt{2}]$ is a subring of $\mathbb{R}$.

4. Each set $n\mathbb{Z}$ ($n$ an integer) is a principal ideal in the ring $\mathbb{Z}$.


5.4.3 RING HOMOMORPHISM AND ISOMORPHISM


Definitions: 

If $R$ and $S$ are rings, a function $\varphi\colon R \to S$ is a ring homomorphism if for all $a, b \in R$:

• $\varphi(a + b) = \varphi(a) + \varphi(b)$ ($\varphi$ preserves addition)

• $\varphi(ab) = \varphi(a)\varphi(b)$ ($\varphi$ preserves multiplication).

Note: $\varphi(a)$ is sometimes written $a\varphi$.

If a ring homomorphism $\varphi$ is also one-to-one and onto $S$, then $\varphi$ is a ring isomorphism
and $R$ and $S$ are isomorphic, written $R \cong S$.

A ring endomorphism is a ring homomorphism $\varphi\colon R \to R$.

A ring automorphism is a ring isomorphism $\varphi\colon R \to R$.

The kernel of a ring homomorphism $\varphi\colon R \to S$ is $\varphi^{-1}(0) = \{\, x \in R \mid \varphi(x) = 0 \,\}$.

Facts: 

1. If $\varphi$ is a ring isomorphism, then $\varphi^{-1}$ is a ring isomorphism.

2. The kernel of a ring homomorphism from $R$ to $S$ is an ideal of the ring $R$.

3. If $\varphi\colon R \to S$ is a ring homomorphism, $\varphi(R)$ is a subring of $S$.

4. If $\varphi\colon R \to S$ is a ring homomorphism and $R$ has unity, either $\varphi(1) = 0$ or $\varphi(1)$ is
unity for $\varphi(R)$.

5. If $\varphi$ is a ring homomorphism, then $\varphi(0) = 0$ and $\varphi(-a) = -\varphi(a)$.

6. A ring homomorphism $\varphi$ is a ring isomorphism between $R$ and $\varphi(R)$ if and only if the
kernel of $\varphi$ is $\{0\}$.

7. Homomorphisms preserve subrings: Let $\varphi\colon R \to S$ be a ring homomorphism. If $A$
is a subring of $R$, then $\varphi(A)$ is a subring of $S$. If $B$ is a subring of $S$, then $\varphi^{-1}(B)$ is a
subring of $R$.

8. Homomorphisms preserve ideals: Let $\varphi\colon R \to S$ be a ring homomorphism. If $A$ is
an ideal of $R$, then $\varphi(A)$ is an ideal of $\varphi(R)$ (and hence of $S$ when $\varphi$ is onto). If $B$ is an
ideal of $S$, then $\varphi^{-1}(B)$ is an ideal of $R$.

Examples: 

1. The function $\varphi\colon \mathbb{Z} \to \mathbb{Z}_n$ defined by the rule $\varphi(a) = a \bmod n$ is a ring homomorphism.

2. If $R$ and $S$ are rings, then the function $\varphi\colon R \to S$ defined by the rule $\varphi(a) = 0$ for
all $a \in R$ is a ring homomorphism.

3. The function $\varphi\colon \mathbb{Z} \to R$ ($R$ any ring with unity) defined by the rule $\varphi(x) = x \cdot 1$ is a
ring homomorphism. The kernel of $\varphi$ is the subring $n\mathbb{Z}$ for some nonnegative integer $n$,
called the characteristic of $R$.

4. Let $\mathcal{P}(S)$ be the ring of all subsets of a set $S$ (see Table 1). If $|S| = 1$, then
$\mathcal{P}(S) \cong \mathbb{Z}_2$ with the ring isomorphism $\varphi$ where $\varphi(\emptyset) = 0$ and $\varphi(S) = 1$. More generally,
if $|S| = n$, then $\mathcal{P}(S) \cong \mathbb{Z}_2^n = \mathbb{Z}_2 \times \cdots \times \mathbb{Z}_2$.

5. $\mathbb{Z}_n \cong \mathbb{Z}/(n)$ for all positive integers $n$. (See §5.4.4.)

6. $\mathbb{Z}_m \times \mathbb{Z}_n \cong \mathbb{Z}_{mn}$, if $m$ and $n$ are relatively prime.
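
Examples 1 and 6 can be checked by brute force for small moduli. The following sketch assumes the natural candidate map a -> (a mod m, a mod n) from Z_mn to Z_m x Z_n; the function names are hypothetical. Note that a failed check only shows that this particular map is not an isomorphism.

```python
from math import gcd

def crt_map(a, m, n):
    """Image of a in Z_mn under a -> (a mod m, a mod n)."""
    return (a % m, a % n)

def is_ring_isomorphism(m, n):
    """Check that crt_map is a bijection Z_mn -> Z_m x Z_n preserving + and *."""
    mn = m * n
    images = {crt_map(a, m, n) for a in range(mn)}
    if len(images) != mn:          # not one-to-one, hence not onto either
        return False
    for a in range(mn):
        for b in range(mn):
            pa, pb = crt_map(a, m, n), crt_map(b, m, n)
            if crt_map((a + b) % mn, m, n) != ((pa[0] + pb[0]) % m, (pa[1] + pb[1]) % n):
                return False
            if crt_map((a * b) % mn, m, n) != ((pa[0] * pb[0]) % m, (pa[1] * pb[1]) % n):
                return False
    return True

assert gcd(3, 4) == 1 and is_ring_isomorphism(3, 4)
assert gcd(2, 4) != 1 and not is_ring_isomorphism(2, 4)
```

For m = 2 and n = 4 the map still preserves both operations but is not one-to-one (0 and 4 have the same image), which is where the hypothesis that m and n are relatively prime is used.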


5.4.4 QUOTIENT RINGS


Definitions: 

If $I$ is an ideal in a ring $R$ and $a \in R$, then the set $a + I = \{\, a + x \mid x \in I \,\}$ is a coset
of $I$ in $R$.

The set of all cosets, $R/I = \{\, a + I \mid a \in R \,\}$, is a ring, called the quotient ring, where
addition and multiplication are defined by the rules:

• $(a + I) + (b + I) = (a + b) + I$;

• $(a + I) \cdot (b + I) = (ab) + I$.

Facts: 

1. If $R$ is commutative, then $R/I$ is commutative.

2. If $R$ has unity 1, then $R/I$ has the coset $1 + I$ as unity.

3. If $I$ is an ideal in ring $R$, the function $\varphi\colon R \to R/I$ defined by the rule $\varphi(x) = x + I$
is a ring homomorphism, called the natural map. The kernel of $\varphi$ is $I$.

4. Fundamental homomorphism theorem for rings: If $\varphi$ is a ring homomorphism and $K$
is the kernel of $\varphi$, then $\varphi(R) \cong R/K$.

5. If $R$ is a commutative ring with unity and $I$ is an ideal in $R$, then $I$ is a maximal
ideal if and only if $R/I$ is a field (see §5.6).

Examples:

1. For each positive integer $n$, $\mathbb{Z}/n\mathbb{Z}$ is a quotient ring, isomorphic to $\mathbb{Z}_n$.

2. See §5.6.1 for Galois rings. 


5.4.5 RINGS WITH ADDITIONAL PROPERTIES 


Beginning with rings, as additional requirements are added, the following hierarchy of 
sets of algebraic structures is obtained: 


rings $\supset$ commutative rings with unity $\supset$ integral domains $\supset$ principal ideal domains $\supset$ Euclidean domains


Definitions: 

The cancellation properties in a ring $R$ state that for all $a, b, c \in R$:
if $ab = ac$ and $a \neq 0$, then $b = c$ (left cancellation property);
if $ba = ca$ and $a \neq 0$, then $b = c$ (right cancellation property).

Let $R$ be a ring and let $a, b \in R$ where $a \neq 0$, $b \neq 0$. If $ab = 0$, then $a$ is a left divisor
of zero and $b$ is a right divisor of zero.

An integral domain is a commutative ring with unity that has no zero divisors.

A principal ideal domain (PID) is an integral domain in which every ideal is a
principal ideal.

A division ring is a ring with unity in which every nonzero element is a unit (i.e.,
every nonzero element has a multiplicative inverse).

A field is a commutative ring with unity such that each nonzero element has a
multiplicative inverse. (See §5.6.)

A Euclidean norm on an integral domain $R$ is a function $\delta\colon R - \{0\} \to \{0, 1, 2, \ldots\}$
such that:

• $\delta(a) \leq \delta(ab)$ for all $a, b \in R - \{0\}$;

• the following generalization of the division algorithm for integers holds: for all
$a, d \in R$ where $d \neq 0$, there are elements $q, r \in R$ such that $a = dq + r$, where
either $r = 0$ or $\delta(r) < \delta(d)$.

A Euclidean domain is an integral domain with a Euclidean norm defined on it.

Facts: 

1. The cancellation properties hold in an integral domain. 

2. Every finite integral domain is a field. 

3. Every integral domain can be imbedded in a field. Given an integral domain $R$,
there is a field $F$ and a one-to-one ring homomorphism $\varphi\colon R \to F$ such that $\varphi(1) = 1$.

4. A ring with unity is a division ring if and only if the nonzero elements form a group
under the multiplication defined on the ring.

5. Wedderburn's theorem: Every finite division ring is a field. (J. H. M. Wedderburn,
1882-1948)

6. Every commutative division ring is a field.

7. In a Euclidean domain, if $b \neq 0$ is not a unit, then $\delta(ab) > \delta(a)$ for all $a \neq 0$. For
$b \neq 0$, $b$ is a unit in $R$ if and only if $\delta(b) = \delta(1)$.

8. In every Euclidean domain, a Euclidean algorithm for finding the gcd can be carried 
out. 
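
As an illustration of Fact 8, the Euclidean algorithm can be carried out in the Gaussian integers with the norm a^2 + b^2. The sketch below represents a + bi as the pair (a, b); all function names are hypothetical, and the quotient is chosen by rounding each coordinate of the exact complex quotient to a nearest integer, which is one standard way to satisfy the division condition.

```python
def norm(z):
    """Euclidean norm delta(a + bi) = a^2 + b^2 on Z[i]; z is a pair (a, b)."""
    a, b = z
    return a * a + b * b

def mul(z, w):
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

def sub(z, w):
    return (z[0] - w[0], z[1] - w[1])

def round_div(p, q):
    """Floor of p/q + 1/2 for q > 0, using only integer arithmetic."""
    return (2 * p + q) // (2 * q)

def divmod_gauss(a, d):
    """Return (q, r) with a = d*q + r and r == (0, 0) or norm(r) < norm(d)."""
    n = norm(d)
    # a / d = a * conj(d) / norm(d); round each coordinate to a nearest integer.
    t = mul(a, (d[0], -d[1]))
    q = (round_div(t[0], n), round_div(t[1], n))
    r = sub(a, mul(d, q))
    return q, r

def gcd_gauss(a, b):
    """Euclidean algorithm in Z[i]; the result is a gcd, unique up to units."""
    while b != (0, 0):
        _, r = divmod_gauss(a, b)
        a, b = b, r
    return a

g = gcd_gauss((11, 3), (1, 8))
assert norm(g) == 5
assert divmod_gauss((11, 3), g)[1] == (0, 0)
assert divmod_gauss((1, 8), g)[1] == (0, 0)
```

Rounding keeps each coordinate of the error at most 1/2, so the remainder has norm at most half the norm of the divisor and the loop terminates.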

Examples: 

1. Some common Euclidean domains are given in the following table. 


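| set | Euclidean norm |
|---|---|
| $\mathbb{Z}$ | $\delta(a) = \lvert a \rvert$ |
| $\mathbb{Z}[i]$ (Gaussian integers) | $\delta(a + bi) = a^2 + b^2$ |
| $F$ (any field) | $\delta(a) = 1$ |
| polynomial ring $F[x]$ ($F$ any field) | $\delta(p(x))$ = degree of $p(x)$ |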


2. The following table gives examples of rings with additional properties. 


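| ring | commutative ring with unity | integral domain | principal ideal domain | Euclidean domain | division ring | field |
|---|---|---|---|---|---|---|
| $\mathbb{Z}$ | yes | yes | yes | yes | no | no |
| $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$ | yes | yes | yes | yes | yes | yes |
| $\mathbb{Z}_p$ ($p$ prime) | yes | yes | yes | yes | yes | yes |
| $\mathbb{Z}_n$ ($n$ composite) | yes | no | no | no | no | no |
| real quaternions | no | no | no | no | yes | no |
| $\mathbb{Z}[x]$ | yes | yes | no | no | no | no |
| $\mathcal{M}_{n \times n}$ | no | no | no | no | no | no |


5.5 POLYNOMIAL RINGS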


5.5.1 BASIC CONCEPTS 
Definitions: 

A polynomial in the variable $x$ over a ring $R$ is an expression of the form
$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x^1 + a_0 x^0$
where $a_n, \ldots, a_0 \in R$.

For a polynomial $f(x) \neq 0$, the largest integer $k$ such that $a_k \neq 0$ is the degree of $f(x)$,
written $\deg f(x)$.

A constant polynomial is a polynomial $f(x) = a_0$. If $a_0 \neq 0$, $f(x)$ has degree 0. If
$f(x) = 0$ (the zero polynomial), the degree of $f(x)$ is undefined. (The degree of the
zero polynomial is also said to be $-\infty$.)

The polynomial ring (in one variable $x$) over a ring $R$ consists of the set
$R[x] = \{\, f(x) \mid f(x) \text{ is a polynomial over } R \text{ in the variable } x \,\}$
with addition and multiplication defined by the rules:

$(a_n x^n + \cdots + a_1 x^1 + a_0 x^0) + (b_m x^m + \cdots + b_1 x^1 + b_0 x^0)$
$= a_n x^n + \cdots + a_{m+1} x^{m+1} + (a_m + b_m) x^m + \cdots + (a_1 + b_1) x^1 + (a_0 + b_0) x^0$

if $n \geq m$, and

$(a_n x^n + \cdots + a_1 x^1 + a_0 x^0)(b_m x^m + \cdots + b_1 x^1 + b_0 x^0)$
$= c_{n+m} x^{n+m} + \cdots + c_1 x^1 + c_0 x^0$

where $c_i = a_0 b_i + a_1 b_{i-1} + \cdots + a_i b_0$ for $i = 0, 1, \ldots, m + n$.

A polynomial $f(x) \in R[x]$ of degree $n$ is monic if $a_n = 1$.

The value of a polynomial $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x^1 + a_0 x^0$ at $c \in R$ is the
element $f(c) = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0 \in R$.

An element $c \in R$ is a zero of the polynomial $f(x)$ if $f(c) = 0$.

If $R$ is a subring of a commutative ring $S$, an element $\alpha \in S$ is algebraic over $R$ if
there is a nonzero $f(x) \in R[x]$ such that $f(\alpha) = 0$.

If $\alpha$ is not algebraic over $R$, then $\alpha$ is transcendental over $R$.

A polynomial $f(x) \in R[x]$ of degree $n$ is irreducible over $R$ if $f(x)$ cannot be written
as $f_1(x) f_2(x)$ (factors of $f(x)$) where $f_1(x)$ and $f_2(x)$ are polynomials over $R$ of degrees
less than $n$. Otherwise $f(x)$ is reducible over $R$.

The polynomial ring (in the variables $x_1, x_2, \ldots, x_n$ with $n > 1$) over a ring $R$ is
defined by the rule $R[x_1, x_2, \ldots, x_n] = (R[x_1, x_2, \ldots, x_{n-1}])[x_n]$.

Facts: 

1. Polynomials over an arbitrary ring $R$ generalize polynomials with coefficients in $\mathbb{R}$
or $\mathbb{C}$. The notation and terminology follow the usual conventions for polynomials with
real (or complex) coefficients:

• the elements $a_n, \ldots, a_0$ are coefficients;

• subtraction notation can be used: $a_i x^i + (-a_j) x^j = a_i x^i - a_j x^j$;

• the term $1x^i$ can be written as $x^i$;

• the term $x^1$ can be written $x$;

• the term $x^0$ can be written 1;

• terms $0x^i$ can be omitted.

2. There is a distinction between a polynomial $f(x) \in R[x]$ and the function it defines
using the rule $f(c) = a_n c^n + a_{n-1} c^{n-1} + \cdots + a_1 c + a_0$ for $c \in R$. The same function
might be defined by infinitely many polynomials. For example, the polynomials
$f_1(x) = x \in \mathbb{Z}_2[x]$ and $f_2(x) = x^2 \in \mathbb{Z}_2[x]$ define the same function: $f_1(0) = f_2(0) = 0$ and
$f_1(1) = f_2(1) = 1$.

3. If $R$ is a ring, $R[x]$ is a ring.

4. If $R$ is a commutative ring, then $R[x]$ is a commutative ring.

5. If $R$ is a ring with unity, then $R[x]$ has the constant polynomial $f(x) = 1$ as unity.

6. If $R$ is an integral domain, then $R[x]$ is an integral domain. If $f_1(x)$ has degree $m$
and $f_2(x)$ has degree $n$, then the degree of $f_1(x) f_2(x)$ is $m + n$.

7. If ring $R$ is not an integral domain, then $R[x]$ is not an integral domain. If $f_1(x)$
has degree $m$ and $f_2(x)$ has degree $n$, then the degree of $f_1(x) f_2(x)$ can be smaller than
$m + n$. (For example, in $\mathbb{Z}_6[x]$, $(3x^2)(2x^3) = 0$.)

8. Factor theorem: If $R$ is a commutative ring with unity and $f(x) \in R[x]$ has
degree $\geq 1$, then $f(a) = 0$ if and only if $x - a$ is a factor of $f(x)$.

9. If $R$ is an integral domain and $p(x) \in R[x]$ has degree $n$, then $p(x)$ has at most $n$
zeros in $R$. If $R$ is not an integral domain, then a polynomial may have more zeros than
its degree; for example, $x^2 + x \in \mathbb{Z}_6[x]$ has four zeros: 0, 2, 3, 5.
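
The zero counts in Fact 9 are easy to confirm by exhaustive evaluation. The following is a minimal sketch, assuming polynomials over Z_n are stored as coefficient lists with the lowest-degree coefficient first; the function names are illustrative.

```python
def evaluate(coeffs, c, n):
    """Value in Z_n of the polynomial [a_0, a_1, ...] at c, by Horner's rule."""
    value = 0
    for a in reversed(coeffs):
        value = (value * c + a) % n
    return value

def zeros(coeffs, n):
    """All zeros in Z_n of the polynomial with the given coefficients."""
    return [c for c in range(n) if evaluate(coeffs, c, n) == 0]

x2_plus_x = [0, 1, 1]            # x^2 + x, lowest-degree coefficient first
print(zeros(x2_plus_x, 6))       # [0, 2, 3, 5]  (Z_6 is not an integral domain)
print(zeros(x2_plus_x, 7))       # [0, 6]        (Z_7 is a field: at most 2 zeros)
```

The same `evaluate` helper also illustrates Fact 2: over Z_2 the distinct polynomials x and x^2 take the same value at every element.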


5.5.2 POLYNOMIALS OVER A FIELD 


Facts: 

1. Even though $F$ is a field (§5.6.1), $F[x]$ is never a field. (The polynomial $f(x) = x$
has no multiplicative inverse in $F[x]$.)

2. If $f(x)$ has degree $n$, then $f(x)$ has at most $n$ distinct zeros.

3. Irreducibility over a finite field: If $F$ is a finite field and $n$ is a positive integer, then
there is an irreducible polynomial over $F$ of degree $n$.

4. Unique factorization theorem: If $f(x)$ is a polynomial over a field $F$ and is not the
zero polynomial, then $f(x)$ can be uniquely factored (ignoring the order in which the
factors are written) as $a f_1(x) \cdots f_k(x)$ where $a \in F$ and each $f_i(x)$ is a monic polynomial
that is irreducible over $F$.

5. Eisenstein's irreducibility criterion: If $f(x) \in \mathbb{Z}[x]$ has degree $n > 0$, if there is
a prime $p$ such that $p$ divides every coefficient of $f(x)$ except $a_n$, and if $p^2$ does not
divide $a_0$, then $f(x)$ is irreducible over $\mathbb{Q}$. (F. G. M. Eisenstein, 1823-1852)

6. Division algorithm for polynomials: If $F$ is a field with $a(x), d(x) \in F[x]$ and $d(x)$
is not the zero polynomial, then there are unique polynomials $q(x)$ (quotient) and $r(x)$
(remainder) in $F[x]$ such that $a(x) = d(x) q(x) + r(x)$ where $\deg r(x) < \deg d(x)$ or
$r(x) = 0$. If $d(x)$ is monic, then the division algorithm for polynomials can be extended
to all rings with unity.

7. Irreducibility over the real numbers $\mathbb{R}$: If $f(x) \in \mathbb{R}[x]$ has degree at least 3, then
$f(x)$ is reducible. The only irreducible polynomials in $\mathbb{R}[x]$ are of degree 1 or 2; for
example $x^2 + 1$ is irreducible over $\mathbb{R}$.

8. Fundamental theorem of algebra (irreducibility over the complex numbers $\mathbb{C}$): If
$f(x) \in \mathbb{C}[x]$ has degree $n \geq 1$, then $f(x)$ can be completely factored:

$f(x) = c(x - c_1)(x - c_2) \cdots (x - c_n)$

where $c, c_1, \ldots, c_n \in \mathbb{C}$.

9. If $F$ is a field and $f(x) \in F[x]$ has degree 1 (i.e., $f(x)$ is linear), then $f(x)$ is
irreducible.

10. If $F$ is a field and $f(x) \in F[x]$ has degree $\geq 2$ and has a zero, then $f(x)$ is
reducible. (If $f(x)$ has $a$ as a zero, then $f(x)$ can be written as $(x - a) f_1(x)$ where
$\deg f_1(x) = \deg f(x) - 1$.) The converse is false: a polynomial may have no zeros, but
still be reducible. (See Example 2.)

11. If $F$ is a field and $f(x) \in F[x]$ has degree 2 or 3, then $f(x)$ is irreducible if and
only if $f(x)$ has no zeros.
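
Over a finite prime field, Fact 11 gives a complete irreducibility test for degrees 2 and 3, because the zeros can be found by trying every element. The sketch below assumes the field is Z_p with p prime and that polynomials are coefficient lists with the lowest-degree coefficient first; the function names are illustrative.

```python
def has_zero(coeffs, p):
    """True if the polynomial [a_0, a_1, ...] over Z_p has a zero in Z_p."""
    return any(sum(a * pow(c, i, p) for i, a in enumerate(coeffs)) % p == 0
               for c in range(p))

def irreducible_deg_2_or_3(coeffs, p):
    """Fact 11 test; valid only for degree 2 or 3 over the field Z_p (p prime)."""
    degree = len(coeffs) - 1
    if degree not in (2, 3) or coeffs[-1] % p == 0:
        raise ValueError("degree must be exactly 2 or 3")
    return not has_zero(coeffs, p)

assert irreducible_deg_2_or_3([1, 1, 1], 2)        # x^2 + x + 1 over Z_2
assert not irreducible_deg_2_or_3([1, 0, 1], 2)    # x^2 + 1 = (x + 1)^2 over Z_2
assert irreducible_deg_2_or_3([1, 1, 0, 1], 2)     # x^3 + x + 1 over Z_2
```

The function deliberately rejects higher degrees, since a polynomial of degree 4 or more can be reducible without having a zero.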

Examples: 

1. In $\mathbb{Z}_5[x]$, if $a(x) = 3x^4 + 2x^3 + 2x + 1$ and $d(x) = x^2 + 2$, then $q(x) = 3x^2 + 2x + 4$
and $r(x) = 3x + 3$. To obtain $q(x)$ and $r(x)$, use the same format as for long division of
natural numbers, with arithmetic operations carried out in $\mathbb{Z}_5$.

                        3x^2 + 2x + 4
          ---------------------------
x^2 + 2 ) 3x^4 + 2x^3 + 0x^2 + 2x + 1
          3x^4        +  x^2
          ------------------
                 2x^3 + 4x^2 + 2x        [-x^2 = 4x^2 over Z_5]
                 2x^3        + 4x
                 ----------------
                        4x^2 + 3x + 1    [2x - 4x = -2x = 3x over Z_5]
                        4x^2      + 3
                        -------------
                               3x + 3

2. Polynomials can have no zeros, but be reducible. The polynomial
$f(x) = x^4 + x^2 + 1 \in \mathbb{Z}_2[x]$ has no zeros (since $f(0) = f(1) = 1$), but $f(x)$ can be factored as $(x^2 + x + 1)^2$.
Similarly, $x^4 + 2x^2 + 1 = (x^2 + 1)^2 \in \mathbb{R}[x]$.
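
The long division in Example 1 can be reproduced mechanically. The following is a sketch of the division algorithm over Z_p, assuming p is prime and polynomials are coefficient lists with the lowest-degree coefficient first; the name `poly_divmod` is hypothetical, and the three-argument `pow` with exponent -1 requires Python 3.8 or later.

```python
def poly_divmod(a, d, p):
    """Divide a(x) by d(x) over Z_p (p prime); coefficients are lowest degree first.

    Returns (q, r) with a = d*q + r and deg r < deg d (r == [] is the zero polynomial).
    """
    a = [c % p for c in a]
    d = [c % p for c in d]
    while d and d[-1] == 0:
        d.pop()
    if not d:
        raise ZeroDivisionError("d(x) must not be the zero polynomial")
    lead_inv = pow(d[-1], -1, p)           # inverse of the leading coefficient of d(x)
    q = [0] * max(len(a) - len(d) + 1, 0)
    r = a[:]
    for shift in range(len(a) - len(d), -1, -1):
        coef = (r[shift + len(d) - 1] * lead_inv) % p
        q[shift] = coef
        for i, dc in enumerate(d):
            r[shift + i] = (r[shift + i] - coef * dc) % p
    r = r[:len(d) - 1]
    while r and r[-1] == 0:
        r.pop()
    return q, r

# Example 1: a(x) = 3x^4 + 2x^3 + 2x + 1 and d(x) = x^2 + 2 over Z_5
q, r = poly_divmod([1, 2, 0, 2, 3], [2, 0, 1], 5)
print(q, r)   # [4, 2, 3] [3, 3], i.e. q(x) = 3x^2 + 2x + 4 and r(x) = 3x + 3
```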


5.6 FIELDS 


5.6.1 BASIC CONCEPTS 

Definitions: 

A field $(F, +, \cdot)$ consists of a set $F$ together with two binary operations, $+$ and $\cdot$, such
that:

• $(F, +, \cdot)$ is a ring;

• $(F - \{0\}, \cdot)$ is a commutative group.

A subfield $F$ of field $(K, +, \cdot)$ is a subset of $K$ that is a field using the same operations
as those in $K$.

If $F$ is a subfield of $K$, then $K$ is called an extension field of $F$. Write $K/F$ to indicate
that $K$ is an extension field of $F$.

For $K$ an extension field of $F$, the degree of $K$ over $F$ is $[K : F]$ = the dimension of $K$
as a vector space over $F$. (See §6.1.3.)

A field isomorphism is a function $\varphi\colon F_1 \to F_2$, where $F_1$ and $F_2$ are fields, such that $\varphi$
is one-to-one, onto $F_2$, and satisfies the following for all $a, b \in F_1$:

• $\varphi(a + b) = \varphi(a) + \varphi(b)$;

• $\varphi(ab) = \varphi(a)\varphi(b)$.

A field automorphism is an isomorphism $\varphi\colon F \to F$, where $F$ is a field. The set of
all automorphisms of $F$ is denoted $\mathrm{Aut}(F)$.

The characteristic of a field $F$ is the smallest positive integer $n$ such that $1 + \cdots + 1 = 0$,
where there are $n$ summands. If there is no such integer, $F$ has characteristic 0 (also
called characteristic $\infty$).

Facts: 

1. Every field is a commutative ring with unity. A field satisfies all properties of 
a commutative ring with unity, and has the additional property that every nonzero 
element has a multiplicative inverse. 

2. Every finite integral domain is a field. 

3. A field is a commutative division ring. 

4. If $F$ is a field and $a, b \in F$ where $a \neq 0$, then $ax + b = 0$ has a unique solution in $F$.

5. If $F$ is a field, every ideal in $F[x]$ is a principal ideal.

6. If $p$ is a prime and $n$ is any positive integer, then there is exactly one field (up to
isomorphism) with $p^n$ elements, the Galois field $GF(p^n)$. (§5.6.2)

7. If $\varphi\colon F \to F$ is a field automorphism, then:

• $-\varphi(a) = \varphi(-a)$

• $\varphi(a^{-1}) = \varphi(a)^{-1}$

for all $a \neq 0$.

8. The intersection of all subfields of a field $F$ is a field, called the prime field of $F$.

9. If $F$ is a field, $\mathrm{Aut}(F)$ is a group under composition of functions.

10. The characteristic of a field is either 0 or prime.

11. Every field of characteristic 0 is isomorphic to a field that is an extension of $\mathbb{Q}$ and
has $\mathbb{Q}$ as its prime field.

12. Every field of characteristic $p > 0$ is isomorphic to a field that is an extension of $\mathbb{Z}_p$
and has $\mathbb{Z}_p$ as its prime field.

13. If field $F$ has characteristic $p > 0$, then $(a + b)^p = a^p + b^p$ for all $a, b \in F$.

14. If field $F$ has characteristic $p > 0$, $f(x) \in \mathbb{Z}_p[x]$, and $\alpha \in F$ is a zero of $f(x)$, then
$\alpha^p, \alpha^{p^2}, \alpha^{p^3}, \ldots$ are also zeros of $f(x)$.

15. If $p$ is not a prime, then $\mathbb{Z}_p$ is not a field since $\mathbb{Z}_p - \{0\}$ will fail to be closed under
multiplication. For example, $\mathbb{Z}_6$ is not a field since $2 \in \mathbb{Z}_6 - \{0\}$ and $3 \in \mathbb{Z}_6 - \{0\}$, but
$2 \cdot 3 = 0 \notin \mathbb{Z}_6 - \{0\}$.
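
Fact 13 can be spot-checked in the prime fields Z_p. The sketch below (with a hypothetical function name) tests the identity for every pair of elements; running the same test with the composite modulus 6 from Fact 15 shows that the identity can fail when the modulus is not prime.

```python
def frobenius_identity_holds(p):
    """Check (a + b)^p == a^p + b^p for all a, b in Z_p."""
    return all(pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
               for a in range(p) for b in range(p))

assert all(frobenius_identity_holds(p) for p in (2, 3, 5, 7, 11))
assert not frobenius_identity_holds(6)   # Z_6 is not a field; a = b = 1 already fails
```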

Examples:


1. The following table gives several examples of fields. 


| set and operations | $-a$ | $a^{-1}$ | characteristic | order |
|---|---|---|---|---|
| $\mathbb{Q}$, $\mathbb{R}$, $\mathbb{C}$, with usual addition and multiplication | $-a$ | $1/a$ | 0 | infinite |
| $\mathbb{Z}_p = \{0, 1, \ldots, p-1\}$ ($p$ prime), addition and multiplication mod $p$ | $p - a$ ($-0 = 0$) | $a^{-1} = b$, where $ab \bmod p = 1$ | $p$ | $p$ |
| $F[x]/(f(x))$, $f(x)$ irreducible over field $F$, coset addition and multiplication (Example 2) | $-[a + (f(x))] = -a + (f(x))$ | $[a + (f(x))]^{-1} = a^{-1} + (f(x))$ | varies | varies |
| $GF(p^n) = \mathbb{Z}_p[x]/(f(x))$, $f(x)$ of degree $n$ irreducible over $\mathbb{Z}_p$ ($p$ prime), addition and multiplication of cosets (Galois field) | $-[a + (f(x))] = -a + (f(x))$ | $[a + (f(x))]^{-1} = a^{-1} + (f(x))$ | $p$ | $p^n$ |


2. The field $F[x]/(f(x))$: If $F$ is any field and $f(x) \in F[x]$ of degree $n$ is irreducible
over $F$, the quotient ring structure $F[x]/(f(x))$ is a field. The elements of $F[x]/(f(x))$
are cosets of polynomials in $F[x]$ modulo $f(x)$, where $(f(x))$ is the principal ideal
generated by $f(x)$. Polynomials $f_1(x)$ and $f_2(x)$ lie in the same coset if and only if $f(x)$ is
a factor of $f_1(x) - f_2(x)$.

Using the division algorithm for polynomials, any polynomial $g(x) \in F[x]$ can be
written as $g(x) = f(x) q(x) + r(x)$ where $q(x)$ and $r(x)$ are unique polynomials in $F[x]$
and $r(x)$ has degree $< n$. The equivalence class $g(x) + (f(x))$ can be identified with the
polynomial $r(x)$, and thus $F[x]/(f(x))$ can be regarded as the field of all polynomials
in $F[x]$ of degree $< n$.
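
The identification of cosets with remainders of degree less than n suggests a direct way to compute in such a field. The sketch below is one possible illustration, assuming F = Z_2 and the irreducible polynomial x^3 + x + 1 (so the field has 8 elements); the names `P`, `F`, `N`, and `mul` are choices made for this example only.

```python
from itertools import product

P = 2
F = [1, 1, 0, 1]   # assumption: f(x) = x^3 + x + 1, irreducible over Z_2 (lowest degree first)
N = len(F) - 1

def mul(u, v):
    """Multiply two coset representatives (lists of N coefficients) modulo f(x)."""
    prod = [0] * (2 * N - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] = (prod[i + j] + a * b) % P
    # Reduce: f is monic, so x^k = -(lower terms of f) * x^(k-N) for k >= N.
    for k in range(2 * N - 2, N - 1, -1):
        c = prod[k]
        prod[k] = 0
        for i in range(N):
            prod[k - N + i] = (prod[k - N + i] - c * F[i]) % P
    return prod[:N]

elements = [list(e) for e in product(range(P), repeat=N)]
one = [1] + [0] * (N - 1)
nonzero = [e for e in elements if any(e)]

# Every nonzero coset has a multiplicative inverse, so the quotient ring is a field.
assert all(any(mul(u, v) == one for v in nonzero) for u in nonzero)
```

For instance, `mul([0, 1, 0], [0, 0, 1])` computes x times x^2 and returns the representative of x + 1, because x^3 = x + 1 modulo f(x) over Z_2.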


5.6.2 EXTENSION FIELDS AND GALOIS THEORY 

Throughout this subsection assume that field $K$ is an extension of field $F$.

Definitions:

For $\alpha \in K$, $F(\alpha)$ is the smallest field containing $\alpha$ and $F$, called the field extension
of $F$ by $\alpha$.

For $\alpha_1, \ldots, \alpha_n \in K$, $F(\alpha_1, \ldots, \alpha_n)$ is the smallest field containing $\alpha_1, \ldots, \alpha_n$ and $F$,
called the field extension of $F$ by $\alpha_1, \ldots, \alpha_n$.

If $K$ is an extension field of $F$ and $\alpha \in K$, then $\alpha$ is algebraic over $F$ if $\alpha$ is a root of
a nonzero polynomial in $F[x]$. If $\alpha$ is not the root of any nonzero polynomial in $F[x]$,
then $\alpha$ is transcendental over $F$.

A complex number is an algebraic number if it is algebraic over $\mathbb{Q}$.

An algebraic integer is an algebraic number $\alpha$ that is a zero of a polynomial of the
form $x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$ where each $a_i \in \mathbb{Z}$.

An extension field $K$ of $F$ is an algebraic extension of $F$ if every element of $K$ is
algebraic over $F$. Otherwise $K$ is a transcendental extension of $F$.

An extension field $K$ of $F$ is a finite extension of $F$ if $K$ is finite-dimensional as a
vector space over $F$ (see Fact 11). The dimension of $K$ over $F$ is written $[K : F]$.

Let $\alpha$ be algebraic over a field $F$. The minimal polynomial of $\alpha$ with respect to $F$
is the monic irreducible polynomial $f(x) \in F[x]$ of smallest degree such that $f(\alpha) = 0$.

A polynomial $f(x) \in F[x]$ splits over $K$ if $f(x) = a(x - \alpha_1) \cdots (x - \alpha_n)$ where
$a, \alpha_1, \ldots, \alpha_n \in K$.

$K$ is a splitting field (root field) of a nonconstant $f(x) \in F[x]$ if $f(x)$ splits over $K$
and $K$ is the smallest field with this property.

A polynomial $f(x) \in F[x]$ of degree $n$ is separable if $f(x)$ has $n$ distinct roots in its
splitting field.

$K$ is a separable extension of $F$ if every element of $K$ is the root of a separable
polynomial in $F[x]$.

$K$ is a normal extension of $F$ if $K/F$ is algebraic and every irreducible polynomial
in $F[x]$ with a root in $K$ has all its roots in $K$ (i.e., splits in $K$).

$K$ is a Galois extension of $F$ if $K$ is a normal, separable extension of $F$.

A field automorphism $\varphi$ fixes set $S$ elementwise if $\varphi(x) = x$ for all $x \in S$.

The fixed field of a subset $H \subseteq \mathrm{Aut}(F)$ is $F_H = \{\, x \in F \mid \varphi(x) = x \text{ for all } \varphi \in H \,\}$.

The Galois group of $K$ over $F$ is the group of automorphisms $G(K/F)$ of $K$ that
fix $F$ elementwise. If $K$ is a splitting field of $f(x) \in F[x]$, $G(K/F)$ is also known as the
Galois group of $f(x)$. (Évariste Galois, 1811-1832)

Facts: 

1. The elements of $K$ that are algebraic over $F$ form a subfield of $K$.

2. The algebraic numbers in $\mathbb{C}$ form a field; the algebraic integers form a subring of $\mathbb{C}$,
called the ring of algebraic integers.

3. Every nonconstant polynomial has a unique splitting field, up to isomorphism.

4. If $f(x) \in F[x]$ splits as $a(x - \alpha_1) \cdots (x - \alpha_n)$, then the splitting field for $f(x)$ is
$F(\alpha_1, \ldots, \alpha_n)$.

5. If $F$ is a field and $p(x) \in F[x]$ is a nonconstant polynomial, then there is an extension
field $K$ of $F$ and $\alpha \in K$ such that $p(\alpha) = 0$.

6. If $f(x)$ is irreducible over $F$, then the ring $F[x]/(f(x))$ is an algebraic extension of $F$
and contains a root of $f(x)$.

7. The field $F$ is isomorphic to a subfield of any algebraic extension $F[x]/(f(x))$. The
element $0 \in F$ corresponds to the coset of the zero polynomial; all other elements of $F$
appear in $F[x]/(f(x))$ as cosets of the constant polynomials.

8. Every minimal polynomial is irreducible.

9. If $K$ is a field extension of $F$ and $\alpha \in K$ is a root of an irreducible polynomial
$f(x) \in F[x]$ of degree $n \geq 1$, then $F(\alpha) = \{\, c_{n-1}\alpha^{n-1} + \cdots + c_1 \alpha + c_0 \mid c_i \in F \text{ for all } i \,\}$.

10. If $K$ is an extension field of $F$ and $\alpha \in K$ is algebraic over $F$, then:

• there is a unique monic irreducible polynomial $f(x) \in F[x]$ of smallest degree
(the minimum polynomial) such that $f(\alpha) = 0$;

• $F(\alpha) \cong F[x]/(f(x))$;

• if the degree of $\alpha$ over $F$ is $n$, then $F(\alpha) = \{\, a_0 + a_1 \alpha + a_2 \alpha^2 + \cdots + a_{n-1}\alpha^{n-1} \mid a_0, a_1, \ldots, a_{n-1} \in F \,\}$;
in fact, $F(\alpha)$ is an $n$-dimensional vector space over $F$, with
basis $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$.

11. If $K$ is an extension field of $F$ and $\alpha \in K$ is transcendental over $F$, then $F(\alpha) \cong$
the field of all fractions $f(x)/g(x)$ where $f(x), g(x) \in F[x]$ and $g(x)$ is not the zero
polynomial.

12. $K$ is a splitting field of some polynomial $f(x) \in F[x]$ if and only if $K$ is a Galois
extension of $F$.

13. If $K$ is a splitting field for separable $f(x) \in F[x]$ of degree $n$, then $G(K/F)$ is
isomorphic to a subgroup of the symmetric group $S_n$.

14. If $K$ is a splitting field of $f(x) \in F[x]$, then:

• every element of $G(K/F)$ permutes the roots of $f(x)$ and is completely determined
by its effect on the roots of $f(x)$;

• $G(K/F)$ is isomorphic to a group of permutations of the roots of $f(x)$.

15. If $K$ is a splitting field for separable $f(x) \in F[x]$, then $|G(K/F)| = [K : F]$.

16. For $[K : F]$ finite, $K$ is a normal extension of $F$ if and only if $K$ is a splitting field
of some polynomial in $F[x]$.

17. The Fundamental theorem of Galois theory: If $K$ is a normal extension of $F$, where
$F$ is either finite or has characteristic 0, then there is a one-to-one correspondence $\Phi$
between the lattice of all fields $K'$, where $F \subseteq K' \subseteq K$, and the lattice of all subgroups $H$
of the Galois group $G(K/F)$:

$\Phi(K') = G(K/K')$ and $\Phi^{-1}(H) = K_H$.

The correspondence $\Phi$ has the following properties:

• for fields $K'$ and $K''$ where $F \subseteq K' \subseteq K$ and $F \subseteq K'' \subseteq K$,
$K' \subseteq K'' \iff \Phi(K'') \subseteq \Phi(K')$;
that is, $G(K/K'') \subseteq G(K/K')$.

• $\Phi$ interchanges the operations meet and join for the lattice of subfields and the
lattice of subgroups:

$\Phi(K' \wedge K'') = G(K/K') \vee G(K/K'')$

$\Phi(K' \vee K'') = G(K/K') \wedge G(K/K'')$;

(Note: In the lattice of fields [groups], $A \wedge B = A \cap B$ and $A \vee B$ is the smallest
field [group] containing $A$ and $B$.)

• $K'$ is a normal extension of $F$ if and only if $G(K/K')$ is a normal subgroup of
$G(K/F)$.

18. Formulas for solving polynomial equations of degrees 2, 3, or 4:

• second-degree (quadratic) equation $ax^2 + bx + c = 0$: the quadratic formula gives
the solutions $x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}$;

• third-degree (cubic) equation $a_3 x^3 + a_2 x^2 + a_1 x + a_0 = 0$:

(1) divide by $a_3$ to obtain $x^3 + b_2 x^2 + b_1 x + b_0 = 0$,

(2) make the substitution $x = y - \frac{b_2}{3}$ to obtain an equation of the form
$y^3 + cy + d = 0$, with solutions
$y = \sqrt[3]{-\frac{d}{2} + \sqrt{\frac{d^2}{4} + \frac{c^3}{27}}} + \sqrt[3]{-\frac{d}{2} - \sqrt{\frac{d^2}{4} + \frac{c^3}{27}}}$,

(3) use the substitution $x = y - \frac{b_2}{3}$ to obtain the solutions to the original
equation;

• fourth-degree (quartic) equation $a_4 x^4 + a_3 x^3 + a_2 x^2 + a_1 x + a_0 = 0$:

(1) divide by $a_4$ to obtain $x^4 + ax^3 + bx^2 + cx + d = 0$,

(2) solve the resolvent equation $y^3 - by^2 + (ac - 4d)y + (-a^2 d + 4bd - c^2) = 0$
to obtain a root $z$,

(3) solve the pair of quadratic equations:

$x^2 + \frac{a}{2}x + \frac{z}{2} = \pm\sqrt{\left(\frac{a^2}{4} - b + z\right)x^2 + \left(\frac{a}{2}z - c\right)x + \left(\frac{z^2}{4} - d\right)}$

to obtain the solutions to the original equation.
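
The three steps for the cubic can be followed numerically. The sketch below is a floating-point illustration under a stated restriction: it handles only the case d^2/4 + c^3/27 >= 0, where real square and cube roots suffice and the formula yields a real root. The function names are hypothetical.

```python
import math

def real_cbrt(t):
    """Real cube root, valid for negative arguments as well."""
    return math.copysign(abs(t) ** (1.0 / 3.0), t)

def depressed_cubic_root(c, d):
    """One real root of y^3 + c*y + d = 0 by the formula in step (2).

    Assumes d^2/4 + c^3/27 >= 0, so that only real radicals are needed.
    """
    disc = d * d / 4.0 + c ** 3 / 27.0
    if disc < 0:
        raise ValueError("three real roots: this sketch does not handle complex radicals")
    s = math.sqrt(disc)
    return real_cbrt(-d / 2.0 + s) + real_cbrt(-d / 2.0 - s)

def cubic_root(a3, a2, a1, a0):
    """One real root of a3*x^3 + a2*x^2 + a1*x + a0 = 0, following steps (1)-(3)."""
    b2, b1, b0 = a2 / a3, a1 / a3, a0 / a3          # step (1)
    c = b1 - b2 ** 2 / 3.0                          # step (2): coefficients after x = y - b2/3
    d = 2.0 * b2 ** 3 / 27.0 - b2 * b1 / 3.0 + b0
    return depressed_cubic_root(c, d) - b2 / 3.0    # step (3)

# y^3 + 6y - 20 = 0 has the real root y = 2.
assert abs(depressed_cubic_root(6.0, -20.0) - 2.0) < 1e-9
# x^3 - 3x^2 + x - 3 = (x - 3)(x^2 + 1) has the real root x = 3.
assert abs(cubic_root(1.0, -3.0, 1.0, -3.0) - 3.0) < 1e-9
```

When the quantity under the square root is negative, the cubic has three real roots and the radicals in the formula become complex, which is why the sketch raises an error in that case instead of guessing.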

19. A general method for solving cubic equations algebraically was given by Nicolo
Fontana (1500-1557), also called Tartaglia. The method is often referred to as Cardano's
method because Girolamo Cardano (1501-1576) published the method. Ludovico
Ferrari (1522-1565), a student of Cardano, discovered a general method for solving quartic
equations algebraically.

20. Equations of degree 5 or more: In 1824 Abel proved that the general quintic
polynomial equation $a_5 x^5 + \cdots + a_1 x + a_0 = 0$ (like those of higher degree) is not
solvable by radicals; that is, there can be no formula for writing the roots of such
equations using only the basic arithmetic operations and the taking of $n$th roots. Évariste
Galois (1811-1832) demonstrated the existence of such equations that are not solvable
by radicals and related solvability by radicals of polynomial equations to determining
whether the associated permutation group (the Galois group) of roots is solvable. (See
Application 1.)


Examples: 

1. $\mathbb{C}$ as an algebraic extension of $\mathbb{R}$: Let $f(x) = x^2 + 1 \in \mathbb{R}[x]$ and
$\alpha = x + (x^2 + 1) \in \mathbb{R}[x]/(x^2 + 1)$. Then $\alpha^2 = -1$. Thus, $\alpha$ behaves like $i$ (since $i^2 = -1$). Hence
$\mathbb{R}[x]/(x^2 + 1) = \{\, c_1 \alpha + c_0 \mid c_0, c_1 \in \mathbb{R} \,\} \cong \{\, c_1 i + c_0 \mid c_0, c_1 \in \mathbb{R} \,\} = \mathbb{C}$.

2. Algebraic extensions of $\mathbb{Z}_p$: If $f(x) \in \mathbb{Z}_p[x]$ is an irreducible polynomial of degree $n$,
then the algebraic extension $\mathbb{Z}_p[x]/(f(x))$ is a Galois field.

3. If $f(x) = x^4 - 2x^2 - 3 \in \mathbb{Q}[x]$, its splitting field is

$\mathbb{Q}(\sqrt{3}, i) = \{\, a + b\sqrt{3} + ci + di\sqrt{3} \mid a, b, c, d \in \mathbb{Q} \,\}$.

There are three intermediate fields: $\mathbb{Q}(\sqrt{3})$, $\mathbb{Q}(i)$, and $\mathbb{Q}(i\sqrt{3})$, as illustrated in Figure 1.
The Galois group $G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}) = \{e, \phi_1, \phi_2, \phi_3\}$ where:

$\phi_1(a + b\sqrt{3} + ci + di\sqrt{3}) = a + b\sqrt{3} - ci - di\sqrt{3}$,

$\phi_2(a + b\sqrt{3} + ci + di\sqrt{3}) = a - b\sqrt{3} + ci - di\sqrt{3}$,

$\phi_3(a + b\sqrt{3} + ci + di\sqrt{3}) = a - b\sqrt{3} - ci + di\sqrt{3}$ (so that $\phi_3 = \phi_2 \phi_1$),

$e(a + b\sqrt{3} + ci + di\sqrt{3}) = a + b\sqrt{3} + ci + di\sqrt{3}$.

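$G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q})$ has the following subgroups:

$G = G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}) = \{e, \phi_1, \phi_2, \phi_3\}$,

$H_1 = G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(\sqrt{3})) = \{e, \phi_1\}$,

$H_2 = G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(i)) = \{e, \phi_2\}$,

$H_3 = G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(i\sqrt{3})) = \{e, \phi_3\}$,

$\{e\} = G(\mathbb{Q}(\sqrt{3}, i)/\mathbb{Q}(\sqrt{3}, i))$.

The correspondence between fields and Galois groups is shown in the following table
and figure.

| field | Galois group |
|---|---|
| $\mathbb{Q}(\sqrt{3}, i)$ | $\{e\}$ |
| $\mathbb{Q}(\sqrt{3})$ | $H_1$ |
| $\mathbb{Q}(i\sqrt{3})$ | $H_3$ |
| $\mathbb{Q}(i)$ | $H_2$ |
| $\mathbb{Q}$ | $G$ |

[Figure 1. Lattice of the subfields of $\mathbb{Q}(\sqrt{3}, i)$ and the corresponding lattice of subgroups of $G$.]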



4. Cyclotomic extensions: The $n$th roots of unity are the solutions to $x^n - 1 = 0$:
$1, \omega, \omega^2, \ldots, \omega^{n-1}$, where $\omega = e^{2\pi i/n}$. The extension field $\mathbb{Q}(\omega)$ is a cyclotomic extension
of $\mathbb{Q}$. If $n = p$ is a prime greater than 2, then $G(\mathbb{Q}(\omega)/\mathbb{Q})$ is a cyclic group of order $p - 1$ and is isomorphic
to $\mathbb{Z}_p^*$ (the multiplicative group of nonzero elements of $\mathbb{Z}_p$).

Applications: 

1. Solvability by radicals: A polynomial equation $f(x) = 0$ is solvable by radicals if
each root can be expressed in terms of the coefficients of the polynomial, using only
the operations of addition, subtraction, multiplication, division, and the taking of $n$th
roots.

If $F$ is a field of characteristic 0 and $f(x) \in F[x]$ has $K$ as splitting field, then
$f(x) = 0$ is solvable by radicals if and only if $G(K/F)$ is a solvable group. Since there
are polynomials whose Galois groups are not solvable, there are polynomials whose
roots cannot be found by elementary algebraic methods. For example, the polynomial
$x^5 - 36x + 2$ has the symmetric group $S_5$ as its Galois group, which is not solvable.
Hence, the roots of $x^5 - 36x + 2 = 0$ cannot be found by elementary algebraic methods.
This example shows that there can be no algebraic formula for solving all fifth-degree
equations.

2. Straightedge and compass constructibility: Using only a straightedge (unmarked
ruler) and a compass, there is no general method for:

• trisecting angles (given an angle whose measure is $\alpha$, to construct an angle with
measure $\frac{\alpha}{3}$);

• duplicating the cube (given the side of a cube $C_1$, to construct the side of a
cube $C_2$ that has double the volume of $C_1$);

• squaring the circle (given a circle of area $A$, to construct a square with area $A$);

• constructing a regular $n$-gon for all $n \geq 3$.

Straightedge and compass constructions yield only lengths that can be obtained by
addition, subtraction, multiplication, division, and taking square roots. Beginning with
lengths that are rational numbers, each of these operations yields field extensions $\mathbb{Q}(a)$
and $\mathbb{Q}(b)$ where $a$ and $b$ are coordinates of a point constructed from points in $\mathbb{Q} \times \mathbb{Q}$.
These operations force $[\mathbb{Q}(a) : \mathbb{Q}]$ and $[\mathbb{Q}(b) : \mathbb{Q}]$ to be powers of 2. However, trisecting
angles, duplicating cubes, and squaring circles all yield extensions of $\mathbb{Q}$ such that the
degrees of the extensions are not powers of 2. Hence these three types of constructions
are not possible with straightedge and compass.


5.6.3 FINITE FIELDS 

Finite fields have a wide range of applications in computer science and engineering:
coding theory, combinatorics, computer algebra, cryptology, the generation of
pseudorandom numbers, switching circuit theory, and symbolic computation.

Throughout this subsection assume that F is a finite field. 

Definitions: 

A finite field is a field with a finite number of elements. 

The Galois field GF(p^n) is the algebraic extension Z_p[x]/(f(x)) of the finite field Z_p
where p is a prime and f(x) is an irreducible polynomial over Z_p of degree n. (See
Fact 1.)

A primitive element of GF(p^n) is a generator of the cyclic group of nonzero elements
of GF(p^n) under multiplication.

Let α be a primitive element of GF(p^n). The discrete exponential function (with
base α) is the function exp_α: {0, 1, 2, ..., p^n - 2} → GF(p^n)* defined by the rule
exp_α k = α^k.

Let α be a primitive element of GF(p^n). The discrete logarithm or index function
(with base α) is the function ind_α: GF(p^n)* → {0, 1, 2, ..., p^n - 2} where ind_α(x) = k
if and only if x = α^k.

Let α be a primitive element of GF(p^n). The Zech logarithm (Jacobi logarithm) is
the function Z: {1, ..., p^n - 1} → {0, ..., p^n - 2} such that α^(Z(k)) = 1 + α^k; if 1 + α^k = 0,
then Z(k) = 0.
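
As a minimal sketch of the discrete exponential and index functions, the following Python fragment tabulates both for the prime field GF(7) = Z_7. The prime 7 and the base α = 3 are assumptions chosen only for illustration, and the names `exp_table`, `ind_table`, and `multiply_via_logs` are hypothetical.

```python
# Discrete exponential and logarithm in the prime field GF(7) = Z_7.
# The prime p = 7 and the base alpha = 3 are illustrative choices.
p = 7
alpha = 3

# exp_table[k] = alpha^k for k in {0, ..., p - 2}
exp_table = [pow(alpha, k, p) for k in range(p - 1)]

# alpha is primitive exactly when its powers hit every nonzero element once.
assert sorted(exp_table) == list(range(1, p))

# ind_table[x] = k  if and only if  x = alpha^k
ind_table = {x: k for k, x in enumerate(exp_table)}

def multiply_via_logs(x, y):
    """Multiply nonzero x, y in GF(p) by adding indices modulo p - 1."""
    return exp_table[(ind_table[x] + ind_table[y]) % (p - 1)]

print(exp_table)                  # the powers alpha^0, alpha^1, ..., alpha^(p-2)
print(multiply_via_logs(5, 6))    # agrees with (5 * 6) % 7
```

Lookup tables of this kind trade memory for speed: a multiplication becomes one addition modulo p^n - 1 and two table lookups.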

Facts: 

1. Existence of finite fields: For each prime p and positive integer n there is exactly
one field (up to isomorphism) with p^n elements: the field GF(p^n), also written F_(p^n).

2. Construction of finite fields: Given an irreducible polynomial f(x) ∈ Z_p[x] of
degree n and a zero α of f(x),

GF(p^n) ≅ Z_p[x]/(f(x)) = {c_(n-1) α^(n-1) + ... + c_1 α + c_0 | c_i ∈ Z_p for all i}.

3. If F is a finite field, then:

• F has p^n elements for some prime p and positive integer n;

• F has characteristic p for some prime p;

• F is an extension of Z_p.

4. [GF(p^n) : Z_p] = n.

5. GF(p^n) = the field of the p^n roots of x^(p^n) - x ∈ Z_p[x].

6. The minimal polynomial of α ∈ GF(p^n) with respect to Z_p is

f(x) = (x - α)(x - α^p)(x - α^(p^2)) ... (x - α^(p^i))

where i is the smallest nonnegative integer such that α^(p^(i+1)) = α.

7. If a field F has order p^n, then every subfield of F has order p^k for some k that
divides n.

8. The multiplicative group of nonzero elements of a finite field F is a cyclic group.

9. If a field F has m elements, then the multiplicative order of each nonzero element
of F is a divisor of m - 1.

10. If a field F has m elements and d is a divisor of m - 1, then there is an element
of F of order d.

11. Each discrete logarithm function has the following properties:

ind_α(xy) ≡ ind_α x + ind_α y (mod p^n - 1);
ind_α(xy^(-1)) ≡ ind_α x - ind_α y (mod p^n - 1);
ind_α(x^k) ≡ k ind_α x (mod p^n - 1).

12. The discrete logarithm function ind_α is the inverse of the discrete exponential
function exp_α. That is, ind_α x = y if and only if exp_α y = x.

13. A discrete logarithm function can be used to facilitate multiplication and division
of elements of GF(p^n).

14. The Zech logarithm facilitates the addition of elements α^i and α^j (i > j) in GF(p^n),
since α^i + α^j = α^j(α^(i-j) + 1) = α^j · α^(Z(i-j)) = α^(j+Z(i-j)). (Note that the values of the
Zech logarithm function depend on the primitive element used.)

15. There are (1/k) Σ_{d|k} μ(k/d) p^(nd) monic irreducible polynomials of degree k over GF(p^n),
where μ is the Möbius function (§2.7).
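
The counting formula of Fact 15 can be evaluated directly. The sketch below is one possible way to do so in Python; the helper names `mobius` and `count_irreducible` are hypothetical, and the parameter `q` stands for the field size p^n.

```python
def mobius(m):
    """Mobius function mu(m) for a positive integer m, by trial division."""
    result = 1
    d = 2
    while d * d <= m:
        if m % d == 0:
            m //= d
            if m % d == 0:      # d^2 divided the original m
                return 0
            result = -result
        d += 1
    if m > 1:                   # one remaining prime factor
        result = -result
    return result

def count_irreducible(q, k):
    """Number of monic irreducible polynomials of degree k over a field with q elements."""
    total = sum(mobius(k // d) * q**d for d in range(1, k + 1) if k % d == 0)
    return total // k

# Counts for degrees 1 through 8 over GF(2).
print([count_irreducible(2, k) for k in range(1, 9)])
```

For q = 2 the counts for degrees 1 through 8 should agree with the number of entries per degree in a table of irreducible polynomials over Z_2.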


Examples: 

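1. If p is prime, Z_p is a finite field and Z_p = GF(p).

2. The field Z_2 = F_2:

   + | 0 1        · | 0 1
  ---+-----      ---+-----
   0 | 0 1        0 | 0 0
   1 | 1 0        1 | 0 1

3. The field Z_3 = F_3:

   + | 0 1 2      · | 0 1 2
  ---+-------    ---+-------
   0 | 0 1 2      0 | 0 0 0
   1 | 1 2 0      1 | 0 1 2
   2 | 2 0 1      2 | 0 2 1
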

4. Construction of GF(2^2) = F_4:

GF(2^2) = Z_2[x]/(x^2 + x + 1) = {c_1 α + c_0 | c_1, c_0 ∈ Z_2} = {0, 1, α, α + 1}

where α is a zero of x^2 + x + 1; i.e., α^2 + α + 1 = 0. The nonzero elements of GF(2^2)
can also be written as powers of α as α, α^2 = -α - 1 = α + 1, α^3 = α · α^2 = α(α + 1) =
α^2 + α = (α + 1) + α = 2α + 1 = 1.

Thus, GF(2^2) = {0, 1, α, α^2} has the following addition and multiplication tables:

   +  | 0    1    α    α^2         ·  | 0    1    α    α^2
  ----+-------------------        ----+-------------------
   0  | 0    1    α    α^2         0  | 0    0    0    0
   1  | 1    0    α^2  α           1  | 0    1    α    α^2
   α  | α    α^2  0    1           α  | 0    α    α^2  1
  α^2 | α^2  α    1    0          α^2 | 0    α^2  1    α



5. Construction of GF(2^3) = F_8: Let f(x) = x^3 + x + 1 ∈ Z_2[x] and let α be a root
of f(x). Then GF(2^3) = {c_2 α^2 + c_1 α + c_0 | c_0, c_1, c_2 ∈ Z_2} where α^3 + α + 1 = 0.

The elements of GF(2^3) (using α as generator) are:

0, α, α^2, α^3 = α + 1,

α^4 = α^2 + α, α^5 = α^2 + α + 1, α^6 = α^2 + 1, 1 (= α^7).

Multiplication is carried out using the ordinary rules of exponents and the fact that
α^7 = 1. The following Zech logarithm values can be used to construct the table for
addition: Z(1) = 3, Z(2) = 6, Z(3) = 1, Z(4) = 5, Z(5) = 4, Z(6) = 2, Z(7) = 0. For
example α^3 + α^5 = α^3 · α^(Z(5-3)) = α^3 · α^6 = α^9 = α^2.

Using strings of 0s and 1s to represent the elements, 0 = 000, 1 = 001, α = 010,
α + 1 = 011, α^2 = 100, α^2 + α = 110, α^2 + 1 = 101, α^2 + α + 1 = 111, yields the
following tables for addition and multiplication:


   +  | 000  001  010  011  100  101  110  111
  ----+---------------------------------------
  000 | 000  001  010  011  100  101  110  111
  001 | 001  000  011  010  101  100  111  110
  010 | 010  011  000  001  110  111  100  101
  011 | 011  010  001  000  111  110  101  100
  100 | 100  101  110  111  000  001  010  011
  101 | 101  100  111  110  001  000  011  010
  110 | 110  111  100  101  010  011  000  001
  111 | 111  110  101  100  011  010  001  000

   ·  | 000  001  010  011  100  101  110  111
  ----+---------------------------------------
  000 | 000  000  000  000  000  000  000  000
  001 | 000  001  010  011  100  101  110  111
  010 | 000  010  100  110  011  001  111  101
  011 | 000  011  110  101  111  100  001  010
  100 | 000  100  011  111  110  010  101  001
  101 | 000  101  001  100  010  111  011  110
  110 | 000  110  111  001  101  011  010  100
  111 | 000  111  101  010  001  110  100  011



The same field can be constructed using g(x) = x^3 + x^2 + 1 instead of f(x) =
x^3 + x + 1 and β as a root of g(x) (β^3 + β^2 + 1 = 0). The elements (using β as generator)
are: 0, β, β^2, β^3 = β^2 + 1, β^4 = β^2 + β + 1, β^5 = β + 1, β^6 = β^2 + β, 1 (= β^7).

The polynomial g(x) yields the following Zech logarithm values, which can be used
to construct the table for addition: Z(1) = 5, Z(2) = 3, Z(3) = 2, Z(4) = 6, Z(5) =
1, Z(6) = 4, Z(7) = 0. This field is isomorphic to the field defined using f(x) = x^3 + x + 1.
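
The bit-string arithmetic of Example 5 can be sketched in a few lines of Python. This is an illustrative implementation, not the handbook's own: elements are stored as integers whose binary digits are the coefficients c_2 c_1 c_0, the reduction polynomial is f(x) = x^3 + x + 1, and the names `gf_add`, `gf_mul`, `power`, and `zech` are hypothetical.

```python
MOD = 0b1011          # f(x) = x^3 + x + 1, coefficients from highest power down
N = 3                 # degree of f, so the field has 2**N elements

def gf_add(a, b):
    """Add two elements of GF(2^3); coefficientwise addition mod 2 is XOR."""
    return a ^ b

def gf_mul(a, b):
    """Multiply two elements of GF(2^3), reducing modulo f(x) along the way."""
    product = 0
    while b:
        if b & 1:
            product ^= a
        b >>= 1
        a <<= 1
        if a & (1 << N):      # degree reached 3: subtract (XOR) f(x)
            a ^= MOD
    return product

# Powers of alpha = 010: power[k] = alpha^k for k = 0, ..., 6.
ALPHA = 0b010
power = [1]
for _ in range(2**N - 2):
    power.append(gf_mul(power[-1], ALPHA))
log = {x: k for k, x in enumerate(power)}

def zech(k):
    """Zech logarithm: alpha^zech(k) = 1 + alpha^k, and zech(k) = 0 if 1 + alpha^k = 0."""
    s = gf_add(1, power[k % (2**N - 1)])
    return 0 if s == 0 else log[s]

print([format(x, "03b") for x in power])          # bit strings of alpha^0 .. alpha^6
print({k: zech(k) for k in range(1, 2**N)})       # Zech logarithm values Z(1) .. Z(7)
```

Replacing `MOD` by `0b1101` (that is, g(x) = x^3 + x^2 + 1) gives the second construction with β in place of α.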

6. Table 1 lists the irreducible polynomials of degree at most 8 in Z_2[x]. For more
extensive tables of irreducible polynomials over certain finite fields, see [LiNi94].

Table 1 Irreducible polynomials in Z_2[x] of degree at most 8.

Each polynomial is represented by the string of its coefficients, beginning with the
highest power. For example, x^3 + x + 1 is represented by 1011.

degree 1:  10  11

degree 2:  111

degree 3:  1011  1101

degree 4:  10011  11001  11111

degree 5:  100101  101001  101111  110111  111011  111101

degree 6:  1000011  1001001  1010111  1011011  1100001  1100111
           1101101  1110011  1110101

degree 7:  10000011  10001001  10001111  10010001  10011101  10100111
           10101011  10111001  10111111  11000001  11001011  11010011
           11010101  11100101  11101111  11110001  11110111  11111101

degree 8:  100011011  100011101  100101011  100101101  100111001  100111111
           101001101  101011111  101100011  101100101  101101001  101110001
           101110111  101111011  110000111  110001011  110001101  110011111
           110100011  110101001  110110001  110111101  111000011  111001111
           111010111  111011101  111100111  111110011  111110101  111111001
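
A table of this kind can be regenerated by brute force. The following sketch assumes polynomials over Z_2 are stored as integers whose binary digits are the coefficient strings used above, and it tests irreducibility by trial division, which is adequate only for small degrees. The function names are hypothetical.

```python
def poly_mod(a, b):
    """Remainder of a divided by b, for polynomials over Z_2 stored as bit masks."""
    db = b.bit_length()
    while a.bit_length() >= db:
        a ^= b << (a.bit_length() - db)
    return a

def is_irreducible(f):
    """Trial division: f (degree >= 1) is irreducible if no polynomial
    of degree 1 .. deg(f)//2 divides it."""
    deg = f.bit_length() - 1
    for g in range(2, 1 << (deg // 2 + 1)):     # all polynomials of degree 1..deg//2
        if poly_mod(f, g) == 0:
            return False
    return True

def irreducibles(deg):
    """Coefficient strings of all irreducible polynomials of the given degree over Z_2."""
    return [format(f, "b") for f in range(1 << deg, 1 << (deg + 1)) if is_irreducible(f)]

print(irreducibles(4))
print([len(irreducibles(d)) for d in range(1, 9)])
```

For larger degrees a faster test (for example one based on the factorization of x^(2^k) - x) would be preferable; trial division is used here only because it mirrors the definition.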


5.7 LATTICES 


5.7.1 BASIC CONCEPTS 
Definitions: 

A lattice (L, ∨, ∧) is a nonempty set L closed under two binary operations ∨ (join)
and ∧ (meet) such that the following laws are satisfied for all a, b, c ∈ L:

• associative laws: a ∨ (b ∨ c) = (a ∨ b) ∨ c,  a ∧ (b ∧ c) = (a ∧ b) ∧ c

• commutative laws: a ∨ b = b ∨ a,  a ∧ b = b ∧ a

• absorption laws: a ∨ (a ∧ b) = a,  a ∧ (a ∨ b) = a.

Lattices L_1 and L_2 are isomorphic (as lattices) if there is a function φ: L_1 → L_2
that is one-to-one and onto L_2 and preserves ∨ and ∧: φ(a ∨ b) = φ(a) ∨ φ(b) and
φ(a ∧ b) = φ(a) ∧ φ(b) for all a, b ∈ L_1.

L_1 is a sublattice of lattice L if L_1 ⊆ L and L_1 is a lattice using the same operations
as those used in L.

The dual of a statement in a lattice is the statement obtained by interchanging the
operations ∨ and ∧ and interchanging the elements 0 (lower bound) and 1 (upper
bound). (See §5.7.2.)

An order relation ≤ can be defined on a lattice so that a ≤ b means that a ∨ b = b,
or, equivalently, that a ∧ b = a. Write a < b if a ≤ b and a ≠ b. (See §2.7.1.)

Facts:

1. If L is a lattice and a, b ∈ L, then a ∧ b and a ∨ b are unique.

2. Lattices as partially ordered sets: Every lattice is a partially ordered set using the
order relation ≤. (See §1.4.3; also see Chapter 11 for extended coverage.)

3. Every partially ordered set L in which glb{a, b} and lub{a, b} exist for all a, b ∈ L
can be regarded as a lattice by defining a ∨ b = lub{a, b} and a ∧ b = glb{a, b}.

4. The duality principle holds in all lattices: If a theorem is the consequence of the
definition of lattice, then the dual of the statement is also a theorem.

5. Lattice diagrams: Every finite lattice can be pictured in a poset diagram (Hasse
diagram), called a lattice diagram.

6. Idempotent laws: a ∨ a = a and a ∧ a = a for all a ∈ L.

Example: 

1. The following table gives examples of lattices.

set                    ∨ (join)                                ∧ (meet)

N                      a ∨ b = lcm{a, b}                       a ∧ b = gcd{a, b}

N                      a ∨ b = max{a, b}                       a ∧ b = min{a, b}

N^n                    (a_1, ..., a_n) ∨ (b_1, ..., b_n) =     (a_1, ..., a_n) ∧ (b_1, ..., b_n) =
                       (max(a_1, b_1), ..., max(a_n, b_n))     (min(a_1, b_1), ..., min(a_n, b_n))

all subgroups          H_1 ∨ H_2 = the intersection of         H_1 ∧ H_2 = H_1 ∩ H_2
of a group G           all subgroups of G containing
                       H_1 and H_2

all subsets of set S   A_1 ∨ A_2 = A_1 ∪ A_2                   A_1 ∧ A_2 = A_1 ∩ A_2
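
The lattice laws can be checked exhaustively on a small instance. The sketch below uses the divisors of 60 under lcm and gcd; this particular set is an assumption chosen because it is finite and closed under both operations, and the names `join`, `meet`, and `L` are hypothetical.

```python
from itertools import product
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

# Sample lattice: the divisors of 60, which are closed under gcd and lcm.
L = [d for d in range(1, 61) if 60 % d == 0]
join, meet = lcm, gcd

for a, b, c in product(L, repeat=3):
    assert join(a, join(b, c)) == join(join(a, b), c)     # associative laws
    assert meet(a, meet(b, c)) == meet(meet(a, b), c)

for a, b in product(L, repeat=2):
    assert join(a, b) == join(b, a) and meet(a, b) == meet(b, a)      # commutative laws
    assert join(a, meet(a, b)) == a and meet(a, join(a, b)) == a      # absorption laws
    # order relation: a <= b  iff  a v b = b  iff  a ^ b = a  (here: a divides b)
    assert (join(a, b) == b) == (meet(a, b) == a) == (b % a == 0)

print(len(L), "elements checked")
```

In this instance the induced order relation a ≤ b is divisibility, which the last assertion confirms.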


5.7.2 SPECIALIZED LATTICES 
Definitions: 

A lattice L is distributive if the following are true for all a, b, c ∈ L:

• a ∧ (b ∨ c) = (a ∧ b) ∨ (a ∧ c);

• a ∨ (b ∧ c) = (a ∨ b) ∧ (a ∨ c).

A lower bound (smallest element, least element) in a lattice L is an element 0 ∈ L
such that 0 ∧ a = 0 (equivalently, 0 ≤ a) for all a ∈ L.

An upper bound (largest element, greatest element) in a lattice L is an element
1 ∈ L such that 1 ∨ a = 1 (equivalently, a ≤ 1) for all a ∈ L.

A lattice L is bounded if L contains a lower bound 0 and an upper bound 1.

A lattice L is complemented if:

• L is bounded;

• for each a ∈ L there is an element b ∈ L (called a complement of a) such that
a ∨ b = 1 and a ∧ b = 0.

An element a in a bounded lattice L is an atom if 0 < a and there is no element b ∈ L
such that 0 < b < a.

Facts:

1. Each of the distributive properties in a lattice implies the other.

2. Not every lattice is distributive. (See Example 1.)

3. If a lattice is not distributive, it must contain a sublattice isomorphic to one of the
two lattices in the following figure.

5. Some infinite lattices are bounded, while others are not. (See Examples 2 and 3.)

6. In a complemented lattice, complements are not necessarily unique. See the lattice
in Example 4.

7. If L is a finite, complemented, distributive lattice and a ∈ L, then there is exactly
one set of atoms {a_1, ..., a_k} such that a = a_1 ∨ ... ∨ a_k.
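
Distributivity of a small lattice can be tested mechanically. The sketch below builds join and meet from an order relation (as lub and glb) and checks the two five-element lattices commonly called the pentagon and the diamond, which are the standard examples of non-distributive lattices. The element labels "0", "x", "y", "z", "1" are hypothetical and need not match the labels of the figure, and the helper names are assumptions.

```python
from itertools import product

def make_lattice(elements, covers):
    """Return (join, meet) for the finite lattice whose order is generated by `covers`.

    `covers` lists pairs (lower, upper); join and meet are computed as lub and glb.
    """
    leq = {(x, x) for x in elements} | set(covers)
    changed = True
    while changed:                       # transitive closure of the order relation
        changed = False
        for (a, b), (c, d) in product(list(leq), repeat=2):
            if b == c and (a, d) not in leq:
                leq.add((a, d))
                changed = True

    def join(a, b):
        uppers = [u for u in elements if (a, u) in leq and (b, u) in leq]
        return next(u for u in uppers if all((u, v) in leq for v in uppers))

    def meet(a, b):
        lowers = [w for w in elements if (w, a) in leq and (w, b) in leq]
        return next(w for w in lowers if all((v, w) in leq for v in lowers))

    return join, meet

def is_distributive(elements, join, meet):
    """Check a ^ (b v c) = (a ^ b) v (a ^ c) for all triples (the other law then follows)."""
    return all(meet(a, join(b, c)) == join(meet(a, b), meet(a, c))
               for a, b, c in product(elements, repeat=3))

# Hypothetical labels: 0 and 1 are the bounds, x, y, z the remaining elements.
E = ["0", "x", "y", "z", "1"]
pentagon = make_lattice(E, [("0", "x"), ("x", "y"), ("y", "1"), ("0", "z"), ("z", "1")])
diamond = make_lattice(E, [("0", "x"), ("0", "y"), ("0", "z"),
                           ("x", "1"), ("y", "1"), ("z", "1")])
F = ["0", "p", "q", "1"]
square = make_lattice(F, [("0", "p"), ("0", "q"), ("p", "1"), ("q", "1")])

print(is_distributive(E, *pentagon))   # False
print(is_distributive(E, *diamond))    # False
print(is_distributive(F, *square))     # True
```

The four-element "square" lattice is included only as a distributive control case.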


Examples: 

1. Neither lattice in Fact 3 is distributive. For example, in lattice L_1,

d ∨ (b ∧ c) = d, but (d ∨ b) ∧ (d ∨ c) = b

and in L_2,

d ∨ (b ∧ c) = d, but (d ∨ b) ∧ (d ∨ c) = a.


2. The lattice (N, ∨, ∧) where a ∨ b = max(a, b) and a ∧ b = min(a, b) is not bounded;
there is a lower bound (the integer 0), but there is no upper bound.

3. The following infinite lattice is bounded. The element 1 is an upper bound and the
element 0 is a lower bound.

4. The lattice in Example 3 is complemented, but complements are not unique in that
lattice. For example, the element a_1 has a_2, a_3, ... as complements.

5. In lattice L_1 of Fact 3, b and c are atoms. In the lattice of all subsets of a set S (see
Example 1), the atoms are the subsets of S of size 1.


5.8 BOOLEAN ALGEBRAS


Boolean algebra is a generalization of the algebra of sets and the algebra of logical 
propositions. It forms an abstract model of the design of circuits. 


5.8.1 BASIC CONCEPTS 


Definition: 


A Boolean algebra (B, +, ·, ', 0, 1) consists of a set B closed under two binary
operations, + (addition) and · (multiplication), and one monadic operation, ' (complementation),
and having two distinct elements, 0 and 1, such that the following laws are true
for all a, b, c ∈ B:

• commutative laws:   a + b = b + a                      a · b = b · a

• distributive laws:  a · (b + c) = (a · b) + (a · c)    a + (b · c) = (a + b) · (a + c)

• identity laws:      a + 0 = a                          a · 1 = a

• complement laws:    a + a' = 1                         a · a' = 0.

(George Boole, 1815-1864)


Notes: It is common practice to omit the “·” symbol in a Boolean algebra, writing ab
instead of a · b. The complement operation is also written using an overline: x' = x̄.
By convention, complementation is done first, then multiplication, and finally addition.
For example, a + bc' means a + (b(c')).

The dual of a statement in a Boolean algebra is the statement obtained by interchanging
the operations + and · and interchanging the elements 0 and 1 in the original statement.

Boolean algebras B_1 and B_2 are isomorphic (as Boolean algebras) if there is a function
φ: B_1 → B_2 that is one-to-one and onto B_2 such that for all a, b ∈ B_1:

• φ(a + b) = φ(a) + φ(b);

• φ(ab) = φ(a)φ(b);

• φ(a') = φ(a)'.

An element a ≠ 0 in a Boolean algebra is an atom if the following holds: if xa = x,
then either x = 0 or x = a; that is, if x ≤ a, then either x = 0 or x = a (see Fact 1).

The binary operation NAND, written | , is defined by a | b = (ab)'.

The binary operation NOR, written ↓ , is defined by a ↓ b = (a + b)'.

The binary operation XOR, written ⊕ , is defined by a ⊕ b = ab' + a'b.
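
The three derived operations are easy to experiment with on the two-element Boolean algebra {0, 1}. The sketch below, with hypothetical function names, defines NAND and NOR on bits and checks exhaustively that each of them alone can express complement, sum, and product; it also builds XOR from NAND only.

```python
def nand(a, b):
    return 1 - (a & b)       # a | b = (ab)'

def nor(a, b):
    return 1 - (a | b)       # a ↓ b = (a + b)'

for a in (0, 1):
    assert nand(a, a) == 1 - a == nor(a, a)                      # a'
    for b in (0, 1):
        assert nand(nand(a, a), nand(b, b)) == (a | b)           # a + b via NAND
        assert nand(nand(a, b), nand(a, b)) == (a & b)           # ab via NAND
        assert nor(nor(a, b), nor(a, b)) == (a | b)              # a + b via NOR
        assert nor(nor(a, a), nor(b, b)) == (a & b)              # ab via NOR

def xor_from_nand(a, b):
    """XOR (ab' + a'b) written with NAND only."""
    t = nand(a, b)
    return nand(nand(a, t), nand(b, t))

print([xor_from_nand(a, b) for a in (0, 1) for b in (0, 1)])
```

Here Python's `&` and `|` on the integers 0 and 1 play the roles of · and +.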


Facts: 

1. Every Boolean algebra is a bounded, distributive, complemented lattice where a ∨ b =
a + b and a ∧ b = ab. Hence, every Boolean algebra is a partially ordered set (where
a ≤ b if and only if a + b = b, or, equivalently, ab = a or a' + b = 1 or ab' = 0).

2. The duality principle holds in all Boolean algebras: if a theorem is the consequence
of the definition of Boolean algebra, then the dual of the theorem is also a theorem.

3. Structure of Boolean algebras: Every finite Boolean algebra is isomorphic to {0, 1}^n
for some positive integer n. Hence every finite Boolean algebra has 2^n elements. The
atoms are the n n-tuples of 0s and 1s with a 1 in exactly one position.

4. If B is a finite Boolean algebra and b ∈ B (b ≠ 0), there is exactly one set of atoms
a_1, ..., a_k such that b = a_1 + ... + a_k.

5. If a Boolean algebra B has n atoms, then B has 2^n elements.

6. The following laws are true in all Boolean algebras B, for all a, b, c ∈ B:

• associative laws: a + (b + c) = (a + b) + c, a(bc) = (ab)c
(Hence there is no ambiguity in writing a + b + c and abc.)

• idempotent laws: a + a = a, aa = a

• absorption laws: a(a + b) = a, a + ab = a

• domination (boundedness) laws: a + 1 = 1, a0 = 0

• double complement (involution) law: (a')' = a

• DeMorgan's laws: (a + b)' = a'b', (ab)' = a' + b'

• uniqueness of complement: if a + b = 1 and ab = 0, then b = a'.

7. Since every Boolean algebra is a lattice, every finite Boolean algebra can be pictured
using a partially ordered set diagram. (§11.1)
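
Several of these facts can be confirmed by exhaustive checking on a small Boolean algebra. The sketch below uses the algebra of all subsets of S = {1, 2, 3}, with union as +, intersection as ·, and complement in S as '. The choice of S and all names are assumptions made for illustration.

```python
from itertools import combinations, product

S = frozenset({1, 2, 3})
B = [frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)]
ZERO, ONE = frozenset(), S

def add(a, b):      # a + b is union
    return a | b

def mul(a, b):      # ab is intersection
    return a & b

def comp(a):        # a' is the complement in S
    return S - a

for a, b in product(B, repeat=2):
    assert comp(add(a, b)) == mul(comp(a), comp(b))              # DeMorgan's laws
    assert comp(mul(a, b)) == add(comp(a), comp(b))
    assert mul(a, add(a, b)) == a and add(a, mul(a, b)) == a     # absorption laws
    # Fact 1: the four descriptions of a <= b agree
    tests = {add(a, b) == b, mul(a, b) == a,
             add(comp(a), b) == ONE, mul(a, comp(b)) == ZERO}
    assert len(tests) == 1

# Atoms: nonzero a such that xa = x forces x = 0 or x = a.
atoms = [a for a in B if a != ZERO and all(x in (ZERO, a) for x in B if mul(x, a) == x)]

def atoms_below(b):
    """The set of atoms whose sum is b (Fact 4)."""
    return [a for a in atoms if mul(a, b) == a]

print(sorted(map(sorted, atoms)))
```

With n = 3 atoms the algebra has 2^3 = 8 elements, in line with Facts 3 and 5.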


Examples: 

1. {0, 1} is a Boolean algebra, where addition, multiplication, and complementation
are defined in the following tables:

   + | 0 1        · | 0 1        x | x'
  ---+-----      ---+-----      ---+----
   0 | 0 1        0 | 0 0        0 | 1
   1 | 1 1        1 | 0 1        1 | 0

2. If S is any set, then P(S) (the set of all subsets of S) is a Boolean algebra where

A_1 + A_2 = A_1 ∪ A_2,  A_1 · A_2 = A_1 ∩ A_2,  A' = Ā (the complement of A in S)

and 0 = ∅ and 1 = S.

3. Given n variables, the set of all compound propositions in these variables (identified
with their truth tables) is a Boolean algebra where

p + q = p ∨ q,  p · q = p ∧ q,  p' = ¬p

and 0 is a contradiction (the truth table with only values F) and 1 is a tautology (the
truth table with only values T).

4. If B is any Boolean algebra, then B^n = { (a_1, ..., a_n) | a_i ∈ B for all i } is a Boolean
algebra, where the operations are performed coordinatewise:

(a_1, ..., a_n) + (b_1, ..., b_n) = (a_1 + b_1, ..., a_n + b_n),

(a_1, ..., a_n) · (b_1, ..., b_n) = (a_1 · b_1, ..., a_n · b_n),

(a_1, ..., a_n)' = (a_1', ..., a_n').

In this Boolean algebra 0 = (0, ..., 0) and 1 = (1, ..., 1).

5. The statements in each of the following pairs are duals of each other:

a + b = cd,  ab = c + d;

a + (b + c) = (a + b) + c,  a(bc) = (ab)c;

a + 1 = 1,  a0 = 0.


5.8.2 BOOLEAN FUNCTIONS


Definitions: 

A Boolean expression in the variables x_1, ..., x_n is an expression defined recursively
by:

• 0, 1, and all variables x_i are Boolean expressions in x_1, ..., x_n;

• if E and F are Boolean expressions in the variables x_1, ..., x_n, then (EF), (E + F),
and E' are Boolean expressions in the variables x_1, ..., x_n.

A Boolean function of degree n is a function f: {0, 1}^n → {0, 1}.

A literal is a Boolean variable or its complement.

A minterm of the Boolean variables x_1, ..., x_n is a product of the form y_1 ... y_n where
for each i, y_i is equal to x_i or x_i'.

A maxterm of the Boolean variables x_1, ..., x_n is a sum of the form y_1 + ... + y_n where
for each i, y_i is equal to x_i or x_i'.

A Boolean function of degree n is in disjunctive normal form (DNF) (or sum-of-products
expansion) if it is written as a sum of distinct minterms in the variables
x_1, ..., x_n. (Note: disjunctive normal form is sometimes called full disjunctive normal
form.)

A Boolean function is in conjunctive normal form (CNF) (or product-of-sums
expansion) if it is written as a product of distinct maxterms.

A set of operators in a Boolean algebra is functionally complete if every Boolean
function can be written using only these operators.


Facts: 

1. Every Boolean function can be written as a Boolean expression.

2. There are 2^(2^n) Boolean functions of degree n. Examples of the 16 different Boolean
functions with two variables, x and y, are given in the following table.

x  y  |  1  x+y  x+y'  x'+y  x|y  x  y  x⊕y  (x⊕y)'  y'  x'  xy  xy'  x'y  x↓y  0
------+---------------------------------------------------------------------------
1  1  |  1  1    1     1     0    1  1  0    1       0   0   1   0    0    0    0
1  0  |  1  1    1     0     1    1  0  1    0       1   0   0   1    0    0    0
0  1  |  1  1    0     1     1    0  1  1    0       0   1   0   0    1    0    0
0  0  |  1  0    1     1     1    0  0  0    1       1   1   0   0    0    1    0



3. Every Boolean function (not identically 0) can be written in disjunctive normal
form. Either of the following two methods can be used:

(a) Rewrite the expression for the function so that no parentheses remain. For each
term that does not have a literal for a variable x_i, multiply that term by x_i + x_i'.
Multiply out so that no parentheses remain. Use the idempotent law to remove
any duplicate terms or duplicate factors.

(b) Make a table of values for the function. For each row where the function has the
value 1, form a minterm that yields 1 in only that row. Form the sum of these
minterms.

4. Every Boolean function (not identically 1) can be written in conjunctive normal
form. Any of the following three methods can be used:

(a) Write the negation of the expression in disjunctive normal form. Use DeMorgan's
laws to take the negation of this expression.

(b) Make a table of values for the function. For each row where the function has the
value 0, form a minterm that yields 1 in only that row. Form the sum of these
minterms. Use DeMorgan's laws to take the complement of this sum.

(c) Make a table of values for the function. For each row where the function has the
value 0, form a maxterm that yields 0 in only that row. Form the product of
these maxterms.


5. The following are examples of functionally complete sets, with explanations showing
how any Boolean function can be written using only these operations:

• {+, ·, '}: disjunctive normal form uses only the operators +, ·, and '

• {+, '}: DeMorgan's law (a · b)' = a' + b' allows the replacement of any
occurrence of a · b with an expression that does not use ·

• {·, '}: DeMorgan's law a + b = (a' · b')' allows the replacement of any
occurrence of a + b with an expression that does not use +

• { | }: write the expression for any function in DNF; use a' = a | a,
a + b = (a | a) | (b | b), and a · b = (a | b) | (a | b) to replace each
occurrence of ', +, and · with |

• { ↓ }: write the expression for any function in DNF; use a' = a ↓ a,
a + b = (a ↓ b) ↓ (a ↓ b), and a · b = (a ↓ a) ↓ (b ↓ b) to replace
each occurrence of ', +, and · with ↓.

6. The set {+, ·} is not functionally complete.
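
The table-based methods of Fact 3(b) and Fact 4(c) translate directly into code. The sketch below is one possible rendering; the function names and the string format for terms are assumptions, and a Boolean function is represented as an ordinary Python function on bits.

```python
from itertools import product

def minterm(names, row):
    """The product that is 1 only at `row`: complement the variables that are 0 there."""
    return "".join(n if bit else n + "'" for n, bit in zip(names, row))

def maxterm(names, row):
    """The sum that is 0 only at `row`: complement the variables that are 1 there."""
    return "(" + " + ".join(n + "'" if bit else n for n, bit in zip(names, row)) + ")"

def dnf(f, names):
    """Fact 3(b): sum of minterms over the rows where f is 1 (f not identically 0)."""
    rows = [r for r in product((1, 0), repeat=len(names)) if f(*r) == 1]
    return " + ".join(minterm(names, r) for r in rows)

def cnf(f, names):
    """Fact 4(c): product of maxterms over the rows where f is 0 (f not identically 1)."""
    rows = [r for r in product((1, 0), repeat=len(names)) if f(*r) == 0]
    return "".join(maxterm(names, r) for r in rows)

def f(x, y, z):
    return (x & ((1 - z) | ((1 - y) & z))) | (1 - x)      # x(z' + y'z) + x'

def g(x, y):
    return (x & (1 - y)) | ((1 - x) & y)                  # xy' + x'y

print(dnf(f, "xyz"))
print(cnf(g, "xy"))
```

The two sample functions are x(z' + y'z) + x' and xy' + x'y; any other function on bits can be passed in the same way.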


Examples: 

1. The function f: {0, 1}^3 → {0, 1} defined by f(x, y, z) = x(z' + y'z) + x' is a Boolean
function in the Boolean variables x, y, z. Multiplying out the expression for this function
yields f(x, y, z) = xz' + xy'z + x'. In this form the second term, xy'z, is a minterm in
the three variables x, y, z. The first and third terms are not minterms: the first term,
xz', does not use a literal for y, and the third term, x', does not use literals for y and z.

2. Writing a Boolean function in disjunctive normal form: To write the function f
from Example 1 in DNF using Fact 3(a), replace the terms xz' and x' with equivalent
minterms by multiplying these terms by 1 (= a + a') for each missing variable a:

xz' = xz' · 1 = xz'(y + y') = xyz' + xy'z'
x' = x' · 1 · 1 = x'(y + y')(z + z') = x'yz + x'yz' + x'y'z + x'y'z'.

Therefore,

f(x, y, z) = x(z' + y'z) + x'

           = xz' + xy'z + x'

           = xyz' + xy'z' + xy'z + x'yz + x'yz' + x'y'z + x'y'z'.

Alternatively, using Fact 3(b), the table of values for f yields 1 in all rows except
the row in which x = y = z = 1. Therefore minterms are obtained for the other rows,
yielding the same sum of seven minterms.

3. Writing a Boolean function in conjunctive normal form: Using Fact 4(a) to write
the function f(x, y) = xy' + x'y in CNF, first rewrite the negation of f in DNF, obtaining
f'(x, y) = xy + x'y'. The negation of f' is f''(x, y) = f(x, y) = (x' + y')(x + y).

Alternatively, using Fact 4(c), the function f has value 0 only when x = y = 1 and
x = y = 0. The maxterms that yield 0 in exactly one of these rows are x' + y' and x + y.
Therefore, in CNF f(x, y) = (x' + y')(x + y).


5.8.3 LOGIC GATES


Boolean algebra can be used to model circuitry, with 0s and 1s as inputs and outputs.
The elements of these circuits are gates that implement the Boolean operations.

Facts:

1. The following figure gives representations for the three standard Boolean operators,
+, ·, and ', together with representations for three related operators. (For example,
the AND gate takes two inputs, x and y, and produces one output, xy.)

[Figure: AND gate, OR gate, NOT gate (inverter); AND gate with n inputs, OR gate with n inputs.]

2. Gates can be extended to include cases where there are more than two inputs. The
figure of Fact 1 also shows an AND gate and an OR gate with multiple inputs. These
correspond to x_1 x_2 ... x_n and x_1 + x_2 + ... + x_n. (Since both operations satisfy the
associative laws, no parentheses are needed.)

Examples: 

1. The gate diagram for a half-adder: A half-adder is a Boolean circuit that adds two
bits, x and y, producing two outputs:

a sum bit s = (x + y)(xy)' (s = 0 if x = y = 0 or x = y = 1; s = 1 otherwise); 
a carry bit c = xy (c = 1 if and only if x = y = 1). 

The gate diagram for a half-adder is given in the following figure. This circuit is an 
example of a multiple output circuit since there is more than one output. 



© 2000 by CRC Press LLC 









2. The gate diagram for a full-adder: A full-adder is a Boolean circuit that adds three 
bits (x, y, and a carry bit c) and produces two outputs (a sum bit s and a carry bit c'). 
The full-adder gate diagram is given in the following figure. 
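
The two adders can be simulated on bits. In the sketch below the half-adder follows the formulas s = (x + y)(xy)' and c = xy given in Example 1. The full-adder is built from two half-adders and an OR gate, which is a common construction but is an assumption here, since the gate diagram itself is not reproduced in this text. Function names are hypothetical.

```python
def half_adder(x, y):
    """Return (sum bit, carry bit) for two input bits."""
    s = (x | y) & (1 - (x & y))    # s = (x + y)(xy)'
    c = x & y                      # c = xy
    return s, c

def full_adder(x, y, c_in):
    """Return (sum bit, carry-out bit) for two input bits and a carry-in bit.

    Assumed construction: two half-adders followed by an OR of their carries.
    """
    s1, c1 = half_adder(x, y)
    s, c2 = half_adder(s1, c_in)
    return s, c1 | c2

for x in (0, 1):
    for y in (0, 1):
        for c_in in (0, 1):
            s, c_out = full_adder(x, y, c_in)
            print(x, y, c_in, "->", s, c_out)
```

In every row the pair (carry-out, sum) is the two-bit binary representation of x + y + c.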



5.8.4 MINIMIZATION OF CIRCUITS 

Boolean expressions that appear to be different can yield the same combinatorial circuit.
For example, xyz + xyz' + x'y and y (as functions of x and y) have the same table of values
and hence yield the same circuit. (The first expression can be simplified to give the
second: xyz + xyz' + x'y = xy(z + z') + x'y = xy · 1 + x'y = xy + x'y = (x + x')y = 1y = y.)

Definitions: 

A Boolean expression is minimal (as a sum-of-products) if among all equivalent sum- 
of-products expressions it has the fewest number of summands, and among all sum- 
of-products expressions with that number of summands it uses the smallest number of 
literals in the products. 

A Karnaugh map for a Boolean expression written in disjunctive normal form is a 
diagram (constructed using the following algorithm) that displays the minterms in the 
Boolean expression. 

Facts: 

1. Minimization of circuits is an NP-hard problem. 

2. Don't care conditions: In some circuits, it may be known that some elements of the
input set for the Boolean function will never be used. Consequently, the values of the
expression for these elements are irrelevant. The values of the circuit function for these
unused elements of the input set are called don't care conditions, and the values can be
arbitrarily chosen to be 0 or 1. The blocks in the Karnaugh map where the function
values are irrelevant are marked with d. In the simplification process of the Karnaugh
map, 1s can be substituted for any or all of the d's in order to cover larger blocks of
boxes and achieve a simpler equivalent expression.

Algorithm: 

There is an algorithm for minimizing Boolean expressions by systematically grouping 
terms together. When carried out visually, the method uses a Karnaugh map (Maurice 
Karnaugh, born 1924). When carried out numerically using bit strings, the method is 
called the Quine-McCluskey method (Willard Quine, born 1908, Edward McCluskey, 
born 1929). 

1. Karnaugh map: To minimize a Boolean expression:

(a) Write the Boolean expression in disjunctive normal form.

(b) Obtain the Karnaugh map for this Boolean expression. The layout of the table
depends on the number of variables under consideration.

The grids for Boolean expressions with two variables (x and y), three variables
(x, y, and z), and four variables (w, x, y, and z) are shown in the following figure.
Each square in each grid corresponds to exactly one minterm: the product of the row
heading and the column heading. For example, the upper right box in the grid of part
(a) of the figure represents the minterm xy'; the lower right box in the grid of part (c)
of the figure represents w'xyz'.

[Figure: Karnaugh map grids for (a) two variables, (b) three variables, (c) four variables.]

The headings are placed in a certain order so that adjacent squares in any row (or
column) differ in exactly one literal in their row headings (or column headings). The
first and last squares in any row (or column) are to be regarded as adjacent. (The
variable names can be permuted; for example, in part (b) of the figure, the row headings
can be y and y' and the column headings can be xz, xz', x'z', and x'z. The column
headings could also have been written in order as yz, y'z, y'z', yz' or y'z, y'z', yz', yz.)

The Karnaugh map for the Boolean expression is obtained by placing a checkmark 
in each square corresponding to a minterm in the expression. 

(c) Find the best covering. A geometric version of the distributive law is used to “cover”
groups of the adjacent marked squares, with every marked square covered at least once
and each group covered being as large as possible. The possible ways of covering squares
depend on the number of variables.

(For example, working with two variables and using the distributive law, x'y + x'y' =
x'(y + y') = x' · 1 = x'. This corresponds to covering the two boxes in the bottom row of
the first 2 × 2 grid in the following figure and noting that the only literal common to
both boxes is x'.

Similarly, working with three variables, xyz' + xy'z' + x'yz' + x'y'z' = xz'(y + y') +
x'z'(y + y') = xz' + x'z' = (x + x')z' = z'. This corresponds to covering the four boxes
in the second and third columns of the third 2 × 4 grid in the second row of the figure
and noting that z' is the only common literal.)

The following table shows what groups of boxes can be covered, for expressions
with 2, 3, and 4 variables. These are the combinations whose expressions can be
simplified to a single minterm. Examples for 2, 3, and 4 variables are shown in the previous
figure. (The method is awkward to use when there are more than 4 variables.)

# variables   groups of boxes that can be covered

2             1×1, 1×2, 2×1, 2×2

3             1×1, 1×2, 1×4, 2×1, 2×2, 2×4

4             1×1, 1×2, 1×4, 2×1, 2×2, 2×4, 4×1, 4×2, 4×4


To obtain the minimization, cover boxes according to the following rules: 

• cover all marked boxes at least once 

• cover the largest possible blocks of marked boxes 

• do not cover any unmarked box 

• use the fewest blocks possible. 

(d) Find the product of common literals for each of the blocks and form the sum of 
these products to obtain the minimization. 

2. Quine-McCluskey method:

(a) Write the Boolean expression in disjunctive normal form, and in each summand list
the variables in alphabetical order. Identify with each term a bit string, using a 1 if the
literal is not a complement and 0 if the literal is a complement. (For example, v'wx'yz
is represented by 01011.)

(b) Form a table with the following columns:

column 1: Make a numbered list of the terms and their bit strings, beginning with
the terms with the largest number of uncomplemented variables. (For example,
wxy'z precedes wx'yz'.)

column 2: Make a list of pairs of terms from column 1 where the literals in the
two terms differ in exactly one position. Use a distributive law to add and
simplify the two terms and write the numbers of these terms and the sum of
the terms in the second column, along with its bit string, using “-” in place of
the variable that no longer appears in the sum. (For example, xyz' and xy'z'
can be combined to yield xz' with bit string 1-0.)

columns 3, 4, etc.: To obtain column 3 combine the terms in column 2 in pairs
according to the same procedure as that used to construct column 2. Repeat
this process until no more terms can be combined.

(c) Form a table with a row for each of the terms that cannot be used to form terms with
fewer variables and a column for each of the original terms in the disjunctive normal
form of the original expression. Mark the square in the (i, j)-position if the minterm in
column j could be a summand for the term in row i.

(d) Find a set of rows, with as few rows as possible, such that every column has been 
marked at least once in at least one row. The sum of the products labeling these rows 
minimizes the original expression. 
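
The combining procedure of the Quine-McCluskey method lends itself to a short program. The following Python sketch is illustrative only and is not taken from the Handbook: the function names (combine, prime_implicants, covers, minimum_covers) are hypothetical, terms are assumed to be supplied as bit strings as in step (a), and step (d) is carried out by exhaustive search, which is practical only for small inputs.

```python
from itertools import combinations


def combine(a, b):
    """Merge two bit strings that differ in exactly one position (step (b)).

    Returns the merged string with '-' at that position, or None if the
    strings cannot be combined.
    """
    diff = [i for i, (p, q) in enumerate(zip(a, b)) if p != q]
    if len(diff) == 1 and "-" not in (a[diff[0]], b[diff[0]]):
        i = diff[0]
        return a[:i] + "-" + a[i + 1:]
    return None


def prime_implicants(minterms):
    """Build columns 1, 2, 3, ... and return the terms never used in a combination."""
    current = set(minterms)
    primes = set()
    while current:
        used, nxt = set(), set()
        for a, b in combinations(sorted(current), 2):
            merged = combine(a, b)
            if merged is not None:
                used.update((a, b))
                nxt.add(merged)
        primes |= current - used   # terms that could not be combined further
        current = nxt              # next column of the table
    return primes


def covers(implicant, minterm):
    """True if the minterm could be a summand for the implicant (step (c))."""
    return all(p == "-" or p == q for p, q in zip(implicant, minterm))


def minimum_covers(minterms):
    """Step (d) by brute force: all smallest sets of rows covering every column."""
    primes = sorted(prime_implicants(minterms))
    for size in range(1, len(primes) + 1):
        found = [set(c) for c in combinations(primes, size)
                 if all(any(covers(p, m) for p in c) for m in minterms)]
        if found:
            return found
    return []


if __name__ == "__main__":
    # xyz' (110) and xy'z' (100) combine to xz' (1-0), as in step (b).
    print(prime_implicants(["110", "100"]))  # {'1-0'}
```

The exhaustive search in minimum_covers is a deliberate simplification; it returns every cover of minimum size rather than a single one.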
Examples:


1. Simplify w'x'y + w'z(xy + x'y') + w'x'z' + w'xyz' + wx'y'z' (an expression in four
variables) using a Karnaugh map.


First write the expression in disjunctive normal form: 


w'x'y + w'z(xy + x'y') + w'x'z' + w'xyz' + wx'y'z' =

w'x'yz + w'x'yz' + w'xyz + w'x'y'z + w'x'y'z' + w'xyz' + wx'y'z'


Next, draw its Karnaugh map. See part (a) of the following figure. A covering is
given in part (b) of the figure. Note that in order to use larger blocks, some squares
have been covered more than once. Also note that w'x'yz, w'xyz, w'x'yz', and w'xyz'
are covered with one 2x2 block rather than with two 1x2 blocks. In the three blocks
the common literals are: w'x', w'y, and x'y'z'.

Finally, form the sum of these products: w'x' + w'y + x'y'z'.



[Figure: (a) Karnaugh map of the expression, with rows wx, wx', w'x', w'x and columns yz, y'z, y'z', yz'; (b) the same map with a covering by three blocks.]


2. Minimize w'xy'z + wxyz' + wx'yz' + w'x'yz + wxyz + w'x'y'z + w'xyz (an expression
in four variables) using the Quine-McCluskey method.

Step (b) of the Quine-McCluskey method yields the following table. 


 #  term      bits  |  pair  term    bits  |  terms       term  bits
 1  wxyz      1111  |  1,2   wxy     111-  |  3, 5, 6, 7  w'z   0--1
 2  wxyz'     1110  |  1,3   xyz     -111  |
 3  w'xyz     0111  |  2,4   wyz'    1-10  |
 4  wx'yz'    1010  |  3,5   w'yz    0-11  |
 5  w'x'yz    0011  |  3,6   w'xz    01-1  |
 6  w'xy'z    0101  |  5,7   w'x'z   00-1  |
 7  w'x'y'z   0001  |  6,7   w'y'z   0-01  |



The four terms w'z, wxy, xyz, wyz' were not used in combining terms, so they become 
the names of the rows in the following table. 



      wxyz     wxyz'    w'xyz    wx'yz'   w'x'yz   w'xy'z   w'x'y'z
w'z                     √                 √        √        √
wxy   √        √
xyz   √                 √
wyz'           √                 √




There are two ways to cover the seven minterms:

w'z, wxy, wyz' or w'z, xyz, wyz'.

This yields two ways to minimize the original expression:

w'z + wxy + wyz' and w'z + xyz + wyz'.
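
Both examples can be checked mechanically by comparing truth tables over all 16 assignments of w, x, y, z. The Python sketch below is an illustrative check and not part of the Handbook; the helper names (c, equivalent, ex1_original, and so on) are hypothetical, and values are assumed to be the integers 0 and 1.

```python
from itertools import product


def c(a):
    """Complement of a 0/1 value (written a' in the text)."""
    return 1 - a


def equivalent(f, g, nvars=4):
    """True if f and g agree on every assignment of 0/1 values."""
    return all(bool(f(*v)) == bool(g(*v))
               for v in product((0, 1), repeat=nvars))


# Example 1: original expression and its Karnaugh-map minimization.
def ex1_original(w, x, y, z):
    return ((c(w) & c(x) & y)
            | (c(w) & z & ((x & y) | (c(x) & c(y))))
            | (c(w) & c(x) & c(z))
            | (c(w) & x & y & c(z))
            | (w & c(x) & c(y) & c(z)))


def ex1_minimized(w, x, y, z):
    return (c(w) & c(x)) | (c(w) & y) | (c(x) & c(y) & c(z))


# Example 2: original expression and the two Quine-McCluskey minimizations.
def ex2_original(w, x, y, z):
    return ((c(w) & x & c(y) & z) | (w & x & y & c(z)) | (w & c(x) & y & c(z))
            | (c(w) & c(x) & y & z) | (w & x & y & z)
            | (c(w) & c(x) & c(y) & z) | (c(w) & x & y & z))


def ex2_min_a(w, x, y, z):
    return (c(w) & z) | (w & x & y) | (w & y & c(z))


def ex2_min_b(w, x, y, z):
    return (c(w) & z) | (x & y & z) | (w & y & c(z))


if __name__ == "__main__":
    print(equivalent(ex1_original, ex1_minimized))
    print(equivalent(ex2_original, ex2_min_a))
    print(equivalent(ex2_original, ex2_min_b))
```

A truth-table comparison confirms only that the expressions are equal as Boolean functions; it does not by itself show that no shorter expression exists.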
INTRODUCTION


Concepts from linear algebra play an important role in various applications of discrete 
mathematics, as in coding theory, computer graphics, generation of pseudo-random 
numbers, graph theory, and combinatorial designs. This chapter discusses fundamental 
concepts of linear algebra, computational aspects, and various applications. 


GLOSSARY 

access (of a class): The class $C_i$ of vertices has access to class $C_j$ if either $i = j$ or
there is a path from a vertex in $C_i$ to a vertex in $C_j$.

adjoint: See Hermitian adjoint. 

algebraic multiplicity : given an eigenvalue, the multiplicity of the eigenvalue as a 
root of the characteristic equation. 

augmented matrix (of a linear system): the matrix obtained by appending the right- 
hand side vector to the coefficient matrix as its rightmost column. 

back substitution: a procedure for solving an upper triangular linear system. 

basic class (of a matrix): a class such that the Perron root of the corresponding 
principal submatrix equals that of the entire matrix. 

basis: an independent spanning set of vectors in a vector space. 

characteristic equation: for a square matrix $A$, the equation $p_A(\lambda) = 0$, where $p_A(\lambda)$
is the characteristic polynomial of $A$.

characteristic polynomial: for a square matrix $A$, the polynomial (in the indeterminate
$\lambda$) given by $p_A(\lambda) = \det(\lambda I - A)$.

Cholesky decomposition: expressing a matrix $A$ as $A = LL^T$, where $L$ is lower
triangular and every entry on the main diagonal of $L$ is positive.

circulant: a matrix in which every row is obtained by a single cyclic shift of the 
previous row. 

class (of a matrix) : a maximal set of row indices such that the corresponding vertices 
have mutual access in the directed graph of the matrix. 

complete pivoting: an implementation of Gaussian elimination in which a pivot of 
largest magnitude is selected at each step. 

condition number: given a matrix $A$, the number $\kappa(A) = \|A\| \, \|A^{-1}\|$.

conjugate sequence (of a sequence): the sequence whose nth term is the number of 
terms not less than n in the given sequence. 

dependent set: a set of vectors in a vector space that are not independent. 

determinant: given an $n \times n$ matrix $A$, $\det A = \sum_{\sigma \in S_n} \mathrm{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$,
where $S_n$ is the symmetric group on $n$ elements and the coefficient $\mathrm{sgn}(\sigma)$ is the
sign of the permutation $\sigma$: $1$ if $\sigma$ is an even permutation and $-1$ if $\sigma$ is an odd
permutation.

diagonal matrix: a square matrix with nonzero elements only on the main diagonal. 

diagonalizable matrix: a square matrix that is similar to a diagonal matrix. 

difference (of matrices of the same dimensions): the matrix each of whose elements 
is the difference between corresponding elements of the original matrices. 
dimension: for a vector space V, the number of vectors in any basis for V.

directed graph (of a matrix $A$): the graph $G(A)$ with vertices corresponding to the
rows of $A$ and an edge from $i$ to $j$ whenever $a_{ij}$ is nonzero.

direct sum (of subspaces): given subspaces U and W, the sum of subspaces in which 
U and W have only the zero vector in common. 

distance (between vectors): given vectors v and w, the length of the vector v — w. 

dominant eigenvalue : given a matrix, an eigenvalue of the matrix of maximum mod- 
ulus. 

dot product (of real vectors): given real vectors $x = (x_1, \ldots, x_n)$ and $y = (y_1, \ldots, y_n)$,
the number $x \cdot y = \sum_{i=1}^{n} x_i y_i$.

doubly stochastic matrix : a matrix with all entries nonnegative and with all row 
and column sums equal to 1. 

eigenvalue: given a square matrix $A$, a scalar $\lambda$ such that $Ax = \lambda x$ for some nonzero
vector $x$.

eigenvector: given a square matrix $A$, a nonzero vector $x$ such that the vector $Ax$ is
a scalar multiple of $x$.

eigenspace: given a square matrix $A$, the vector space $\{x \mid Ax = \lambda x\}$ for some
scalar $\lambda$.

exponent (of a matrix): given a matrix $A$, the least positive integer $m$, if it exists,
such that $A^m$ has all positive entries.

fill: in Gaussian elimination, those nonzero entries created in the triangular factors of
a matrix corresponding to zero entries in the original matrix.

final class: given a matrix, a class of the matrix with access to no other class.

flop: a multiply-add operation involving a single multiplication followed by a single
addition.

forward substitution: a procedure for solving a lower triangular linear system. 

fully indecomposable matrix: a matrix that is not partly decomposable. 

Gaussian elimination: a solution procedure that at each step uses one equation to 
eliminate one variable from the system of equations. 

geometric multiplicity: the dimension of the eigenspace. 

Geršgorin discs: regions in the complex plane that collectively are guaranteed to
contain all the eigenvalues of a given matrix.

growth factor : a ratio that measures how large the entries of a matrix become during 
Gaussian elimination. 

Hermitian adjoint: given a matrix $A$, the matrix $A^*$ obtained from the transpose $A^T$
by replacing each entry by its complex conjugate.

Hermitian matrix: a complex matrix whose transpose is its (elementwise) complex 
conjugate. 

idempotent matrix: a matrix $A$ such that $A^2 = A$.

identity matrix: a diagonal matrix in which each diagonal element is 1. 

ill-conditioned system: a linear system Ax = b whose solution x is extremely sensi- 
tive to errors in the data A and b. 
independent set: a set of vectors in a vector space that is not dependent.

index of cyclicity: for a matrix, the number of eigenvalues with maximum modulus. 

inner product: a field-valued function of two vector variables used to define a notion 
of orthogonality (that is, perpendicularity). In real or complex vector spaces it is 
also used to introduce length, distance, and convergence. 

inverse: given a square matrix A , the square matrix whose product with the 
original matrix is the identity matrix. 

invertible matrix: a matrix that has an inverse. 
irreducible matrix: a matrix that is not reducible. 

isomorphic (vector spaces): vector spaces that are structurally identical. 

kernel (of a linear transformation): the set of all vectors that are mapped to the zero 
vector by the linear transformation. 

length (of a vector): the square root of the inner product of the vector with itself. 

linear combination (of vectors): given vectors $v_1, v_2, \ldots, v_t$, a vector of the form
$a_1 v_1 + a_2 v_2 + \cdots + a_t v_t$, where the $a_i$ are scalars.

linear operator: a linear transformation from a vector space to itself. 

linear system: a set of m linear equations in n variables x, represented by Ax = b; 
here A is the coefficient matrix and b is the right-hand side vector. 

linear transformation: a function $T$ from one vector space over $F$ to another vector
space over $F$ satisfying $T(au + v) = aT(u) + T(v)$ for all vectors $u$, $v$ and all scalars $a$.

lower triangular matrix: a matrix in which all nonzero elements occur either on or 
below the diagonal. 

LU decomposition: expressing a matrix A as the product A = LU, where L is unit 
lower triangular and U is upper triangular. 

Markowitz pivoting: a simple greedy strategy for reducing the number of nonzero 
entries introduced during the LU decomposition of a sparse matrix. 

matrix (of a linear transformation): given a linear transformation T, a matrix associ- 
ated with T that represents T with respect to a fixed basis. 

minimal polynomial: for a matrix $A$, the monic polynomial $q(\cdot)$ of minimum degree
such that $q(A) = 0$.

minimum degree algorithm: a version of the Markowitz pivoting strategy for sym- 
metric coefficient matrices. 

minor: the determinant of a square submatrix of a given matrix. 
modulus: the absolute value of a complex number. 

nilpotent matrix: a matrix $A$ such that $A^k = 0$ for some positive integer $k$.
nonnegative matrix: a matrix with each entry nonnegative. 
nonsingular matrix: a matrix that has an inverse. 

normal matrix: a matrix $A$ such that $AA^* = A^*A$ ($A^*$ is the Hermitian adjoint of $A$).

nullity (of a linear transformation): the dimension of the kernel of the linear transfor- 
mation. 

nullity (of a matrix): the dimension of the null space of the matrix. 
null space (of a matrix A): the set of all vectors x for which Ax = 0. 
numerically stable algorithm: an algorithm whose accuracy is not greatly harmed
by roundoff errors. 

numerically unstable algorithm: an algorithm that can return an inaccurate solu- 
tion even when the solution is relatively insensitive to errors in the data. 

orthogonal matrix: a real square matrix whose inverse is its transpose. 

orthogonal set (of vectors): a set of vectors in which any two distinct vectors have 
inner product zero. 

orthonormal set (of vectors): a set of unit length orthogonal vectors. 

partial pivoting: an implementation of Gaussian elimination which at step k selects 
the pivot of largest magnitude in column k. 

partly decomposable (matrix): an $n \times n$ matrix containing a zero submatrix of size
$k \times (n - k)$ for some $1 \le k \le n - 1$.

permanent (of an $n \times n$ matrix $A$): $\mathrm{per}(A) = \sum_{\sigma \in S_n} a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$, where $S_n$
is the symmetric group on $n$ elements.

permutation matrix: a square 0-1 matrix in which the entry 1 occurs exactly once 
in each row and exactly once in each column. 

Perron root: the spectral radius of a nonnegative matrix. 

pivot: the coefficient of the eliminated variable in the equation used to eliminate it. 

positive definite matrix: a Hermitian matrix $A$ such that $x^* A x > 0$ for all $x \ne 0$.

positive matrix: a matrix with each entry positive.

positive semidefinite matrix: a Hermitian matrix $A$ such that $x^* A x \ge 0$ for all $x$.

power (of a square matrix): the square matrix obtained by multiplying the matrix by 
itself the required number of times. 

primitive matrix: a matrix with a finite exponent. 

principal minor (of a matrix) : the determinant of a principal submatrix of the matrix. 

principal submatrix (of a matrix A): the matrix obtained from A by deleting all but 
a specified set of rows and the same set of columns. 

product (of matrices): for an m x n matrix A and an n x p matrix B , the m x p 
matrix AB whose ij-entry is the scalar product of row i of A and column j of B. 

range (of a linear transformation T): the set of all vectors w for which T(v) = w has 
a solution. 

rank (of a linear transformation T): the dimension of the range of T. 

rank (of a matrix): the maximum number of linearly independent rows (or columns) 
in the matrix. 

reducible matrix: a matrix $A$ with $a_{ij} = 0$ for all $i \in S$, $j \notin S$, for some nonempty proper subset $S$ of the indices.

roundoff errors: the errors associated with storing and computing numbers in finite 
precision arithmetic on a digital computer. 

row stochastic matrix: a matrix with all entries nonnegative and row sums 1. 

scalar: an element of a field. 

scalar multiple (of a matrix): the matrix obtained by multiplying each element of 
the original matrix by the scalar. 

scalar product: See dot product. 
similar matrices: square matrices $A$ and $B$ satisfying the equation $P^{-1}BP = A$ for
some invertible matrix $P$.

singular matrix : a matrix that has no inverse. 

singular values (of a matrix A): the positive square roots of the eigenvalues of AA*, 
where A* is the Hermitian adjoint of A. 

skew-Hermitian matrix: a matrix equal to the negative of its Hermitian adjoint. 

skew-symmetric matrix: a matrix equal to the negative of its transpose. 

span (of a set of vectors): all vectors obtainable as linear combinations of the given 
vectors. 

spanning set: a set of vectors in a vector space V whose span equals V. 

sparse matrix: a matrix that has relatively few nonzero entries. 

spectral radius (of a matrix): the maximum modulus of an eigenvalue of the matrix. 

square matrix: a matrix having the same number of rows and columns. 

strictly diagonally dominant matrix: a square matrix each of whose diagonal ele- 
ments exceeds in modulus the sum of the moduli of all other elements in that row. 

strictly totally positive matrix: a matrix with all minors positive. 

submatrix (of a matrix A): the matrix obtained from A by deleting all but a certain 
set of rows and a certain set of columns. 

subspace: a vector space within a vector space. 

sum (of matrices): for two matrices of the same dimensions, the matrix each of whose 
elements is the sum of the corresponding elements of the original matrices. 

sum (of subspaces): given subspaces $U$ and $W$, the subspace consisting of all possible
sums $u + w$ where $u \in U$ and $w \in W$.

symmetric matrix: a matrix that equals its transpose. 

term rank (of a 0-1 matrix): the maximum number of 1s such that no two are in the
same row or column.

trace: given a square matrix, the sum of the diagonal elements of the matrix. 

transpose (of a matrix): for a matrix $A$, the matrix $A^T$ whose columns are the rows
of the original matrix.

tridiagonal matrix: a matrix whose nonzero entries are either on the main diagonal 
or immediately above or below the main diagonal. 

unitary matrix: a square matrix whose inverse is its Hermitian adjoint. 

unit triangular matrix: a (lower or upper) triangular matrix having all diagonal 
entries 1. 

upper triangular matrix: a matrix in which all nonzero elements occur either on or 
above the main diagonal. 

vector: an individual object of a vector space. 

vector space: a collection of objects that can be added and multiplied by scalars, 
always yielding another object in the collection. 

well-conditioned system: a linear system Ax = b whose solution x is relatively 
insensitive to errors in the data A and b. 

0-1 matrix: a matrix with each entry either 0 or 1. 
6.1 VECTOR SPACES


The concept of a “vector” comes initially from the physical world, where a vector is a 
quantity having both magnitude and direction (for example, force and velocity). The 
mathematical concept of a vector space generalizes these ideas, with applications in 
coding theory, finite geometry, cryptography, and other areas of discrete mathematics. 


6.1.1 BASIC CONCEPTS 
Definitions: 

A vector space over a field $F$ (§5.6.1) is a triple $(V, \oplus, \cdot)$ consisting of a set $V$ and two
operations, $\oplus$ (vector addition) and $\cdot$ (scalar multiplication), such that:

• $(V, \oplus)$ is an abelian group (§5.2.1); i.e., $\oplus$ is a function $(u, v) \to u \oplus v$ from $V \times V$
to $V$ such that:

$(u \oplus v) \oplus w = u \oplus (v \oplus w)$ for all $u, v, w \in V$;

there is a vector $0$ such that $v \oplus 0 = v$ for all $v \in V$;

for each $v \in V$ there is $-v \in V$ such that $v \oplus (-v) = 0$;

$u \oplus v = v \oplus u$ for all $u, v \in V$;

• the operation $\cdot$ is a function $(a, v) \to a \cdot v$ from $F \times V$ to $V$ such that for all $a, b \in F$
and $u, v \in V$ the following properties hold:

$a \cdot (b \cdot v) = (ab) \cdot v$;

$(a + b) \cdot v = (a \cdot v) \oplus (b \cdot v)$;

$a \cdot (u \oplus v) = (a \cdot u) \oplus (a \cdot v)$;

$1 \cdot v = v$.

Here, $ab$ and $a + b$ represent multiplication and addition of elements $a, b \in F$.

The scalars are the elements of F, the vectors are the elements of V, and the set V 
itself is often also called the vector space. 

The difference of two vectors $u$ and $v$ is the vector $u - v = u \oplus (-v)$ where $-v$ is the
negative of $v$ in the abelian group $(V, \oplus)$.

Notation: While vector addition $\oplus$ and field addition $+$ can be quite different, it is
customary to use the same notation $+$ for both. It is also customary to write $av$ instead
of $a \cdot v$, and to use the symbol $0$ for the additive identities of the vector space $V$ and
the field $F$.

Facts: 

Assume that V is a vector space over F. 

1. $a0 = 0$ and $0v = 0$ for all $a \in F$ and $v \in V$.

2. $(-1)v = -v$ for all $v \in V$.

3. If $av = 0$ for $a \in F$ and $v \in V$, then either $a = 0$ or $v = 0$.

4. Cancellation property: For all $u, v, w \in V$, if $u + v = w + v$, then $u = w$.

5. $a(u - v) = au - av$ for all $a \in F$ and $u, v \in V$.
Examples:

1. Force vectors: Forces in the plane can be represented by geometric vectors such as $F_1$
and $F_2$ in part (a) of the following figure; addition of these vectors is carried out using
the so-called parallelogram law. By introducing a coordinate system and locating the
initial point of each directed line segment at the origin (0,0), each geometric vector can
be named by its terminal point. Thus, a vector in the plane becomes a pair $(x, y) \in \mathcal{R}^2$ of
real numbers. The parallelogram law of addition translates into componentwise addition
(part (c) of the figure), while stretching (respectively, shrinking, negating) translates to
componentwise multiplication by a real number $r > 1$ (respectively, $0 < r < 1$, $r = -1$).
Three-dimensional force vectors are similarly represented using triples $(x, y, z) \in \mathcal{R}^3$.




[Figure: (a) Addition; (b) Stretching, shrinking, negating; (c) Addition of components.]

2. Euclidean space: Generalizing Example 1, $n$-dimensional Euclidean space consists
of all $n$-tuples of real numbers $\mathcal{R}^n = \{ (x_1, x_2, \ldots, x_n) \mid x_i \in \mathcal{R} \}$.

3. If $F$ is any field, then $F^n = \{ (x_1, x_2, \ldots, x_n) \mid x_i \in F \}$ is a vector space, where
addition and scalar multiplication are componentwise:

$(x_1, x_2, \ldots, x_n) + (y_1, y_2, \ldots, y_n) = (x_1 + y_1, x_2 + y_2, \ldots, x_n + y_n)$
$a(x_1, x_2, \ldots, x_n) = (ax_1, ax_2, \ldots, ax_n)$

where $a \in F$. When $F = \mathcal{R}$, these are the vectors mentioned in Examples 1 and 2.

4. A vector space over $Z_2$: $V$ consists of the 128 subsets of the set {1, 2, ..., 7}
as represented by binary 7-tuples; for example, the subset {1,4, 5, 7} corresponds to 
(1, 0, 0, 1, 1, 0, 1) and the subset {1, 2, 3, 4} to (1, 1, 1, 1, 0, 0, 0). The operations on V are 
componentwise addition and scalar multiplication mod 2. In this vector space, the sum 
of two members of V corresponds to the symmetric difference (§1.2.2) of the associated 
sets. (This example is a special case of Example 3.) 

5. A finite affine plane over $Z_5$: $V$ consists of all pairs $(x, y)$ where $x, y \in Z_5$ and
where addition and scalar multiplication are componentwise modulo 5. This special
case of Example 3 arises in finite geometry where the 25 members of $V$ are thought
of as "points" and the sets of solutions to equations of the form $ax + by = c$ (where
$a, b, c \in Z_5$ with at least one of $a$, $b$ nonzero) are viewed as "lines".

6. Infinite binary sequences: $V$ consists of all infinite binary sequences $\{ (s_1, s_2, \ldots) \mid s_i \in Z_2 \}$
where addition and multiplication are componentwise mod 2. As in Example 4,
each $s \in V$ may be viewed as a subset of the positive integers, but each $s$ may also
be viewed as a potential "message" or "data" stream; for example, each group of 7
consecutive members of $s$ could represent a letter in the 7-bit ASCII code.

7. $V = F^{m \times n}$, the set of all $m \times n$ matrices over $F$, is a vector space, where vector
addition is the usual matrix addition and scalar multiplication is the usual scalar-by-matrix
multiplication (§6.3.2). When $m = 1$, this reduces to Example 3.
8. Let $V = E$ be a field and $F$ a subfield. Then $V$ is a vector space over $F$ where
vector addition and scalar multiplication are the addition and multiplication of $E$. In
particular, the finite field $F_q$ of prime power order $q = p^n$ is a vector space over the
subfield $F_p$.

9. Let $V = F[x]$, the set of all polynomials (§5.5.2) over $F$ in an indeterminate $x$.
Then V is a vector space over F, where addition is ordinary polynomial addition and 
scalar multiplication is the usual scalar-by-polynomial multiplication. 

10. For a nonempty set $X$ and a given vector space $U$ over $F$, let $V$ denote the set of
all functions from $X$ to $U$. The sum $f + g$ of two vectors (functions) $f, g \in V$ is defined
by $(f + g)(x) = f(x) + g(x)$ for all $x \in X$ and the scalar multiplication $af$ of $a \in F$ by
$f \in V$ is defined by $(af)(x) = af(x)$. (For specific cases of this general vector space,
see §6.1.2, Examples 13-15.)
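
The correspondence in Example 4 between subsets of {1, 2, ..., 7} and binary 7-tuples can be tried out directly. The Python sketch below is illustrative only; the helper names to_tuple, add_mod2, and scale_mod2 are hypothetical and do not come from the Handbook. It encodes the two subsets named in Example 4 and compares their sum in $V$ with their symmetric difference.

```python
def to_tuple(subset, n=7):
    """Binary n-tuple whose i-th component is 1 exactly when i is in the subset."""
    return tuple(1 if i in subset else 0 for i in range(1, n + 1))


def add_mod2(u, v):
    """Componentwise addition mod 2 (vector addition in Example 4)."""
    return tuple((a + b) % 2 for a, b in zip(u, v))


def scale_mod2(a, v):
    """Scalar multiplication by a in Z_2 = {0, 1}."""
    return tuple((a * c) % 2 for c in v)


if __name__ == "__main__":
    A = {1, 4, 5, 7}
    B = {1, 2, 3, 4}
    print(to_tuple(A))                          # (1, 0, 0, 1, 1, 0, 1)
    print(to_tuple(B))                          # (1, 1, 1, 1, 0, 0, 0)
    print(add_mod2(to_tuple(A), to_tuple(B)))   # (0, 1, 1, 0, 1, 0, 1)
    # Python's ^ on sets is the symmetric difference.
    print(add_mod2(to_tuple(A), to_tuple(B)) == to_tuple(A ^ B))  # True
```

Tuples are used here rather than lists so that vectors can be compared and stored in sets; that choice is a convenience of the sketch, not something the example requires.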


6.1.2 SUBSPACES 
Definitions: 

A subspace of a vector space $V$ is a nonempty subset $W$ of $V$ that is a vector space
under the addition and scalar multiplication operations inherited from $V$.

The sum of two subspaces $U, W \subseteq V$ is the set $\{ u + w \mid u \in U, w \in W \}$. If $U \cap W = \{0\}$,
their sum is called the direct sum, denoted $U \oplus W$.

If $A$ is an $m \times n$ matrix over $F$, the null space $NS(A)$ of $A$ is $\{ x \in F^{n \times 1} \mid Ax = 0 \}$.
The null space of $A$ is also called the right null space when contrasted with the left
null space $LNS(A)$ defined by $\{ y \in F^{1 \times m} \mid yA = 0 \}$.

Facts: Assume that V is a vector space over F. 

1. $W \subseteq V$ is a subspace of $V$ if and only if $W \ne \emptyset$ and for all $a, b \in F$ and $u, v \in W$,
$au + bv \in W$.

2. $W \subseteq V$ is a subspace of $V$ if and only if $W \ne \emptyset$ and for all $a \in F$ and $u, v \in W$,
$u + v \in W$ and $au \in W$.

3. Every subspace of V contains 0, the zero vector. 

4. The sets {0} and V are subspaces of V. 

5. The intersection of any collection of subspaces of V is a subspace of V. 

6. The sum of any collection of subspaces of V is a subspace of V. 

7. Each member of $U \oplus W$ can be expressed as a sum $u + w$ for a unique $u \in U$ and a
unique $w \in W$.

8. The set of solutions to a homogeneous linear equation in the unknowns $x_1, x_2, \ldots, x_n$
is a subspace of $F^n$. Namely, for any fixed $(a_1, a_2, \ldots, a_n) \in F^n$, the set
$W = \{ x \in F^n \mid a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0 \}$ is a subspace of $F^n$.

9. The set of solutions to any collection of homogeneous linear equations in the unknowns
$x_1, x_2, \ldots, x_n$ is a subspace of $F^n$. In particular, if $W$ is a subspace of $F^n$ then
the set of all $x = (x_1, x_2, \ldots, x_n) \in F^n$ satisfying $a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = 0$ for all
$(a_1, a_2, \ldots, a_n) \in W$ is a subspace of $F^n$ called the orthogonal complement of $W$ and
denoted $W^\perp$.

10. The null space $NS(A)$ of an $m \times n$ matrix $A$ over $F$ is a subspace of $F^{n \times 1}$.

11. The left null space $LNS(A)$ of an $m \times n$ matrix $A$ is a subspace of $F^{1 \times m}$ and
equals $(NS(A^T))^T$ where $T$ denotes transpose.
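
Facts 1, 10, and 11 can be checked by exhaustive enumeration when the field is small. The Python sketch below is an illustration under stated assumptions and is not from the Handbook: it works over $Z_2$, the matrix A is a hypothetical example, the function names are invented for this sketch, and both row and column vectors are stored as plain tuples, so the transpose in Fact 11 appears only as a change of interpretation.

```python
from itertools import product

P = 2  # assumed field Z_2; any prime modulus would work the same way


def mat_vec(A, x):
    """The product Ax, with x read as a column vector."""
    return tuple(sum(a * b for a, b in zip(row, x)) % P for row in A)


def vec_mat(y, A):
    """The product yA, with y read as a row vector."""
    return tuple(sum(y[i] * A[i][j] for i in range(len(A))) % P
                 for j in range(len(A[0])))


def transpose(A):
    return [list(col) for col in zip(*A)]


def null_space(A):
    """All x with Ax = 0, found by trying every vector of length n."""
    n = len(A[0])
    return {x for x in product(range(P), repeat=n) if not any(mat_vec(A, x))}


def left_null_space(A):
    """All y with yA = 0, found by trying every vector of length m."""
    m = len(A)
    return {y for y in product(range(P), repeat=m) if not any(vec_mat(y, A))}


def is_subspace(W):
    """Criterion of Fact 1: W is nonempty and closed under au + bv."""
    return len(W) > 0 and all(
        tuple((a * ui + b * vi) % P for ui, vi in zip(u, v)) in W
        for u in W for v in W for a in range(P) for b in range(P))


if __name__ == "__main__":
    A = [[1, 1, 0],
         [0, 1, 1]]  # hypothetical 2 x 3 matrix over Z_2
    print(sorted(null_space(A)))            # [(0, 0, 0), (1, 1, 1)]
    print(is_subspace(null_space(A)))       # True
    print(left_null_space(A) == null_space(transpose(A)))  # True
```

Enumeration examines all $2^n$ vectors, so it is only a check for small cases; Gaussian elimination is the usual way to compute a null space.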
Examples:

1. The set of all 3-tuples of real numbers of the form $(a, b, 2a + 3b)$ where $a, b \in \mathcal{R}$ is a
subspace of $\mathcal{R}^3$. This subspace can also be described as the set of solutions $(x, y, z)$ to
the homogeneous linear equation $2x + 3y - z = 0$.

2. The set of all 4-tuples of real numbers of the form $(a, -a, 0, b)$ where $a, b \in \mathcal{R}$ is a
subspace of $\mathcal{R}^4$. This subspace can also be described as the set of solutions $(x_1, x_2, x_3, x_4)$
to the pair of equations $x_1 + x_2 = 0$ and $x_3 = 0$.

3. For $V = Z_5^2$, the set of all solutions to the equation $x + 2y = 0$ forms a subspace. It
consists of the finite set {(0, 0), (3, 1), (1, 2), (4, 3), (2, 4)} and can also be described as
the set of all pairs in $V$ of the form $(3a, a)$. The set $S$ of solutions to $x + 2y = 1$, namely
{(1, 0), (4, 1), (2, 2), (0, 3), (3, 4)}, is not a subspace of $V$ since for example (1, 0) + (4, 1) =
(0, 1) $\notin S$. However $S$ is a "line" in the affine plane described in Example 5 of §6.1.1.

4. In the vector space $V = Z_2^7$, the set of 7-tuples with an even number of 1s is a
subspace. This subspace can also be described as the collection of all members of $V$
whose components sum to 0.

5. Coding theory: In the vector space $F^n$ over the finite field $F = GF(q)$, a linear code
(§14.2) is simply any subspace of $F^n$. In particular, an $(n, k)$ code is a $k$-dimensional
subspace of $F^n$.

6. Binary codes: A linear binary code is any subspace of the vector space $F^n$ where $F$
is the finite field on two elements, $GF(2)$. Generalizing Example 4, the set of all binary
$n$-tuples with an even number of 1s is a subspace of $F^n$ and so is a linear binary code.

7. Consider the undirected graph (§8.1) in the following figure, where the edges have
been labeled with the integers {1, 2, ..., 7}. Associate with this graph the vector space
$V = Z_2^7$ where, as in Example 4 (§6.1.1), each binary 7-tuple is identified with a subset
of edges. One subspace $W$ of $V$, called the cycle space of the graph, corresponds to the
(edge-disjoint) union of cycles in the graph. For example, $(1, 1, 0, 1, 0, 1, 1) \in W$ as it
corresponds to the cycle 1, 2, 6, 7, 4, and so is $(1, 1, 1, 0, 1, 1, 1)$ which corresponds to the
edge-disjoint union of cycles 1, 2, 3 and 5, 6, 7. The sum of these two members of $W$ is
$(0, 0, 1, 1, 1, 0, 0)$ which corresponds to the cycle 3, 4, 5.


[Figure: an undirected graph whose edges are labeled 1, 2, ..., 7.]



8. The set of $n \times n$ symmetric matrices (§6.3.1) over a field $F$ is a subspace of $F^{n \times n}$,
and so is the set of $n \times n$ upper triangular matrices (§6.3.1) over $F$.

9. For an $m \times m$ matrix $A$ over $F$ and $\lambda \in F$, the set $W = \{ X \in F^{m \times n} \mid AX = \lambda X \}$ is
a subspace of $F^{m \times n}$. (This space is related to the eigenspaces of $A$ discussed in §6.5.2.)

10. For a given $n \times n$ matrix $A$ over $F$, the set $W = \{ X \in F^{n \times n} \mid XA = AX \}$ is a
subspace of $F^{n \times n}$. (This is the space of matrices that commute with $A$.)

11. Let field $E$ be a vector space over subfield $F$, and let $K$ denote the set of all
elements $\alpha \in E$ that satisfy a polynomial equation of the form $f(\alpha) = 0$ for some
nonzero $f(x) \in F[x]$. Then $K$ is a subfield of $E$ containing $F$ (the field of algebraic
elements of $E$ over $F$) and consequently is a subspace of $E$ over $F$. (See §5.6.2.)

12. For each fixed $n > 1$, the set of all polynomials of degree $< n$ is a subspace of $F[x]$.
(See §6.1.1 Example 9.)
13. In §6.1.1 Example 10, take $X = [a, b]$ where $a, b \in \mathcal{R}$ with $a < b$, and take $U = \mathcal{R}$
as a vector space over itself. The resulting $V$, the set of all real-valued functions on
$[a, b]$, is a vector space. The set $C[a, b]$ of continuous real-valued functions on $[a, b]$ is a
subspace of $V$.

14. In §6.1.1 Example 10, take $X = \{1, 2, \ldots, 7\}$ and take $U = Z_2$ as a vector space
over itself. The resulting $V$, the set of all functions from $\{1, 2, \ldots, 7\}$ to $Z_2$, can be
thought of as the vector space of binary 7-tuples $V = Z_2^7$.

15. In §6.1.1 Example 10, take both $X$ and $U$ to be vector spaces over $F$. Then $V$ is
the vector space of all functions from $X$ to $U$. The collection of those $T \in V$ satisfying
$T(a\alpha + b\beta) = aT(\alpha) + bT(\beta)$ for all $a, b \in F$ and $\alpha, \beta \in X$ is a subspace of $V$. (This
space is the space of linear transformations considered in §6.2.)
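
The finite cases in Examples 3 and 7 are small enough to verify by direct computation. The Python sketch below is illustrative only and is not from the Handbook; the helper names add, scale, and closed are hypothetical. It rebuilds the solution sets of Example 3 over $Z_5$, tests closure under addition and scalar multiplication, and adds the two cycle-space vectors of Example 7 over $Z_2$.

```python
from itertools import product


def add(u, v, p=5):
    """Componentwise addition modulo p."""
    return tuple((a + b) % p for a, b in zip(u, v))


def scale(a, v, p=5):
    """Componentwise scalar multiplication modulo p."""
    return tuple((a * c) % p for c in v)


def closed(T, p=5):
    """True if T is closed under addition and under multiplication by every scalar."""
    return (all(add(u, v, p) in T for u in T for v in T)
            and all(scale(a, v, p) in T for a in range(p) for v in T))


# Example 3: solutions of x + 2y = 0 and of x + 2y = 1 in Z_5^2.
V = list(product(range(5), repeat=2))
W = {(x, y) for x, y in V if (x + 2 * y) % 5 == 0}
S = {(x, y) for x, y in V if (x + 2 * y) % 5 == 1}

# Example 7: two members of the cycle space, as binary 7-tuples.
c1 = (1, 1, 0, 1, 0, 1, 1)   # cycle 1, 2, 6, 7, 4
c2 = (1, 1, 1, 0, 1, 1, 1)   # union of cycles 1, 2, 3 and 5, 6, 7

if __name__ == "__main__":
    print(sorted(W))                 # [(0, 0), (1, 2), (2, 4), (3, 1), (4, 3)]
    print(closed(W), closed(S))      # True False
    print(add((1, 0), (4, 1)))       # (0, 1), which is not in S
    print(add(c1, c2, p=2))          # (0, 0, 1, 1, 1, 0, 0)
```

Since both sets are nonempty, the closure test corresponds to the criterion of Fact 2; the failure for S reflects that a line not through the origin is not a subspace.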


6.1.3 LINEAR COMBINATIONS, INDEPENDENCE, BASIS, AND DIMENSION
Definitions: 

If $v_1, v_2, \ldots, v_t$ are vectors from a vector space $V$ over $F$, then a vector $w \in V$ is a
linear combination of $v_1, v_2, \ldots, v_t$ if $w = a_1 v_1 + a_2 v_2 + \cdots + a_t v_t$ for some scalars
$a_i \in F$. The zero vector is considered a linear combination of the empty set.

For $S \subseteq V$, the span of $S$, denoted Span($S$), is the set of all (finite) linear combinations
of members of $S$; that is, Span($S$) consists of all finite sums $a_1 v_1 + a_2 v_2 + \cdots + a_t v_t$
where $v_i \in S$ and $a_i \in F$. (The span of the empty set is taken to be $\{0\}$.) Span($S$) is
also called the space generated or spanned by $S$. (See Fact 1.)

The row space $RS(A)$ of an $m \times n$ matrix $A$ over $F$ (§6.3.1) is Span($R_1, R_2, \ldots, R_m$),
where $R_1, R_2, \ldots, R_m$ are the rows of $A$ viewed as vectors in $F^{1 \times n}$.

The column space $CS(A)$ of $A$ is Span($C_1, C_2, \ldots, C_n$), where $C_1, C_2, \ldots, C_n$ are the
columns of $A$.

A subset $S \subseteq V$ is called a spanning set for $V$ if Span($S$) $= V$.

A subset $S \subseteq V$ is (linearly) independent if every finite subset $\{v_1, v_2, \ldots, v_t\}$ of $S$
has the property that the only scalars $a_1, a_2, \ldots, a_t$ satisfying $a_1 v_1 + a_2 v_2 + \cdots + a_t v_t = 0$
are $a_1 = a_2 = \cdots = a_t = 0$.

A subset $S \subseteq V$ is (linearly) dependent if it is not independent.

A basis for V is an independent spanning set. 

A vector space $V$ is finite dimensional if it has a finite basis; otherwise, $V$ is infinite
dimensional.

The dimension, dim V, of a vector space V is the cardinality of any basis for V. (See 
Fact 8.) 

If $B = (v_1, v_2, \ldots, v_n)$ is an ordered basis for $V$, then the coordinates of $v$ with respect
to $B$ are the scalars $a_1, a_2, \ldots, a_n$ such that $v = a_1 v_1 + a_2 v_2 + \cdots + a_n v_n$. (See Fact 14.)
The coordinate vector $[v]_B$ of $v$ with respect to $B$ (written as a column) is $[v]_B = (a_1, a_2, \ldots, a_n)^T$
where $T$ denotes transpose (§6.3.1).

Note: Some writers distinguish between the coordinates written as a row and as a
column, calling the row $(a_1, a_2, \ldots, a_n)$ the coordinate vector of $v$ with respect to $B$ and
the column $(a_1, a_2, \ldots, a_n)^T$ the coordinate matrix of $v$ with respect to $B$.
The row rank of a matrix $A$ over $F$ is $\dim RS(A)$, and the column rank of $A$ is
$\dim CS(A)$. The rank of $A$ is the size of the largest square submatrix of $A$ with
nonzero determinant (§6.3.4); that is, rank $A = r$ if there exists an $r \times r$ submatrix
of $A$ whose determinant is nonzero, and every $t \times t$ submatrix of $A$ with $t > r$ has zero
determinant.

The nullity of a matrix A is dim NS (A). 

Two vector spaces $V$ and $U$ over the same field $F$ are isomorphic if there exists a
bijective mapping $T: V \to U$ such that $T(v + w) = T(v) + T(w)$ and $T(av) = aT(v)$ for
all $v, w \in V$ and $a \in F$. The mapping $T$ is called an isomorphism.

Facts: 

1. Span($S$) is a subspace of $V$. In particular, $RS(A)$ is a subspace of $F^{1 \times n}$ and $CS(A)$
is a subspace of $F^{m \times 1}$.

2. Span($S$) is the intersection of all subspaces of $V$ that contain $S$; thus, Span($S$)
is the smallest subspace of $V$ containing $S$ in that it lies inside every subspace of $V$
containing $S$.

3. A set $\{v\}$ consisting of a single vector from $V$ is dependent if and only if $v = 0$.

4. A set of two or more vectors is dependent if and only if some vector in the set is a 
linear combination of the remaining vectors in the set. 

5. Any superset of a dependent set is dependent, and any subset of an independent set 
is independent. (The empty set is independent.) 

6. If V has a basis of n elements, then every subset of V with more than n elements is 
dependent. 

7. If W is a subspace of V, then $\dim W \le \dim V$.

8. Every vector space V has a basis, and every two bases for V have the same number 
of elements (cardinality). For infinite-dimensional vector spaces, this fact relies on the 
axiom of choice (§1.2.4). 

9. Every independent subset of V can be extended to a basis for V. More generally, if S
is an independent set, then every maximal independent set containing S is a basis for
V containing S. For infinite-dimensional vector spaces, this fact relies on the axiom of
choice. (An independent set is maximal if every set properly containing it is dependent.)

10. Every spanning set contains a basis for V. More generally, if S is a spanning set, 
then every minimal spanning subset of S is a basis for V. For infinite-dimensional vector 
spaces, this fact relies on the axiom of choice. (A spanning set is minimal if it contains 
no proper subset that spans V.) 

11. Rank-nullity theorem: If A is an $m \times n$ matrix over F, then:

• dim RS(A) + dim NS(A) = n;

• dim CS(A) + dim NS(A) = n;

• dim RS(A) + dim LNS(A) = m;

• dim CS(A) + dim LNS(A) = m.

12. For every matrix A, row rank A = column rank A = rank A. Thus, the (maxi- 
mum) number of independent rows of A equals the (maximum) number of independent 
columns. 

13. The set of solutions to the m homogeneous linear equations $\sum_{j=1}^{n} a_{ij} x_j = 0$ in n
unknowns has dimension n - r, where r is the rank of the $m \times n$ coefficient matrix
$A = (a_{ij})$.
14. If $\mathcal{B}$ is a basis for a vector space V (finite or infinite), then each $v \in V$ can
be expressed as $v = a_1 v_1 + a_2 v_2 + \cdots + a_t v_t$, where $a_i \in F$ and $v_i \in \mathcal{B}$. If
$v = b_1 v_1 + b_2 v_2 + \cdots + b_t v_t$ is another expression for v in terms of elements of $\mathcal{B}$ (where
possibly some zero coefficients have been inserted to make the two expressions have
equal length), then $a_i = b_i$ for $i = 1, 2, \ldots, t$. (If $\mathcal{B}$ is finite, this justifies the definition
of the coordinate vector $[v]_{\mathcal{B}}$.)

15. If $\mathcal{B} = (v_1, v_2, \ldots, v_n)$ is an ordered basis for V, then the function $T: V \to F^{n \times 1}$
defined by $T(v) = [v]_{\mathcal{B}}$ is an isomorphism, so V is isomorphic to $F^{n \times 1}$.

16. Two vector spaces over F are isomorphic if and only if they have the same dimension.
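
Facts 11 and 12 can be checked numerically on a small matrix. The following is a minimal sketch, assuming exact rational arithmetic and Gaussian elimination; the helper names rank and transpose and the sample matrix A are illustrative choices, not part of the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for col in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def transpose(rows):
    return [list(col) for col in zip(*rows)]

# Sample 3 x 4 matrix (arbitrary): the second row is twice the first.
A = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [1, 0, 1, 0]]

row_rank = rank(A)                # dim RS(A)
col_rank = rank(transpose(A))     # dim CS(A), computed as the row rank of the transpose
nullity = len(A[0]) - row_rank    # dim NS(A) as given by the rank-nullity theorem
```

Here the row rank and the column rank are computed independently and agree, as Fact 12 asserts; the nullity is then read off from Fact 11 rather than computed separately.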


Examples: 

1. The vector space $F^n$ has dimension n. The standard basis is the ordered basis
$(e_1, e_2, \ldots, e_n)$ where $e_i$ is the vector with 1 in position i and 0s elsewhere. (The spaces
$F^n$, $F^{1 \times n}$, and $F^{n \times 1}$ are isomorphic and are often identified and used interchangeably.)

2. The vector space $F^{m \times n}$ of $m \times n$ matrices over F has dimension mn; the standard
basis is $\{ E_{ij} \mid 1 \le i \le m,\ 1 \le j \le n \}$ where $E_{ij}$ is the $m \times n$ matrix with a 1 in
position (i, j) and 0s elsewhere. It is isomorphic to $F^{mn}$.

3. The subspace of $\mathcal{R}^3$ containing all 3-tuples of the form (a, b, 2a + 3b) has dimension 2.
One basis for this subspace is $\mathcal{B}_1 = ((1, 0, 2), (0, 1, 3))$ and another is
$\mathcal{B}_2 = ((1, 1, 5), (1, -1, -1))$. The vector w = (5, -1, 7) is in the subspace since
w = 5(1, 0, 2) + (-1)(0, 1, 3) = 2(1, 1, 5) + 3(1, -1, -1). The coordinate vector of w with respect to $\mathcal{B}_1$
is $(5, -1)^T$ and the coordinate vector of w with respect to $\mathcal{B}_2$ is $(2, 3)^T$.

4. If W is the subspace of $V = Z_2^5$ containing all members of V whose components sum
to 0, then W has dimension 4. In fact $W = \{ (a, b, c, d, a + b + c + d) \mid a, b, c, d \in Z_2 \}$.
One ordered basis for this space is ((1, 0, 0, 0, 1), (0, 1, 0, 0, 1), (0, 0, 1, 0, 1), (0, 0, 0, 1, 1)).

5. Binary codes: More generally, consider the set of all binary n-tuples with an even
number of 1s; this is the linear binary code mentioned in Example 6, §6.1.2. These
vectors form a subspace W of $V = Z_2^n$ of dimension n - 1. A basis for W consists of the
following n - 1 vectors, each of which has exactly two 1s: $(1, 0, \ldots, 1), (0, 1, \ldots, 1), \ldots, (0, 0, \ldots, 1, 1)$.
Consequently there are $2^{n-1}$ vectors in the code W.

6. The field $\mathcal{C}$ of complex numbers is two-dimensional as a vector space over $\mathcal{R}$; it has
the ordered basis (1, i), where $i = \sqrt{-1}$. Any two complex numbers, neither of which is
a real multiple of the other, form a basis.

7. Both $\mathcal{C}$ and $\mathcal{R}$ are infinite-dimensional vector spaces over the rational field $\mathcal{Q}$.

8. The vector space F[x] is an infinite-dimensional space over F; $(1, x, x^2, x^3, \ldots)$ is
an ordered basis. The subspace of all polynomials of degree $\le n$ has dimension n + 1;
$(1, x, x^2, \ldots, x^n)$ is an ordered basis.
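
The coordinate vectors in Example 3 can be recovered by solving a linear system whose columns are the basis vectors. The sketch below is one way to do this with exact arithmetic; the function name coordinates is a hypothetical helper introduced for illustration.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Coordinates of v with respect to the ordered basis (a list of tuples),
    or None if v does not lie in the span of the basis vectors."""
    n, k = len(v), len(basis)
    # augmented matrix: the columns are the basis vectors, the last column is v
    m = [[Fraction(b[i]) for b in basis] + [Fraction(v[i])] for i in range(n)]
    row = 0
    for col in range(k):
        pivot = next((i for i in range(row, n) if m[i][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors are not linearly independent")
        m[row], m[pivot] = m[pivot], m[row]
        m[row] = [x / m[row][col] for x in m[row]]
        for i in range(n):
            if i != row and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[row])]
        row += 1
    # the remaining rows must read 0 = 0 for v to lie in the span
    if any(m[i][k] != 0 for i in range(k, n)):
        return None
    return [m[i][k] for i in range(k)]

B1 = [(1, 0, 2), (0, 1, 3)]
B2 = [(1, 1, 5), (1, -1, -1)]
w = (5, -1, 7)

coords_B1 = coordinates(B1, w)   # coordinates of w relative to B1
coords_B2 = coordinates(B2, w)   # coordinates of w relative to B2
```

By Fact 14 the coordinates are unique, so the reduced system has exactly one solution whenever w lies in the subspace.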


6.1.4 INNER PRODUCTS, LENGTH, AND ORTHOGONALITY


By imposing additional structure on real and complex vector spaces, the concepts of 
length, distance, and orthogonality can be introduced. These concepts are motivated 
by the corresponding geometric notions for physical vectors. Also, for real vector spaces 
the geometric idea of angle can be formulated analytically. 
Definitions:

An inner product on a vector space V over $\mathcal{R}$ is a function $\langle \cdot, \cdot \rangle: V \times V \to \mathcal{R}$ such
that for all $u, v, w \in V$ and $a, b \in \mathcal{R}$ the following hold:

• $\langle u, v \rangle = \langle v, u \rangle$;

• $\langle u, u \rangle \ge 0$ with equality if and only if u = 0;

• $\langle au + bv, w \rangle = a \langle u, w \rangle + b \langle v, w \rangle$.

An inner product on a vector space V over $\mathcal{C}$ is a function $\langle \cdot, \cdot \rangle: V \times V \to \mathcal{C}$ such that
for all $u, v, w \in V$ and $a, b \in \mathcal{C}$ the following hold:

• $\langle u, v \rangle = \overline{\langle v, u \rangle}$ (where bar denotes complex conjugation);

• $\langle u, u \rangle \ge 0$ with equality if and only if u = 0;

• $\langle au + bv, w \rangle = a \langle u, w \rangle + b \langle v, w \rangle$.

Note: The first property implies that $\langle u, u \rangle$ is real, so the second property makes sense.
An inner product space is a vector space over $\mathcal{R}$ or $\mathcal{C}$ on which an inner product is
defined. Such a space is called a real or complex inner product space, depending on its
scalar field.

The norm (length) of a vector $v \in V$ is $\|v\| = \sqrt{\langle v, v \rangle}$.

A vector $v \in V$ is a unit vector if and only if $\|v\| = 1$.

The distance d(v, w) from v to w is $d(v, w) = \|v - w\|$.

In a real inner product space, the angle between nonzero vectors v and w is the real
number $\theta$, $0 \le \theta \le \pi$, such that $\cos\theta = \dfrac{\langle v, w \rangle}{\|v\| \cdot \|w\|}$.

Two vectors v and w are orthogonal if and only if $\langle v, w \rangle = 0$.

A subset $S \subseteq V$ is an orthogonal set if $\langle v, w \rangle = 0$ for all $v, w \in S$ with $v \ne w$.

A subset $S \subseteq V$ is an orthonormal set if S is an orthogonal set and $\|v\| = 1$ for all
$v \in S$.

If W is a subspace of an inner product space V, then the orthogonal complement of W is
$W^\perp = \{ v \in V \mid \langle v, w \rangle = 0 \text{ for all } w \in W \}$.

Facts: 

1. Standard inner product on $\mathcal{R}^n$: The real-valued function defined by
$\langle x, y \rangle = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$ is an inner product on $V = \mathcal{R}^n$.

2. Standard inner product on $\mathcal{C}^n$: The complex-valued function defined by
$\langle x, y \rangle = x_1 \overline{y_1} + x_2 \overline{y_2} + \cdots + x_n \overline{y_n}$ is an inner product on $V = \mathcal{C}^n$.

3. If A is an $n \times n$ real positive definite matrix (§6.3.2), then the function defined by
$\langle x, y \rangle = x^T A y$ is an inner product on $\mathcal{R}^n$. (Here $x^T$ denotes the transpose of x.)

4. If H is an $n \times n$ complex positive definite matrix (§6.3.2), then the function defined
by $\langle x, y \rangle = y^* H x$ is an inner product on $\mathcal{C}^n$. ($y^*$ is the conjugate-transpose of y.)

5. The function $\langle f, g \rangle = \int_a^b f(x) g(x)\,dx$ is an inner product on the vector space C[a, b]
of continuous real-valued functions on the interval [a, b].

6. The inner product $\langle \cdot, \cdot \rangle$ on an inner product space V is an inner product on any
subspace W of V.

7. If W is a subspace of an inner product space V, then the orthogonal complement $W^\perp$
is a subspace of V and $V = W \oplus W^\perp$.
8. The norm function satisfies the following properties for all scalars a and all vectors
$v, w \in V$:

• $\|v\| \ge 0$ with equality if and only if v = 0;

• $\|av\| = |a| \cdot \|v\|$, where |a| denotes the absolute value of a;

• $|\langle v, w \rangle| \le \|v\| \cdot \|w\|$ (Cauchy-Schwarz inequality);

• $\|v + w\| \le \|v\| + \|w\|$ (triangle inequality);

• if $v \ne 0$, then $\frac{1}{\|v\|} v$ is a unit vector (the normalization of v).

9. The distance function on a vector space V satisfies the following properties for all
$v, w, z \in V$:

• $d(v, w) \ge 0$ with equality if and only if v = w;

• d(v, w) = d(w, v);

• $d(v, z) \le d(v, w) + d(w, z)$ (triangle inequality).

10. For real inner product spaces, two nonzero vectors are orthogonal if and only if the
angle between them is $\theta = \pi/2$.

11. An orthogonal set S of nonzero vectors can be converted to an orthonormal set by
normalizing each vector in S.

12. An orthogonal set of nonzero vectors is independent. An orthonormal set is independent.

13. If V is an n-dimensional inner product space, any orthonormal set contains at
most n vectors, and any orthonormal set of n vectors is a basis for V.

14. Every subspace W of an n-dimensional space V has an orthonormal (orthogonal)
basis.

15. Gram-Schmidt orthogonalization: From any ordered basis $(w_1, w_2, \ldots, w_m)$ for
a subspace W, an orthonormal basis $(u_1, u_2, \ldots, u_m)$ for W can be constructed using
Algorithm 1. (Jørgen Gram, 1850-1916; Erhard Schmidt, 1876-1959)


Algorithm 1: Gram-Schmidt orthogonalization process.

input: an ordered basis $(w_1, w_2, \ldots, w_m)$
output: an orthonormal basis $(u_1, u_2, \ldots, u_m)$

$u_1 := \frac{1}{a_1} w_1$, where $a_1 := \|w_1\|$
for j := 2 to m
    $a_j := \left\| w_j - \sum_{i=1}^{j-1} \langle w_j, u_i \rangle u_i \right\|$
    $u_j := \frac{1}{a_j} \left( w_j - \sum_{i=1}^{j-1} \langle w_j, u_i \rangle u_i \right)$


16. The standard basis is orthonormal with respect to the standard inner product.

17. If $(u_1, u_2, \ldots, u_m)$ is an orthonormal basis for a subspace W of V and $w \in W$, then

$w = \langle w, u_1 \rangle u_1 + \langle w, u_2 \rangle u_2 + \cdots + \langle w, u_m \rangle u_m$.
18. Projection vector: Let W be a subspace of a vector space V and let v be a vector
in V.

• There is a unique vector $p \in W$ nearest to v; that is, the vector p minimizes
$\|v - w\|$ over all $w \in W$. This vector p is called the projection of v onto W,
written $p = \mathrm{proj}_W(v)$.

• If $(u_1, u_2, \ldots, u_m)$ is any orthonormal basis for W, then the projection of v onto W
is given by $\mathrm{proj}_W(v) = \langle v, u_1 \rangle u_1 + \langle v, u_2 \rangle u_2 + \cdots + \langle v, u_m \rangle u_m$.

• The vector $\mathrm{proj}_W(v)$ is the unique vector $w \in W$ such that v - w is orthogonal
to every vector in W.

19. Projection matrix: If $V = \mathcal{R}^n$ is equipped with the standard inner product and
$(u_1, u_2, \ldots, u_m)$ is an orthonormal basis for a subspace W, then the projection of each
$x \in \mathcal{R}^n$ onto W is given by $\mathrm{proj}_W(x) = Ax$, where $A = GG^T$ with $G = (u_1, u_2, \ldots, u_m)$
the $n \times m$ matrix with the $u_i$ as columns.

20. The projection matrix A is symmetric and satisfies $A^2 = A$.
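
The Gram-Schmidt process of Fact 15 and Algorithm 1 can be sketched as follows for the standard inner product on $\mathcal{R}^n$. The function names inner and gram_schmidt, the floating-point arithmetic, and the dependence tolerance 1e-12 are assumptions made for this illustration; the three sample vectors are arbitrary independent vectors in $\mathcal{R}^4$.

```python
import math

def inner(x, y):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(ws):
    """Orthonormalize an ordered list of linearly independent real vectors."""
    us = []
    for w in ws:
        # subtract from w its components <w, u_i> u_i along the u_i found so far
        r = list(w)
        for u in us:
            c = inner(w, u)
            r = [ri - c * ui for ri, ui in zip(r, u)]
        a = math.sqrt(inner(r, r))   # a_j, the norm of the remainder
        if a < 1e-12:                # assumed tolerance for detecting dependence
            raise ValueError("input vectors are linearly dependent")
        us.append([ri / a for ri in r])
    return us

us = gram_schmidt([(1, 1, 1, 1), (3, 1, 3, 1), (3, 1, 1, 1)])
```

In floating-point arithmetic the coefficients are sometimes computed from the running remainder r instead of from w (the "modified" variant), which tends to lose less accuracy; the version above follows Algorithm 1 as stated.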


Examples: 

Consider the vector space $\mathcal{R}^4$ with the standard inner product $\langle x, y \rangle = x^T y$, and let W
be the subspace spanned by the three vectors $w_1 = (1, 1, 1, 1)^T$, $w_2 = (3, 1, 3, 1)^T$,
$w_3 = (3, 1, 1, 1)^T$.

1. $\langle w_1, w_2 \rangle = 8$ and $\|w_1\| = 2$.

2. The angle $\theta$ between $w_1$ and $w_2$ satisfies $\cos\theta = \frac{8}{2\sqrt{20}} = \frac{2}{\sqrt{5}}$ (so $\theta \approx 0.4636$ radians).

3. The distance from $w_1$ to $w_2$ is $d(w_1, w_2) = \|w_1 - w_2\| = \|(-2, 0, -2, 0)^T\| = 2\sqrt{2}$.

4. The orthogonal complement $W^\perp$ of W is the set of vectors of the form (0, a, 0, -a).

5. The Gram-Schmidt process applied to $(w_1, w_2, w_3)$ yields:

$u_1 = \frac{1}{a_1} w_1 = (\frac12, \frac12, \frac12, \frac12)^T$, where $a_1 = \|w_1\| = 2$;

$u_2 = \frac{1}{a_2}(w_2 - \langle w_2, u_1 \rangle u_1) = \frac{1}{a_2}\left((3, 1, 3, 1)^T - 4(\frac12, \frac12, \frac12, \frac12)^T\right)$
$= \frac{1}{a_2}(1, -1, 1, -1)^T = (\frac12, -\frac12, \frac12, -\frac12)^T$, where $a_2 = \|(1, -1, 1, -1)^T\| = 2$;

$u_3 = \frac{1}{a_3}(w_3 - \langle w_3, u_1 \rangle u_1 - \langle w_3, u_2 \rangle u_2)$
$= \frac{1}{a_3}\left((3, 1, 1, 1)^T - 3(\frac12, \frac12, \frac12, \frac12)^T - 1(\frac12, -\frac12, \frac12, -\frac12)^T\right)$
$= \frac{1}{a_3}(1, 0, -1, 0)^T = (\frac{1}{\sqrt2}, 0, -\frac{1}{\sqrt2}, 0)^T$, where $a_3 = \|(1, 0, -1, 0)^T\| = \sqrt{2}$.

6. The vector in W that is nearest to $v = (3, 6, 3, 4)^T$ is
$p = \mathrm{proj}_W(v) = \langle v, u_1 \rangle u_1 + \langle v, u_2 \rangle u_2 + \langle v, u_3 \rangle u_3 = 8u_1 + (-2)u_2 + 0u_3 = (3, 5, 3, 5)^T$.
Further, $v - p = (0, 1, 0, -1)^T$ is orthogonal to every vector in W, and if
$u_4 = (0, \frac{1}{\sqrt2}, 0, -\frac{1}{\sqrt2})^T$ is the normalization of v - p, then $(u_1, u_2, u_3, u_4)$ is an
orthonormal basis for $\mathcal{R}^4$.


7. The projection of any $x \in \mathcal{R}^4$ onto W is given by $\mathrm{proj}_W(x) = Ax$, where

$$A = GG^T = (u_1, u_2, u_3)(u_1, u_2, u_3)^T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \frac12 & 0 & \frac12 \\ 0 & 0 & 1 & 0 \\ 0 & \frac12 & 0 & \frac12 \end{pmatrix}.$$

Thus, if $x = (3, 6, 3, 4)^T$, its projection onto W is computed as $Ax = (3, 5, 3, 5)^T$,
consistent with the answer found in Example 6.
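
The projection matrix of Example 7 and the properties in Fact 20 can be checked directly. This is a minimal sketch in floating-point arithmetic; the helper names matvec and matmul are illustrative, and the orthonormal vectors are those obtained in Example 5.

```python
import math

s = 1 / math.sqrt(2)
U = [[0.5, 0.5, 0.5, 0.5],     # u1
     [0.5, -0.5, 0.5, -0.5],   # u2
     [s, 0.0, -s, 0.0]]        # u3

n = 4
# A = G G^T, where the columns of G are u1, u2, u3:
# entry (i, j) of A is the sum over k of u_k[i] * u_k[j]
A = [[sum(u[i] * u[j] for u in U) for j in range(n)] for i in range(n)]

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

x = [3, 6, 3, 4]
p = matvec(A, x)                                  # projection of x onto W
residual = [xi - pi for xi, pi in zip(x, p)]      # x - p, which lies in the orthogonal complement
A_squared = matmul(A, A)                          # should agree with A (Fact 20)
```

Up to rounding, p is the vector found in Example 6, the residual is orthogonal to each $u_i$, and A is symmetric with $A^2 = A$.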
6.2 LINEAR TRANSFORMATIONS

Linear transformations are special types of functions that map one vector space to
another. They are called “linear” because of their effect on the lines of a vector space,
where by a “line” is meant a set of vectors w of the form w = au + v where $u \ne 0$ and v
are fixed vectors in the space and a varies over all values in the scalar field. Linear
transformations carry lines in one vector space to lines or points in the other.


6.2.1 LINEAR TRANSFORMATIONS, RANGE, AND KERNEL

Definitions:

Let V and W be vector spaces over the same field F. A linear transformation is a
function $T: V \to W$ satisfying $T(au + v) = aT(u) + T(v)$ for all $u, v \in V$ and $a \in F$.

The range $R_T$ of a linear transformation T is $R_T = \{ T(v) \mid v \in V \}$.

The kernel ker T of a linear transformation T is $\ker T = \{ v \in V \mid T(v) = 0 \}$.

The rank of T is the dimension of $R_T$. ($R_T$ is a subspace of W by Fact 5.)

The nullity of T is the dimension of ker T. (ker T is a subspace of V by Fact 5.)

A linear operator on V is a linear transformation from V to V. 

Facts: 

1. For any vector spaces V and W over F, the zero function $Z: V \to W$ defined by
Z(v) = 0 for all $v \in V$ is a linear transformation from V to W.

2. For any vector space V over F, the identity function $I: V \to V$ defined by I(v) = v
for all $v \in V$ is a linear operator on V.

3. The following four statements are equivalent for a function $T: V \to W$:

• T is a linear transformation;

• $T(u + v) = T(u) + T(v)$ and $T(au) = aT(u)$ for all $u, v \in V$ and $a \in F$;

• $T(au + bv) = aT(u) + bT(v)$ for all $u, v \in V$ and $a, b \in F$;

• $T\left(\sum_{i=1}^{t} a_i v_i\right) = \sum_{i=1}^{t} a_i T(v_i)$ for all finite subsets $\{v_1, v_2, \ldots, v_t\} \subseteq V$ and scalars
$a_i \in F$.

4. If $T: V \to W$ is a linear transformation, then:

• T(0) = 0;

• T(-v) = -T(v) for all $v \in V$;

• T(u - v) = T(u) - T(v) for all $u, v \in V$.

5. If $T: V \to W$ is a linear transformation, then $R_T$ is a subspace of W and ker T is a
subspace of V.

6. If $T: V \to W$ is a linear transformation, then the rank of T plus the nullity of T
equals the dimension of its domain: $\dim R_T + \dim(\ker T) = \dim V$.

7. If $T: V \to W$ is a linear transformation and if the vectors $\{v_1, v_2, \ldots, v_n\}$ span V,
then $\{T(v_1), T(v_2), \ldots, T(v_n)\}$ span $R_T$.

8. If $T: V \to W$ is a linear transformation, then T is completely determined by its
action on a basis for V. That is, if $\mathcal{B}$ is a basis for V and f is any function from $\mathcal{B}$ to W,
then there exists a unique linear transformation T such that T(v) = f(v) for all $v \in \mathcal{B}$.
9. A linear transformation $T: V \to W$ is one-to-one if and only if ker T = {0}.

10. A linear transformation $T: V \to W$ is onto if and only if for every basis $\mathcal{B}$ of V,
the set $\{ T(v) \mid v \in \mathcal{B} \}$ spans W.

11. A linear transformation $T: V \to W$ is onto if and only if for some basis $\mathcal{B}$ of V, the
set $\{ T(v) \mid v \in \mathcal{B} \}$ spans W.

12. If $T: V \to W$ is a bijective linear transformation, then its inverse $T^{-1}: W \to V$ is
also a bijective linear transformation.

13. For each fixed $m \times n$ matrix A over F, the function $T: F^{n \times 1} \to F^{m \times 1}$ defined by
T(x) = Ax is a linear transformation.

14. Every linear transformation $T: F^{n \times 1} \to F^{m \times 1}$ has the form T(x) = Ax for some
unique $m \times n$ matrix A over F.

15. The range $R_T$ of the linear transformation T(x) = Ax is equal to the column space
of A, and ker T is equal to the null space of A. (See §6.1.2, §6.1.3.)

16. If T is a linear transformation from V to W and if $T(v_0) = w_0 \in R_T$, then the
solution set S to the equation $T(v) = w_0$ is $S = \{ v_0 + u \mid u \in \ker T \}$.


Examples: 

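1. The function $T: \mathcal{R}^{2 \times 1} \to \mathcal{R}^{2 \times 1}$ given by

$$T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 - 3x_2 \\ -2x_1 + 6x_2 \end{pmatrix}$$

is a linear transformation. It has the form T(x) = Ax, where $A = \begin{pmatrix} 1 & -3 \\ -2 & 6 \end{pmatrix}$. The kernel of T
is $\{ (3a, a)^T \mid a \in \mathcal{R} \}$ and the range of T is $\{ (b, -2b)^T \mid b \in \mathcal{R} \}$.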


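2. For each fixed matrix $A \in F^{n \times n}$ the function $T: F^{n \times n} \to F^{n \times n}$ defined by
T(X) = AX - XA is a linear transformation whose kernel is the set of matrices commuting
with A. Specifically, let n = 2, $F = \mathcal{R}$, and $A = \begin{pmatrix} 1 & -3 \\ -2 & 6 \end{pmatrix}$. Then $\dim \mathcal{R}^{2 \times 2} = 4$,
and by computation

$$T\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 0 & 3 \\ -2 & 0 \end{pmatrix}, \qquad T\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} 2 & -5 \\ 0 & -2 \end{pmatrix}.$$

Thus, $\dim R_T \ge 2$. Since both the identity matrix I and A itself are in ker T, $\dim(\ker T) \ge 2$.
By Fact 6, it follows that $\dim R_T = 2$ and $\dim(\ker T) = 2$. Therefore (I, A) forms a
basis for ker T, and the matrices $\begin{pmatrix} 0 & 3 \\ -2 & 0 \end{pmatrix}$ and $\begin{pmatrix} 2 & -5 \\ 0 & -2 \end{pmatrix}$ are a basis for $R_T$. From
Fact 16, the solutions to $T(X) = \begin{pmatrix} 0 & 3 \\ -2 & 0 \end{pmatrix}$ are precisely the set of matrices of the form

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} + a \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + b \begin{pmatrix} 1 & -3 \\ -2 & 6 \end{pmatrix}$$

with $a, b \in \mathcal{R}$.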


3. The function $E(x_1, x_2, x_3, x_4) = (x_1, x_2, x_3, x_4, x_1 + x_3 + x_4, x_1 + x_2 + x_4, x_1 + x_2 + x_3)$,
where $x_i \in Z_2$, is a linear transformation important in coding theory. It represents an
“encoding” of 4-bit binary vectors into 7-bit binary vectors (“codewords”) before being
sent over a “noisy” channel (§14.2). The kernel of the transformation consists of only
the zero vector 0 = (0, 0, 0, 0), and so the transformation is one-to-one. The collection
of codewords (that is, the range of E) is a 16-member, 4-dimensional subspace of $Z_2^7$
having the special property that any two of its distinct members differ in at least three
components. This means that if, during transmission of a codeword, an error is made in
any single one of its components, then the error can be detected and corrected as there
will be a unique codeword that differs from the received vector in a single component.

4. Continuing with Example 3, the linear transformation $D(z_1, z_2, z_3, z_4, z_5, z_6, z_7) =
(z_1 + z_3 + z_4 + z_5,\ z_1 + z_2 + z_4 + z_6,\ z_1 + z_2 + z_3 + z_7)$ is used in decoding the (binary) received
vector z. This transformation has the special property that its kernel is precisely the
set of codewords defined in Example 3. Thus, if $D(z) \ne 0$, then a transmission error
has been made.

5. For $\mathcal{C}$ as a vector space over $\mathcal{R}$ and any $z_0 \in \mathcal{C}$, the function $T: \mathcal{C} \to \mathcal{C}$ defined by
$T(z) = z_0 z$ is a linear operator; in particular, if $z_0 = \cos\theta + i \sin\theta$, then T is a rotation
by the angle $\theta$. (T is also a linear operator on $\mathcal{C}$ as a vector space over itself.)

6. For any fixed real-valued continuous function g on the interval [a, b], the function T
from the space C[a, b] of continuous functions on [a, b] to the space D[a, b] of continuously
differentiable functions on [a, b] given by $T(f)(x) = \int_a^x g(t) f(t)\,dt$ is a linear
transformation.

7. For the vector space V of functions $p: \mathcal{R} \to \mathcal{R}$ with continuous derivatives of all
orders, the mapping $T: V \to V$ defined by $T(p) = p'' - 3p' + 2p$ (where p' and p'' are the
first and second derivatives of p) is a linear transformation. Its kernel is the solution set
to the homogeneous differential equation $p'' - 3p' + 2p = 0$: namely, $p(x) = Ae^x + Be^{2x}$,
where $A, B \in \mathcal{R}$. Since $T(x^2) = 2 - 6x + 2x^2$, the set of all solutions to $T(p) = 2 - 6x + 2x^2$
is $x^2 + Ae^x + Be^{2x}$ (by Fact 16).

8. If $v_0$ is a fixed vector in a real inner product space V, then $T: V \to \mathcal{R}$ given by
$T(v) = \langle v, v_0 \rangle$ is a linear transformation.

9. For W a subspace of the inner product space V, the projection $\mathrm{proj}_W$ of V onto W
is a linear transformation. (See §6.1.4.)
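
The encoding and decoding maps of Examples 3 and 4 are small enough to check exhaustively. The sketch below is an illustration only: the function names encode, syndrome, distance, and correct are hypothetical, and the nearest-codeword search in correct is a brute-force stand-in for a practical decoder.

```python
from itertools import product

def encode(x):
    """E of Example 3: four message bits to a 7-bit codeword over Z_2."""
    x1, x2, x3, x4 = x
    return (x1, x2, x3, x4,
            (x1 + x3 + x4) % 2,
            (x1 + x2 + x4) % 2,
            (x1 + x2 + x3) % 2)

def syndrome(z):
    """D of Example 4: the zero vector exactly when z is a codeword."""
    z1, z2, z3, z4, z5, z6, z7 = z
    return ((z1 + z3 + z4 + z5) % 2,
            (z1 + z2 + z4 + z6) % 2,
            (z1 + z2 + z3 + z7) % 2)

def distance(u, v):
    """Number of components in which u and v differ."""
    return sum(a != b for a, b in zip(u, v))

codewords = [encode(x) for x in product((0, 1), repeat=4)]
min_distance = min(distance(u, v) for u in codewords for v in codewords if u != v)
kernel_of_D = [z for z in product((0, 1), repeat=7) if syndrome(z) == (0, 0, 0)]

def correct(z):
    """Return the codeword nearest to the received vector z (exhaustive search)."""
    return min(codewords, key=lambda c: distance(c, z))

sent = encode((1, 0, 1, 1))
received = list(sent)
received[2] ^= 1                  # flip one bit to simulate a transmission error
received = tuple(received)
```

Because distinct codewords differ in at least three components, a vector with a single flipped bit has nonzero image under D and is at distance 1 from exactly one codeword, which correct recovers.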


6.2.2 VECTOR SPACES OF LINEAR TRANSFORMATIONS

Definitions:

If S and T are linear transformations from V to W, the sum (addition) of S and T is
the function S + T defined by (S + T)(v) = S(v) + T(v) for all $v \in V$.

If T is a linear transformation from V to W, the scalar product (scalar multiplication)
of $a \in F$ by T is the function aT defined by (aT)(v) = aT(v) for all $v \in V$.

If $T: V \to W$ and $S: W \to U$ are linear transformations, then the product (multiplication,
composition) of S and T is the function $S \circ T$ defined by $(S \circ T)(v) = S(T(v))$.
Note: Some writers use the notation vT to denote the image of v under the transformation T,
in which case $T \circ S$ is used instead of $S \circ T$ to denote the product; that is,
$v(T \circ S) = (vT)S$.

Facts: 

1. The sum of two linear transformations from V to W is a linear transformation from
V to W.

2. The product of a scalar and a linear transformation is a linear transformation.

3. If $T: V \to W$ and $S: W \to U$ are linear transformations, then their product $S \circ T$ is
a linear transformation from V to U.

4. The set of linear transformations from V to W with the operations of addition and
scalar multiplication forms a vector space over F. This vector space is denoted L(V, W).

5. The set L(V, V) of linear operators on V with the operations of addition, scalar
multiplication, and multiplication forms an algebra with identity over F. Namely, L(V, V)
is a vector space over F and is a ring with identity under the addition and multiplication
operations. In addition, $a(S \circ T) = (aS) \circ T = S \circ (aT)$ holds for all scalars $a \in F$ and
all $S, T \in L(V, V)$. The identity mapping is the multiplicative identity of the algebra.

6. If dim V = n and dim W = m, then dim L(V, W) = nm.

Examples: 

1. Consider $L(F^{n \times 1}, F^{m \times 1})$. If T and S are in $L(F^{n \times 1}, F^{m \times 1})$, then T(x) = Ax and
S(x) = Bx for unique $m \times n$ matrices A and B over F. Then (T + S)(x) = (A + B)x,
(aT)(x) = aAx, and in case m = n, $(T \circ S)(x) = ABx$.

2. Let V = C[a, b] be the space of real-valued continuous functions on the interval [a, b],
and let T and S be linear operators defined by $T(f)(x) = \int_a^x e^{-t} f(t)\,dt$ and
$S(f)(x) = \int_a^x e^{t} f(t)\,dt$. Then $(T + S)(f)(x) = \int_a^x (e^{-t} + e^{t}) f(t)\,dt$, $(cT)(f)(x) = \int_a^x c e^{-t} f(t)\,dt$, and
$(T \circ S)(f)(x) = \int_a^x \int_a^t e^{s-t} f(s)\,ds\,dt$.

3. Let V be the real vector space of all functions $p: \mathcal{R} \to \mathcal{R}$ with continuous derivatives
of all orders, and let D be the derivative function. Then $D: V \to V$ is a linear operator
on V and so is a function such as $T = D^2 - 3D + 2I$ where $D^2 = D \circ D$ and I is the
identity operator on V. The action of T on $p \in V$ is given by $T(p) = p'' - 3p' + 2p$.
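
Example 1 says that adding and composing transformations of the form x -> Ax corresponds to adding and multiplying their matrices. A minimal sketch with two arbitrarily chosen 2 x 2 integer matrices follows; the helper names and the sample data are assumptions for illustration.

```python
def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

def matadd(M, N):
    return [[a + b for a, b in zip(rm, rn)] for rm, rn in zip(M, N)]

A = [[1, 2], [0, 1]]      # matrix of T (sample values)
B = [[0, 1], [1, 1]]      # matrix of S (sample values)

def T(x):
    return matvec(A, x)

def S(x):
    return matvec(B, x)

x = [3, -2]
composed = T(S(x))                          # (T o S)(x)
via_product = matvec(matmul(A, B), x)       # ABx
summed = [t + s for t, s in zip(T(x), S(x))]  # (T + S)(x)
via_sum = matvec(matadd(A, B), x)           # (A + B)x
```

Note the order: applying S first and then T corresponds to the product AB, not BA.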


6.2.3 MATRICES OF LINEAR TRANSFORMATIONS

Definitions:

If $T: V \to W$ is a linear transformation where dim V = n, dim W = m, and if
$\mathcal{B} = (v_1, v_2, \ldots, v_n)$ and $\mathcal{B}' = (v'_1, v'_2, \ldots, v'_m)$ are ordered bases for V and W, respectively,
then the matrix of T with respect to $\mathcal{B}$ and $\mathcal{B}'$ is the $m \times n$ matrix $[T]_{\mathcal{B},\mathcal{B}'}$ whose
jth column is $[T(v_j)]_{\mathcal{B}'}$, the coordinate vector (§6.1.3) of $T(v_j)$ with respect to $\mathcal{B}'$.

If $T: V \to V$ is a linear operator on V, then the matrix of T with respect to $\mathcal{B}$ is
the $n \times n$ matrix $[T]_{\mathcal{B},\mathcal{B}}$, denoted simply as $[T]_{\mathcal{B}}$.

Facts: 

Assume that T and S are linear transformations from V to W, $\mathcal{B}$ and $\mathcal{B}'$ are respective
bases for V and W, and A and B are the matrices defined by $A = [T]_{\mathcal{B},\mathcal{B}'}$ and $B = [S]_{\mathcal{B},\mathcal{B}'}$.

1. $[T(v)]_{\mathcal{B}'} = [T]_{\mathcal{B},\mathcal{B}'} [v]_{\mathcal{B}}$ for all $v \in V$; that is, if $y = [T(v)]_{\mathcal{B}'}$ and $x = [v]_{\mathcal{B}}$, then
y = Ax.

2. $\ker T = \{ x_1 v_1 + x_2 v_2 + \cdots + x_n v_n \mid (x_1, x_2, \ldots, x_n)^T \in NS(A) \}$, where
$\mathcal{B} = (v_1, v_2, \ldots, v_n)$.

3. T is one-to-one if and only if NS(A) = {0}.

4. $R_T = \{ y_1 v'_1 + y_2 v'_2 + \cdots + y_m v'_m \mid (y_1, y_2, \ldots, y_m)^T \in CS(A) \}$, where
$\mathcal{B}' = (v'_1, v'_2, \ldots, v'_m)$.

5. T is onto if and only if $CS(A) = F^{m \times 1}$.

6. T is bijective if and only if m = n and A is invertible. In this case, $[T^{-1}]_{\mathcal{B}',\mathcal{B}} = A^{-1}$.
7. $[T + S]_{\mathcal{B},\mathcal{B}'} = A + B$, $[aT]_{\mathcal{B},\mathcal{B}'} = aA$ for all $a \in F$, and the mapping f from L(V, W)
to $F^{m \times n}$ defined by $f(T) = [T]_{\mathcal{B},\mathcal{B}'}$ is an isomorphism.

8. If U is a vector space over F, $\mathcal{B}''$ is a basis for U, and $R: W \to U$ is a linear
transformation, then $[R \circ T]_{\mathcal{B},\mathcal{B}''} = CA$ where $C = [R]_{\mathcal{B}',\mathcal{B}''}$; that is,
$[R \circ T]_{\mathcal{B},\mathcal{B}''} = [R]_{\mathcal{B}',\mathcal{B}''} [T]_{\mathcal{B},\mathcal{B}'}$.

9. The algebra L(V, V) is isomorphic to the matrix algebra $F^{n \times n}$.

10. If $I: V \to V$ is the identity mapping, then $[I]_{\mathcal{B},\mathcal{B}} = [I]_{\mathcal{B}}$ equals the identity matrix
for any basis $\mathcal{B}$.

11. If A is an $m \times n$ matrix over F with $\mathcal{B}$ and $\mathcal{B}'$ being arbitrary bases for V and W,
respectively, then there exists a unique linear transformation $T: V \to W$ such that
$A = [T]_{\mathcal{B},\mathcal{B}'}$.

12. Linear transformations are used extensively in computer graphics. (See Example 5.)
Further information can be found in [PoGe89].
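
The definition of the matrix of a linear operator (column j holds the coordinates of the image of the jth basis vector) can be sketched for the operator T(X) = AX - XA of §6.2.1 Example 2, using the standard basis of 2 x 2 matrices. The helper names and the sample matrix X used to check Fact 1 are illustrative assumptions.

```python
A = [[1, -3],
     [-2, 6]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def T(X):
    """The operator T(X) = AX - XA on 2 x 2 matrices."""
    AX, XA = matmul(A, X), matmul(X, A)
    return [[AX[i][j] - XA[i][j] for j in range(2)] for i in range(2)]

def coords(X):
    """Coordinate vector of X relative to the ordered basis (E11, E12, E21, E22)."""
    return [X[0][0], X[0][1], X[1][0], X[1][1]]

basis = [[[1, 0], [0, 0]],   # E11
         [[0, 1], [0, 0]],   # E12
         [[0, 0], [1, 0]],   # E21
         [[0, 0], [0, 1]]]   # E22

columns = [coords(T(E)) for E in basis]                       # column j is the coordinate vector of T(E_j)
T_B = [[columns[j][i] for j in range(4)] for i in range(4)]   # assemble the 4 x 4 matrix of T

# Fact 1 on a sample matrix X (arbitrary): the coordinates of T(X) equal T_B times the coordinates of X
X = [[2, 7], [-1, 4]]
lhs = coords(T(X))
rhs = [sum(T_B[i][j] * coords(X)[j] for j in range(4)) for i in range(4)]
```

The identity matrix and A itself commute with A, so their coordinate vectors lie in the null space of the resulting 4 x 4 matrix.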


Examples: 

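1. Consider $T: \mathcal{R}^{2 \times 1} \to \mathcal{R}^{2 \times 1}$ given by
$T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 - 3x_2 \\ -2x_1 + 6x_2 \end{pmatrix}$ and the bases
$\mathcal{B} = (v_1, v_2)$ and $\mathcal{B}' = (v'_1, v'_2)$, where $v_1 = (1, 0)^T$, $v_2 = (0, 1)^T$ and $v'_1 = (1, 1)^T$,
$v'_2 = (2, 1)^T$. Since

$T(v_1) = (1, -2)^T = (-5)v'_1 + 3v'_2$,

$T(v_2) = (-3, 6)^T = 15v'_1 + (-9)v'_2$,

it follows that $[T(v_1)]_{\mathcal{B}'} = (-5, 3)^T$ and $[T(v_2)]_{\mathcal{B}'} = (15, -9)^T$; hence, the matrix of T
relative to $\mathcal{B}$ and $\mathcal{B}'$ is $[T]_{\mathcal{B},\mathcal{B}'} = \begin{pmatrix} -5 & 15 \\ 3 & -9 \end{pmatrix}$. Similarly,
$[T]_{\mathcal{B},\mathcal{B}} = [T]_{\mathcal{B}} = \begin{pmatrix} 1 & -3 \\ -2 & 6 \end{pmatrix}$
and $[T]_{\mathcal{B}',\mathcal{B}'} = [T]_{\mathcal{B}'} = \begin{pmatrix} 10 & 5 \\ -6 & -3 \end{pmatrix}$.

2. Consider T of Example 1 where $A = [T]_{\mathcal{B},\mathcal{B}'} = \begin{pmatrix} -5 & 15 \\ 3 & -9 \end{pmatrix}$. Since $NS(A) =
\{ (3a, a)^T \mid a \in \mathcal{R} \}$ and $CS(A) = \{ (-5b, 3b)^T \mid b \in \mathcal{R} \}$, Fact 2 gives
$\ker T = \{ 3a v_1 + a v_2 = (3a, a)^T \mid a \in \mathcal{R} \}$ and Fact 4 gives $R_T = \{ (-5b)v'_1 + 3b v'_2 = (b, -2b)^T \mid b \in \mathcal{R} \}$.
T is not one-to-one since $NS(A) \ne \{0\}$ and is not onto since $CS(A) \ne \mathcal{R}^{2 \times 1}$. (Any one
of the three matrices found in Example 1 could have been used to determine ker T and
$R_T$ and to reach these same conclusions.)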

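3. Consider the linear operator on $\mathcal{R}^{2 \times 2}$ defined by T(X) = AX - XA where
$A = \begin{pmatrix} 1 & -3 \\ -2 & 6 \end{pmatrix}$, and let $\mathcal{B} = (E_{11}, E_{12}, E_{21}, E_{22})$ be the standard basis. (Here, $E_{ij}$ has a 1
in position (i, j) and 0s elsewhere.) Then

$$T(E_{11}) = AE_{11} - E_{11}A = \begin{pmatrix} 0 & 3 \\ -2 & 0 \end{pmatrix} = 0E_{11} + 3E_{12} + (-2)E_{21} + 0E_{22},$$

so $(0, 3, -2, 0)^T$ is the first column of $[T]_{\mathcal{B}}$. Similar calculations yield

$$[T]_{\mathcal{B}} = \begin{pmatrix} 0 & 2 & -3 & 0 \\ 3 & -5 & 0 & -3 \\ -2 & 0 & 5 & 2 \\ 0 & -2 & 3 & 0 \end{pmatrix}.$$

The null space of this $4 \times 4$ matrix is $\{ (5a + b, 3a, 2a, b)^T \mid a, b \in \mathcal{R} \}$, so that those
matrices X commuting with A (that is, in ker T) have the form $X = \begin{pmatrix} 5a + b & 3a \\ 2a & b \end{pmatrix}$.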
4. Consider $\mathcal{C}$ as a vector space over $\mathcal{R}$ and the rotation operator of §6.2.1 Example 5;
namely, $T(z) = z_0 z$ where $z_0 = \cos\theta + i \sin\theta$. If $\mathcal{B}$ is the standard basis, $\mathcal{B} = (1, i)$, then
the matrix of T relative to $\mathcal{B}$ is $[T]_{\mathcal{B}} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$.

5. Computer graphics: The polygon in part (a) of the following figure can be rotated
by applying the transformation T in Example 4 to its vertices (-2, -2), (1, -1), (2, 1),
(-1, 3). The matrix of vertex coordinates is

$$X = \begin{pmatrix} -2 & 1 & 2 & -1 \\ -2 & -1 & 1 & 3 \end{pmatrix}.$$

For a rotation of $\pi/3$, the matrix of T is $A = \begin{pmatrix} \frac12 & -\frac{\sqrt3}{2} \\ \frac{\sqrt3}{2} & \frac12 \end{pmatrix}$, and the coordinates of the
rotated vertices are (to three decimal places)

$$AX = \begin{pmatrix} 0.732 & 1.366 & 0.134 & -3.098 \\ -2.732 & 0.366 & 2.232 & 0.634 \end{pmatrix},$$

giving the rotated polygon shown in part (b) of the following figure. To perform a
“zoom in” operation, the original polygon can be rescaled by 50% by applying the
transformation $S\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 1.5x \\ 1.5y \end{pmatrix}$. Since the matrix for S relative to the standard
basis is $D = \begin{pmatrix} 1.5 & 0 \\ 0 & 1.5 \end{pmatrix}$, the vertex coordinates X are transformed into

$$DX = \begin{pmatrix} -3 & 1.5 & 3 & -1.5 \\ -3 & -1.5 & 1.5 & 4.5 \end{pmatrix};$$

see part (c) of the figure. Reflection through the x-axis would involve the transformation
$R\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ -y \end{pmatrix}$, represented by the diagonal matrix
$C = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. In computer graphics, the vertices of an object are actually given
(x, y, z) coordinates and three-dimensional versions of the above transformations can
be applied to move and reshape the object as well as render the scene when the user’s
viewpoint is changed.
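
The three transformations of Example 5 amount to multiplying the vertex matrix on the left by a 2 x 2 matrix. A minimal sketch follows, assuming the rotation angle $\pi/3$ that reproduces the rotated coordinates listed in the example; the helper name matmul is illustrative.

```python
import math

# columns are the polygon vertices (-2,-2), (1,-1), (2,1), (-1,3)
X = [[-2, 1, 2, -1],
     [-2, -1, 1, 3]]

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

theta = math.pi / 3
A = [[math.cos(theta), -math.sin(theta)],   # rotation by theta
     [math.sin(theta), math.cos(theta)]]
D = [[1.5, 0],                              # rescaling by 50%
     [0, 1.5]]
C = [[1, 0],                                # reflection through the x-axis
     [0, -1]]

rotated = matmul(A, X)
scaled = matmul(D, X)
reflected = matmul(C, X)
```

Composite operations correspond to matrix products; for instance, matmul(D, matmul(A, X)) rotates and then rescales the polygon.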
6.2.4 CHANGE OF BASIS


Definitions:

Let $\mathcal{B} = (v_1, v_2, \ldots, v_n)$ and $\mathcal{B}' = (v'_1, v'_2, \ldots, v'_n)$ be two ordered bases for V, and let I
denote the identity mapping from V to V. The matrix $P = [I]_{\mathcal{B},\mathcal{B}'}$ is the transition
matrix from $\mathcal{B}$ to $\mathcal{B}'$. It is also called the change of basis matrix from basis $\mathcal{B}$ to
basis $\mathcal{B}'$.

If A and B are two $n \times n$ matrices over a field F, then B is similar to A if there exists
an invertible $n \times n$ matrix P over F such that $P^{-1}BP = A$.


Facts: 

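1. The transition matrix $P = [I]_{\mathcal{B},\mathcal{B}'}$ is invertible; its inverse is $P^{-1} = [I]_{\mathcal{B}',\mathcal{B}}$.

2. If $x = [v]_{\mathcal{B}}$ and $y = [v]_{\mathcal{B}'}$, then y = Px where $P = [I]_{\mathcal{B},\mathcal{B}'}$.

3. When $\mathcal{B} = \mathcal{B}'$, the transition matrix $P = [I]_{\mathcal{B},\mathcal{B}} = [I]_{\mathcal{B}}$ is the $n \times n$ identity matrix.

4. If T is a linear operator on V with A and B the matrices of T relative to bases $\mathcal{B}$
and $\mathcal{B}'$, respectively, then B is similar to A. Specifically, $P^{-1}BP = A$ where $P = [I]_{\mathcal{B},\mathcal{B}'}$.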

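5. If A and B are similar $n \times n$ matrices, then A and B represent the same linear
operator T relative to suitably chosen bases. More specifically, suppose $P^{-1}BP = A$,
$\mathcal{B} = (v_1, v_2, \ldots, v_n)$ is any basis for V, and T is the unique linear transformation with
$A = [T]_{\mathcal{B}}$. Then $B = [T]_{\mathcal{B}'}$ where $\mathcal{B}' = (v'_1, v'_2, \ldots, v'_n)$ is the basis for V given by
$v'_j = \sum_{i=1}^{n} q_{ij} v_i$, where $P^{-1} = (q_{ij})$; that is, the jth column of $P^{-1}$ is $[v'_j]_{\mathcal{B}}$ (compare Fact 1).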


Examples: 

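1. Consider the $\mathcal{R}^{2 \times 1}$ bases $\mathcal{B} = (v_1, v_2)$ and $\mathcal{B}' = (v'_1, v'_2)$ where $v_1 = (1, 0)^T$,
$v_2 = (0, 1)^T$ and $v'_1 = (1, 1)^T$, $v'_2 = (2, 1)^T$. Since $v_1 = (-1)v'_1 + v'_2$ and $v_2 = 2v'_1 + (-1)v'_2$,
the transition matrix from $\mathcal{B}$ to $\mathcal{B}'$ is $P = [I]_{\mathcal{B},\mathcal{B}'} = \begin{pmatrix} -1 & 2 \\ 1 & -1 \end{pmatrix}$ and its inverse
$P^{-1} = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}$ is the transition matrix $[I]_{\mathcal{B}',\mathcal{B}}$. If $v = x_1 v_1 + x_2 v_2$ where $x_i \in \mathcal{R}$, then by
Fact 2, $v = y_1 v'_1 + y_2 v'_2$ where $y_1 = (-1)x_1 + 2x_2$ and $y_2 = x_1 + (-1)x_2$.

2. Consider $T: \mathcal{R}^{2 \times 1} \to \mathcal{R}^{2 \times 1}$ given by $T\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 - 3x_2 \\ -2x_1 + 6x_2 \end{pmatrix}$, and the same
bases $\mathcal{B}$ and $\mathcal{B}'$ specified in Example 1. The matrix of T with respect to $\mathcal{B}$ is
$[T]_{\mathcal{B}} = A = \begin{pmatrix} 1 & -3 \\ -2 & 6 \end{pmatrix}$
and the matrix of T with respect to $\mathcal{B}'$ is $[T]_{\mathcal{B}'} = B = \begin{pmatrix} 10 & 5 \\ -6 & -3 \end{pmatrix}$.
Moreover, A and B are similar; indeed, as Fact 4 shows, $A = P^{-1}BP$ where
$P = \begin{pmatrix} -1 & 2 \\ 1 & -1 \end{pmatrix}$ is determined in Example 1.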
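
The change of basis in Examples 1 and 2 can be reproduced with exact arithmetic. In this sketch the matrix Q holds the vectors of the second basis as columns (so Q is the inverse transition matrix), and the helper names matmul and inverse2 are assumptions for illustration.

```python
from fractions import Fraction

def matmul(M, N):
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) for j in range(len(N[0]))]
            for i in range(len(M))]

def inverse2(M):
    """Inverse of an invertible 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[1, -3], [-2, 6]]     # matrix of T relative to the standard basis (v1, v2)
Q = [[1, 2], [1, 1]]       # columns are v1' = (1,1)^T and v2' = (2,1)^T, relative to (v1, v2)
P = inverse2(Q)            # transition matrix from the standard basis to (v1', v2')

B = matmul(P, matmul(A, Q))          # matrix of T relative to (v1', v2')
A_again = matmul(Q, matmul(B, P))    # similarity: P^{-1} B P recovers A

x = [[3], [4]]             # coordinates of a sample vector v relative to (v1, v2)
y = matmul(P, x)           # its coordinates relative to (v1', v2'), by Fact 2
```

The sample vector v = (3, 4) is an arbitrary choice; its new coordinates satisfy $v = y_1 v'_1 + y_2 v'_2$.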






6.3 MATRIX ALGEBRA 

Matrices naturally arise in the analysis of linear systems and in representing discrete 
structures. This section studies important types of matrices, their properties, and meth- 
ods for efficient matrix computation. 
6.3.1 BASIC CONCEPTS AND SPECIAL MATRICES


Definitions:

The $m \times n$ matrix $A = (a_{ij})$ is a rectangular array of mn real or complex numbers $a_{ij}$,
arranged into m rows and n columns.

The ith row of A, denoted A(i, :), is the array $a_{i1}\ a_{i2}\ \cdots\ a_{in}$. The elements in the ith
row can be regarded as a row vector $(a_{i1}, a_{i2}, \ldots, a_{in})$ in $\mathcal{R}^n$ or $\mathcal{C}^n$. The jth column
of A, denoted A(:, j), is the array

$$\begin{matrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{matrix}$$

which can be identified with the column vector $(a_{1j}, a_{2j}, \ldots, a_{mj})^T$ (where the exponent T
indicates the transpose).

A matrix is sparse if it has relatively few nonzero entries.

A submatrix of the matrix A contains the elements occurring in rows $i_1 < i_2 < \cdots < i_k$
and columns $j_1 < j_2 < \cdots < j_r$ of A. A principal submatrix of the matrix A contains
the elements occurring in rows $i_1 < i_2 < \cdots < i_k$ and columns $i_1 < i_2 < \cdots < i_k$ of A.
This principal submatrix has order k and is written $A[i_1, i_2, \ldots, i_k]$.

Two matrices A and B are equal if they are both $m \times n$ matrices with $a_{ij} = b_{ij}$ for all
$i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$.

The transpose of the $m \times n$ matrix $A = (a_{ij})$ is the $n \times m$ matrix $A^T = (b_{ij})$ in which
$b_{ij} = a_{ji}$.

The Hermitian adjoint of the $m \times n$ matrix $A = (a_{ij})$ is the $n \times m$ matrix $A^* = (b_{ij})$
in which $b_{ij}$ is the complex conjugate of $a_{ji}$.

If m = n, the matrix $A = (a_{ij})$ is square with diagonal elements $a_{11}, a_{22}, \ldots, a_{nn}$.
The main diagonal contains the diagonal elements of A. An off-diagonal element is
any $a_{ij}$ with $i \ne j$. The trace of A, tr A, is the sum of the diagonal elements of A.

Table 1 defines special types of square matrices. 

Facts: 

1. Triangular matrices arise in the solution of systems of linear equations (§6.4). 

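2. A tridiagonal matrix can be represented as follows, where the diagonal lines represent the (possibly) nonzero entries.

[Figure: a square array crossed by three parallel diagonal lines marking the main diagonal and the diagonals immediately above and below it.]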



3. Tridiagonal matrices are particular types of sparse matrices. Such matrices arise in 
discretized versions of continuous problems, the solution of difference equations (§3.3, 
§3.4.4), and the solution of eigenvalue problems (§6.5). 
Table 1 Special types of square matrices.

| matrix | definition |
|---|---|
| identity | $I_n = (e_{ij})$ where $e_{ij} = 1$ if $i = j$ and $e_{ij} = 0$ if $i \ne j$ (n x n matrix; each diagonal entry is 1; each off-diagonal entry is 0) |
| diagonal | $D = (d_{ij})$ where $d_{ij} = 0$ if $i \ne j$ (nonzero entries occur only on the main diagonal) |
| lower triangular | $L = (l_{ij})$ where $l_{ij} = 0$ if $j > i$ (nonzero entries occur only on or below the diagonal) |
| upper triangular | $U = (u_{ij})$ where $u_{ij} = 0$ if $j < i$ (nonzero entries occur only on or above the diagonal) |
| unit triangular | triangular matrix with all diagonal entries 1 |
| tridiagonal | $A = (a_{ij})$ where $a_{ij} = 0$ if $|i - j| > 1$ (nonzero entries occur only on or immediately above or below the diagonal) |
| symmetric | real matrix A for which $A = A^T$ |
| skew-symmetric | real matrix A for which $A = -A^T$ |
| Hermitian | complex matrix A for which $A = A^*$ |
| skew-Hermitian | complex matrix A for which $A = -A^*$ |


4. Sparse matrices frequently arise in the solution of large systems of linear equations 
(§6.4), since in many physical models a given variable typically interacts with relatively 
few others. Linear systems derived from sparse matrices require less storage space and 
can be solved more efficiently than those derived from a “dense” matrix. 

5. Forming the transpose of a square matrix corresponds to “reflecting” the matrix 
elements with respect to the main diagonal. 

6. Any skew-symmetric matrix A must have $a_{ii} = 0$ for all i.

7. Any Hermitian matrix A must have $a_{ii}$ real for all i.

8. If A is real then $A^* = A^T$.

9. The columns of the identity matrix $I_n$ are the standard basis vectors for $\mathcal{R}^n$ (§6.1.3).

10. Viewed as a linear transformation (§6.2), the identity matrix represents the identity 
transformation; that is, it leaves all vectors unchanged. 

11. Viewed as linear transformations, diagonal matrices with positive diagonal entries 
leave the directions of the basis vectors unchanged, but alter the relative scale of the 
basis vectors. 

Examples: 

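1. The 2 x 2 and 3 x 3 identity matrices are $I_2 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and $I_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.

2. The matrix $A = \begin{pmatrix} 6 & 0 & 1 \\ 0 & 2 & 4 \\ 1 & 4 & 3 \end{pmatrix}$ is symmetric.

3. The matrix $A = \begin{pmatrix} 1 & 2 - 3i \\ 2 + 3i & -4 \end{pmatrix}$ is Hermitian.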
4. A 2 x 2 diagonal matrix transforms the unit square in $\mathcal{R}^2$ into a rectangle with sides parallel to the coordinate axes. The following figure shows the effect of the diagonal matrix $D = \begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix}$ on certain vectors and on the unit square in $\mathcal{R}^2$. The standard basis vectors $\{(1, 0)^T, (0, 1)^T\}$ have been transformed to $\{(3, 0)^T, (0, 2)^T\}$.

[Figure: the unit square with marked points (1,0), (0,1), (1,1), (0.5,1), (1,0.5), and its image under D, the rectangle with marked points (3,0), (0,2), (3,2), (1.5,2), (3,1).]

5. A 3 x 3 diagonal matrix transforms the unit cube into a rectangular parallelepiped. 

6. The standard basis vectors are all eigenvectors of a diagonal matrix with the corre- 
sponding diagonal elements as their associated eigenvalues (§6.5). 
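The definitions of transpose, Hermitian adjoint, and trace can be checked mechanically. The following is a minimal Python sketch, assuming matrices are stored as lists of rows; the helper names (`transpose`, `hermitian_adjoint`, `is_symmetric`, `is_hermitian`, `trace`) are illustrative and are not part of the Handbook.

```python
def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def hermitian_adjoint(a):
    """Return the conjugate transpose A* of a matrix stored as a list of rows."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]


def is_symmetric(a):
    """A is symmetric when A equals its transpose."""
    return a == transpose(a)


def is_hermitian(a):
    """A is Hermitian when A equals its Hermitian adjoint."""
    as_complex = [[complex(x) for x in row] for row in a]
    return as_complex == hermitian_adjoint(a)


def trace(a):
    """Sum of the diagonal elements of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


sym = [[6, 0, 1], [0, 2, 4], [1, 4, 3]]   # the symmetric matrix of Example 2
herm = [[1, 2 - 3j], [2 + 3j, -4]]        # the Hermitian matrix of Example 3
print(is_symmetric(sym), is_hermitian(herm), trace(sym))
```

Note that the diagonal of `herm` is real, as Fact 7 requires.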


6.3.2 OPERATIONS OF MATRIX ALGEBRA 
Definitions: 

The scalar product (dot product) of real vectors $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ is the number $x \cdot y = \sum_{i=1}^{n} x_i y_i$.

The n x n matrix A is nonsingular (invertible) if there exists an n x n matrix $A^{-1}$ such that $AA^{-1} = A^{-1}A = I$. Any such matrix $A^{-1}$ is an inverse of A.

An orthogonal matrix is a real square matrix A such that $A^T A = I$.

A unitary matrix is a complex square matrix A such that $A^* A = I$, where $A^*$ is the Hermitian adjoint of A (§6.3.1).

A positive definite matrix is a real symmetric (or complex Hermitian) matrix A such that $x^* A x > 0$ for all $x \ne 0$.

The nonnegative powers of a square matrix A are given by $A^0 = I$, $A^n = AA^{n-1}$. If A is nonsingular then $A^{-n} = (A^{-1})^n$.

The following table defines various operations on matrices $A = (a_{ij})$ and $B = (b_{ij})$. (See Facts 1, 2, 5, 6 for restrictions on the sizes of the matrices.)

| operation | definition |
|---|---|
| sum A + B | $A + B = (c_{ij})$ where $c_{ij} = a_{ij} + b_{ij}$ |
| difference A - B | $A - B = (c_{ij})$ where $c_{ij} = a_{ij} - b_{ij}$ |
| scalar multiple $\alpha A$ | $\alpha A = (c_{ij})$ where $c_{ij} = \alpha a_{ij}$ |
| product AB | $AB = (c_{ij})$ where $c_{ij} = \sum_k a_{ik} b_{kj}$ |
Facts:


1. Matrices of different dimensions cannot be added or subtracted.

2. Square matrices of the same dimension can be multiplied.

3. Real or complex matrix addition and multiplication satisfy the following properties:

• commutative: A + B = B + A;

• associative: A + (B + C) = (A + B) + C, A(BC) = (AB)C;

• distributive: A(B + C) = AB + AC, (A + B)C = AC + BC;

• $\alpha(A + B) = \alpha A + \alpha B$, $\alpha(AB) = (\alpha A)B = A(\alpha B)$ for all scalars $\alpha$.

4. Matrix multiplication is not, in general, commutative — even when both products 
are defined. (See Example 3.) 

5. The product AB is defined if and only if the number of columns of A equals the 
number of rows of B. That is, A must be an m x n matrix and B must be an n x p 
matrix. 

6. The ijth element of the product C = AB is the scalar product of row i of A and column j of B: $c_{ij} = A(i, :) \cdot B(:, j) = \sum_k a_{ik} b_{kj}$.

7. Multiplication by identity matrices of the appropriate dimension leaves a matrix unchanged: if A is m x n, then $I_m A = A I_n = A$.

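8. Multiplication by diagonal matrices has the effect of scaling the rows or columns of a matrix. Pre-multiplication by a diagonal matrix scales the rows:

$$\begin{pmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{pmatrix} \begin{pmatrix} a_{11} & \cdots & a_{1p} \\ a_{21} & \cdots & a_{2p} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{np} \end{pmatrix} = \begin{pmatrix} d_{11}a_{11} & \cdots & d_{11}a_{1p} \\ d_{22}a_{21} & \cdots & d_{22}a_{2p} \\ \vdots & & \vdots \\ d_{nn}a_{n1} & \cdots & d_{nn}a_{np} \end{pmatrix}$$

Post-multiplication by a diagonal matrix scales the columns:

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} d_{11} & 0 & \cdots & 0 \\ 0 & d_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & d_{nn} \end{pmatrix} = \begin{pmatrix} d_{11}a_{11} & \cdots & d_{nn}a_{1n} \\ d_{11}a_{21} & \cdots & d_{nn}a_{2n} \\ \vdots & & \vdots \\ d_{11}a_{m1} & \cdots & d_{nn}a_{mn} \end{pmatrix}$$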


9. Any Hermitian matrix can be expressed as A + iB where A is symmetric and B is 
skew-symmetric. 

10. The inverse of a (nonsingular) matrix is unique. 

11. If A is nonsingular, the solution of the system of linear equations (§6.4) Ax = b is given by (but almost never computed by) $x = A^{-1}b$.


© 2000 by CRC Press LLC 



12. The product of nonsingular matrices A and B is nonsingular, with $(AB)^{-1} = B^{-1}A^{-1}$. Conversely, if A and B are square matrices with AB nonsingular, then A and B are nonsingular.

13 . For a nonsingular matrix regarded as a linear transformation (§6.2), the inverse 
matrix represents the inverse transformation. 

14 . Sums of lower (upper) triangular matrices are lower (upper) triangular. 

15 . Products of lower (upper) triangular matrices are lower (upper) triangular. 

16. A triangular matrix A is nonsingular if and only if $a_{ii} \ne 0$ for all i.

17 . If a lower (upper) triangular matrix is nonsingular then its inverse is lower (upper) 
triangular. 

18. Properties of transpose:

• $(A^T)^T = A$;

• $(A + B)^T = A^T + B^T$;

• $(AB)^T = B^T A^T$;

• $AA^T$ and $A^T A$ are symmetric;

• if A is nonsingular then so is $A^T$; moreover $(A^T)^{-1} = (A^{-1})^T$.

19 . Properties of Hermitian adjoint: 

• (A*)* = A; 

• (A + B)* = A* + B*; 

• (AB)* = B*A*; 

• AA* and A* A are Hermitian; 

• if A is nonsingular, then so is $A^*$; moreover $(A^*)^{-1} = (A^{-1})^*$.

20. If A is orthogonal, then A is nonsingular and $A^{-1} = A^T$.

21. The rows (columns) of an orthogonal matrix are orthonormal with respect to the standard inner product on $\mathcal{R}^n$ (§6.1.4).

22. Products of orthogonal matrices are orthogonal.

23. If A is unitary, then A is nonsingular and $A^{-1} = A^*$.

24. The rows (columns) of a unitary matrix are orthonormal with respect to the standard inner product on $\mathcal{C}^n$ (§6.1.4).

25 . Products of unitary matrices are unitary. 

26 . Positive definite matrices are nonsingular. 

27 . All eigenvalues (§6.5) of a positive definite matrix are positive. 

28 . Powers of a positive definite matrix are positive definite. 

29 . If A is skew-symmetric, then I + A is positive definite. 

30 . If A is nonsingular, then A T A is positive definite. 


Examples: 

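1. Let $A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}$ and $B = \begin{pmatrix} 7 & 8 & 9 \\ 0 & 1 & 2 \end{pmatrix}$. Then $A + B = \begin{pmatrix} 8 & 10 & 12 \\ 4 & 6 & 8 \end{pmatrix}$ and $A - B = \begin{pmatrix} -6 & -6 & -6 \\ 4 & 4 & 4 \end{pmatrix}$.

2. The scalar product of the vectors a = (1, 0, -1) and b = (4, 3, 2) is $a \cdot b = (1)(4) + (0)(3) + (-1)(2) = 2$.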
3. Let A = 


i i r 


, and C = 


defined with AB = 


1 0\ 2 3 


0 2 3 y ’ AB and BA are both 

) , whereas BA= ( 2 (\ ^ = 


. Also, AC is defined but CA is not defined. 


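4. The matrices A, B of Example 1 cannot be multiplied since A has 3 columns and B has 2 rows; see Fact 5. However, all the products $A^T B$, $AB^T$, $B^T A$, $BA^T$ exist:

$$A^T B = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix} \begin{pmatrix} 7 & 8 & 9 \\ 0 & 1 & 2 \end{pmatrix} = \begin{pmatrix} 7 & 12 & 17 \\ 14 & 21 & 28 \\ 21 & 30 & 39 \end{pmatrix}, \qquad AB^T = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 7 & 0 \\ 8 & 1 \\ 9 & 2 \end{pmatrix} = \begin{pmatrix} 50 & 8 \\ 122 & 17 \end{pmatrix},$$

$$B^T A = \begin{pmatrix} 7 & 14 & 21 \\ 12 & 21 & 30 \\ 17 & 28 & 39 \end{pmatrix}, \qquad BA^T = \begin{pmatrix} 50 & 122 \\ 8 & 17 \end{pmatrix}.$$

Note $(B^T A)^T = A^T B$, as guaranteed by Fact 18.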


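5. Multiplication by a diagonal matrix:

$$\begin{pmatrix} 3 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} = \begin{pmatrix} 3 & 6 & 9 \\ 8 & 10 & 12 \end{pmatrix}, \qquad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 6 & 3 \\ 8 & 15 & 6 \end{pmatrix}.$$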


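6. The 2 x 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is nonsingular if $\Delta = ad - bc \ne 0$; in this case

$$A^{-1} = \frac{1}{\Delta} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}.$$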

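7. The matrix $A = \frac{1}{9} \begin{pmatrix} \ast & \ast & \ast \\ 7 & -4 & 4 \\ -4 & 1 & 8 \end{pmatrix}$ is orthogonal (the entries of the first row are not legible in the source).

8. If $A = \frac{1}{2} \begin{pmatrix} 1 & -i & -1 + i \\ i & 1 & 1 + i \\ 1 + i & -1 + i & 0 \end{pmatrix}$ then $A^* = \frac{1}{2} \begin{pmatrix} 1 & -i & 1 - i \\ i & 1 & -1 - i \\ -1 - i & 1 - i & 0 \end{pmatrix}$. Since $A^* A = I$ the matrix A is unitary.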


9. Every 2 x 2 orthogonal matrix Q with det Q = 1 can be written as $Q = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$ for some real $\theta$. Geometrically, the matrix Q effects a counterclockwise rotation by the angle $\theta$.

10. For the matrix Q in Example 9, $Q^2 = \begin{pmatrix} \cos^2\theta - \sin^2\theta & -2\sin\theta\cos\theta \\ 2\sin\theta\cos\theta & \cos^2\theta - \sin^2\theta \end{pmatrix}$. Since this must be the same as a rotation by an angle of $2\theta$, $Q^2 = \begin{pmatrix} \cos 2\theta & -\sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix}$. Equating these two expressions for $Q^2$ gives the double angle formulas of trigonometry.
11. The matrix $\begin{pmatrix} 4 & 2i & -3 + i \\ -2i & -8 & 6 + 3i \\ -3 - i & 6 - 3i & 5 \end{pmatrix}$ is Hermitian. It can be written as

$$A + iB = \begin{pmatrix} 4 & 0 & -3 \\ 0 & -8 & 6 \\ -3 & 6 & 5 \end{pmatrix} + i \begin{pmatrix} 0 & 2 & 1 \\ -2 & 0 & 3 \\ -1 & -3 & 0 \end{pmatrix}$$

where A is symmetric and B is skew-symmetric. (See Fact 9.)
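The unitary condition $A^* A = I$ can be verified numerically. The sketch below is an illustration only: it assumes matrices stored as lists of rows, uses a small tolerance for floating-point comparison, and the names `adjoint`, `mat_mul`, and `is_unitary` are hypothetical helpers rather than anything defined in the Handbook. The test matrix is one half of the array with rows (1, -i, -1+i), (i, 1, 1+i), (1+i, -1+i, 0), the scaling for which each column has unit length.

```python
import math


def adjoint(a):
    """Hermitian adjoint (conjugate transpose) of a list-of-rows matrix."""
    return [[complex(x).conjugate() for x in col] for col in zip(*a)]


def mat_mul(a, b):
    """Product of an m x n matrix and an n x p matrix."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def is_unitary(a, tol=1e-12):
    """Check A* A = I entrywise, up to the tolerance tol."""
    n = len(a)
    prod = mat_mul(adjoint(a), a)
    return all(abs(prod[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))


unscaled = [[1, -1j, -1 + 1j], [1j, 1, 1 + 1j], [1 + 1j, -1 + 1j, 0]]
A = [[0.5 * z for z in row] for row in unscaled]
print(is_unitary(A))

# A real rotation matrix is orthogonal, hence also unitary.
t = 0.3
Q = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
print(is_unitary(Q))
```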
Algorithm 1: Basic matrix multiplication.

input: m x n matrix A, n x p matrix B
output: m x p matrix C = AB

for i := 1 to m do
  for j := 1 to p do
    C(i,j) := 0
    for k := 1 to n do
      C(i,j) := C(i,j) + A(i,k)B(k,j)
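The triple loop of the basic multiplication algorithm translates almost line for line into Python. This is a minimal sketch assuming list-of-rows storage and 0-based indices; `mat_mul` is an illustrative name, and the size check reflects the requirement that the number of columns of A equal the number of rows of B.

```python
def mat_mul(a, b):
    """Multiply an m x n matrix a by an n x p matrix b using the definition."""
    m, n, p = len(a), len(b), len(b[0])
    if any(len(row) != n for row in a):
        raise ValueError("columns of A must equal rows of B")
    c = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]
    return c


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8, 9], [0, 1, 2]]
B_T = [list(col) for col in zip(*B)]   # transpose of B, a 3 x 2 matrix
print(mat_mul(A, B_T))                 # the product A B^T
```

Calling `mat_mul(A, B)` raises `ValueError`, since A has 3 columns and B has 2 rows.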


6.3.3 FAST MULTIPLICATION OF MATRICES 

A variety of methods have been devised to multiply matrices more efficiently than by 
simply using the definition in §6.3.2. This section presents alternative methods for 
carrying out matrix multiplication. 


Definitions: 

The shift left operation shL(A(i, :), k) rotates elements of row i in matrix A exactly k places to the left, where data shifted off the left side of the matrix are wrapped around to the right side.

The shift up operation shU(B(:, j), k) rotates elements of column j in matrix B exactly k places up, where data shifted off the top of the matrix are wrapped around to the bottom.

These operations can also be applied simultaneously to every row of A or every column of B, denoted shL(A, k) and shU(B, k) respectively.

Facts: 

1. The basic definition given in §6.3.2 can be used to multiply the m x n matrix A and the n x p matrix B. The associated algorithm (Algorithm 1) requires O(mnp) operations (additions and multiplications of individual elements).

2. Matrix multiplication in scalar product form: Algorithm 1 can be rewritten in terms of the scalar product operation, giving Algorithm 2.

3. Algorithm 2 is well-suited for fast multiplication on computers designed for efficient scalar product operations. It requires O(mp) scalar products.

4. Matrix multiplication in linear combination form: Algorithm 3 carries out matrix 
multiplication by taking a linear combination of columns of A to obtain each column of 
the product. 

5. The inner loop of Algorithm 3 performs a “vector + scalar x vector” operation, 
well-suited to a vector computer using efficiently pipelined arithmetic processing. 
Algorithm 2: Scalar product form of matrix multiplication.

input: m x n matrix A, n x p matrix B
output: m x p matrix C = AB

for i := 1 to m do
  for j := 1 to p do
    C(i,j) := A(i, :) • B(:, j)


Algorithm 3: Column linear combination form of matrix multiplication.

input: m x n matrix A, n x p matrix B
output: m x p matrix C = AB

for j := 1 to p do
  C(:,j) := 0
  for k := 1 to n do
    C(:,j) := C(:,j) + B(k,j)A(:,k)


6. Algorithm 3 is often used for fast general matrix multiplication on vector machines since it is based on a natural vector operation. If these vector operations can be performed on all elements simultaneously, then O(np) vector operations are needed.

7. Access to matrix elements in Algorithm 3 is by column. There are other rearrange- 
ments of the algorithm which access matrix information by row. 

8. Fast multiplication on array processors: Algorithm 4 multiplies two n x n (or smaller dimension) matrices on a computer with an n x n array of processors. It uses various shift operations on the arrays and the array-multiplication operation (*) of elementwise multiplication.

9. At each step Algorithm 4 shifts A one place to the left and shifts B one place up so that components of the array product are correct new terms for the corresponding elements of C = AB. Each matrix is preshifted so the first step complies with this requirement.

10. Two n x n matrices can be multiplied in O(n) time using Algorithm 4 on an array processor.

11. The Strassen algorithm: Algorithm 5 recursively carries out matrix multiplication for n x n matrices A and B where $n = 2^k$. The basis of Strassen’s algorithm is partitioning the two factors into square blocks with dimension half that of the original matrices.

12. Strassen’s algorithm ultimately requires the fast multiplication of 2 x 2 matrices 
(Algorithm 6). 

13. Algorithm 6 multiplies two 2x2 matrices using only 7 multiplications and 18 
additions instead of the normal 8 multiplications and 4 additions. For most modern 
computers saving one multiplication at the cost of 14 extra additions would not represent 
a gain. 

14. Strassen’s algorithm can be extended to n x n matrices where n is not a power of 2. The general algorithm requires $O(n^{\log_2 7}) \approx O(n^{2.807})$ multiplications. Details of this algorithm and its efficiency can be found in [GoVa96].
Algorithm 4: Array processor matrix multiplication.

input: n x n matrices A, B
output: n x n matrix C = AB

{Preshift the matrix arrays}
for i := 1 to n do
  shL(A(i, :), i - 1) {Shift ith row i - 1 places left}
  shU(B(:, i), i - 1) {Shift ith column i - 1 places up}
C := 0 {Initialize product array}
for k := 1 to n do
  C := C + A * B
  shL(A, 1)
  shU(B, 1)


Algorithm 5: Strassen’s algorithm for $2^k \times 2^k$ matrices.

procedure Strassen(A, B)
input: $2^k \times 2^k$ matrices A, B
output: $2^k \times 2^k$ matrix C = AB

if k = 1 then use Algorithm 6
else
  partition A, B into four $2^{k-1} \times 2^{k-1}$ blocks: $A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, $B = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix}$
  P := Strassen(A11 + A22, B11 + B22)
  Q := Strassen(A21 + A22, B11); R := Strassen(A11, B12 - B22)
  S := Strassen(A22, B21 - B11); T := Strassen(A11 + A12, B22)
  U := Strassen(A21 - A11, B11 + B12)
  V := Strassen(A12 - A22, B21 + B22)
  C11 := P + S - T + V; C12 := R + T; C21 := Q + S; C22 := P - Q + R + U
end

$C = \begin{pmatrix} C_{11} & C_{12} \\ C_{21} & C_{22} \end{pmatrix}$

Algorithm 6: Strassen’s algorithm for 2 x 2 matrices.

input: 2 x 2 matrices A, B
output: 2 x 2 matrix C = AB

p := (a11 + a22)(b11 + b22); q := (a21 + a22)b11; r := a11(b12 - b22)
s := a22(b21 - b11); t := (a11 + a12)b22; u := (a21 - a11)(b11 + b12)
v := (a12 - a22)(b21 + b22)

c11 := p + s - t + v; c12 := r + t; c21 := q + s; c22 := p - q + r + u
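The recursive structure of Strassen's method can be sketched in Python as follows. This is an illustration under stated assumptions, not a tuned implementation: matrices are lists of rows whose dimension is a power of 2, the helper names (`add`, `sub`, `strassen`) are hypothetical, and the recursion bottoms out at 1 x 1 blocks, so that the 2 x 2 case reduces to exactly the seven products p, q, r, s, t, u, v of the 2 x 2 algorithm.

```python
def add(x, y):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]


def sub(x, y):
    """Entrywise difference of two matrices of the same size."""
    return [[a - b for a, b in zip(r, s)] for r, s in zip(x, y)]


def strassen(a, b):
    """Multiply two n x n matrices, n a power of 2, with 7 recursive products."""
    n = len(a)
    if n == 1:
        return [[a[0][0] * b[0][0]]]
    h = n // 2

    def blocks(m):
        # Split m into its four h x h blocks: m11, m12, m21, m22.
        return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
                [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])

    a11, a12, a21, a22 = blocks(a)
    b11, b12, b21, b22 = blocks(b)
    p = strassen(add(a11, a22), add(b11, b22))
    q = strassen(add(a21, a22), b11)
    r = strassen(a11, sub(b12, b22))
    s = strassen(a22, sub(b21, b11))
    t = strassen(add(a11, a12), b22)
    u = strassen(sub(a21, a11), add(b11, b12))
    v = strassen(sub(a12, a22), add(b21, b22))
    c11 = add(sub(add(p, s), t), v)
    c12 = add(r, t)
    c21 = add(q, s)
    c22 = add(add(sub(p, q), r), u)
    # Reassemble the four blocks into one matrix.
    top = [x + y for x, y in zip(c11, c12)]
    bottom = [x + y for x, y in zip(c21, c22)]
    return top + bottom


print(strassen([[3, 4], [-1, 2]], [[7, 3], [1, -3]]))
```

In practice the recursion is usually stopped at a larger block size and finished with ordinary multiplication, since the extra additions dominate for small blocks.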
Examples:


1. This example illustrates Algorithm 4 for 4 x 4 array matrix multiplication. The preshift and the first array multiplication yield the arrays:

  A (preshifted)        B (preshifted)        C = A * B
  a11 a12 a13 a14       b11 b22 b33 b44       a11b11  a12b22  a13b33  a14b44
  a22 a23 a24 a21       b21 b32 b43 b14       a22b21  a23b32  a24b43  a21b14
  a33 a34 a31 a32       b31 b42 b13 b24       a33b31  a34b42  a31b13  a32b24
  a44 a41 a42 a43       b41 b12 b23 b34       a44b41  a41b12  a42b23  a43b34

The next shifts and multiply-accumulate operation produce:

  A (shifted left)      B (shifted up)
  a12 a13 a14 a11       b21 b32 b43 b14
  a23 a24 a21 a22       b31 b42 b13 b24
  a34 a31 a32 a33       b41 b12 b23 b34
  a41 a42 a43 a44       b11 b22 b33 b44

  C
  a11b11 + a12b21   a12b22 + a13b32   a13b33 + a14b43   a14b44 + a11b14
  a22b21 + a23b31   a23b32 + a24b42   a24b43 + a21b13   a21b14 + a22b24
  a33b31 + a34b41   a34b42 + a31b12   a31b13 + a32b23   a32b24 + a33b34
  a44b41 + a41b11   a41b12 + a42b22   a42b23 + a43b33   a43b34 + a44b44

At subsequent stages the remaining terms get added in to the appropriate elements 
of the product matrix. The total cost of matrix multiplication is therefore reduced to 
n parallel multiply-accumulate operations plus some communication costs which for a 
typical distributed memory array processor are generally small. 


2. Algorithm 6 is illustrated using the matrices $A = \begin{pmatrix} 3 & 4 \\ -1 & 2 \end{pmatrix}$ and $B = \begin{pmatrix} 7 & 3 \\ 1 & -3 \end{pmatrix}$. Then

p = 5 • 4 = 20, q = 1 • 7 = 7, r = 3 • 6 = 18, s = 2 • (-6) = -12,

t = 7 • (-3) = -21, u = (-4) • 10 = -40, v = 2 • (-2) = -4,

giving the following elements of C = AB: c11 = 20 - 12 + 21 - 4 = 25, c12 = 18 - 21 = -3, c21 = 7 - 12 = -5, c22 = 20 - 7 + 18 - 40 = -9.
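The shift-and-accumulate scheme of the array processor algorithm can be simulated sequentially to see why every term $a_{ik}b_{kj}$ is picked up exactly once. The sketch below assumes list-of-rows storage with 0-based indices (so row i is preshifted i places rather than i - 1); `rotate_left` and `array_multiply` are illustrative names, and the elementwise product that a real array processor performs in parallel is written here as an ordinary nested loop.

```python
def rotate_left(seq, k):
    """Rotate a list k places to the left with wraparound."""
    k %= len(seq)
    return seq[k:] + seq[:k]


def array_multiply(a, b):
    """Simulate the preshift / multiply-accumulate / shift scheme for n x n matrices."""
    n = len(a)
    # Preshift: row i of A moves i places left, column j of B moves j places up.
    a = [rotate_left(a[i], i) for i in range(n)]
    cols = [rotate_left([b[i][j] for i in range(n)], j) for j in range(n)]
    b = [[cols[j][i] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Elementwise multiply-accumulate: C := C + A * B.
        c = [[c[i][j] + a[i][j] * b[i][j] for j in range(n)] for i in range(n)]
        a = [rotate_left(row, 1) for row in a]   # shL(A, 1)
        b = b[1:] + b[:1]                        # shU(B, 1): every column moves up
    return c


A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
print(array_multiply(A, B))
```

After the preshift, position (i, j) holds $a_{ik}$ and $b_{kj}$ with k = (i + j) mod n, and each subsequent shift advances k by one, so n steps cover all values of k.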


6.3.4 DETERMINANTS 
Definitions: 

For an n x n matrix A with n > 1, $A_{ij}$ denotes the (n - 1) x (n - 1) matrix obtained by deleting row i and column j from A.

The determinant det A of an n x n matrix A can be defined recursively:

• if A = (a) is a 1 x 1 matrix, then det A = a;

• if n > 1, then $\det A = \sum_{j=1}^{n} (-1)^{j+1} a_{1j} \det A_{1j}$.

A minor of a matrix is the determinant of a square submatrix of the given matrix. A principal minor is the determinant of a principal submatrix.

Notation: The determinant of $A = (a_{ij})$ is commonly written using vertical bars:

$$\det A = |A| = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{vmatrix}$$
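Facts:

1. Laplace expansion: For any r,

$$\det A = \sum_{j=1}^{n} (-1)^{r+j} a_{rj} \det A_{rj} = \sum_{i=1}^{n} (-1)^{i+r} a_{ir} \det A_{ir}.$$

2. If $A = (a_{ij})$ is n x n, then $\det A = \sum_{\sigma \in S_n} \operatorname{sgn}(\sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$. Here $S_n$ is the set of all permutations on {1, 2, ..., n}, and $\operatorname{sgn}(\sigma)$ equals 1 if $\sigma$ is even and -1 if $\sigma$ is odd (§5.3.1).

3. $\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$.

4. $\det \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = aei + bfg + cdh - afh - bdi - ceg$.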

5. det AB = det A det B = det BA for all n x n matrices A, B.

6. $\det A^T = \det A$ for all n x n matrices A.

7. $\det \alpha A = \alpha^n \det A$ for all n x n matrices A and all scalars $\alpha$.

8. det I = 1.

9. If A has two identical rows (or two identical columns), then det A = 0.

10. Interchanging two rows (or two columns) of a matrix changes the sign of the determinant.

11. Multiplying one row (or column) of a matrix by a scalar multiplies its determinant by that same scalar.

12. Adding a multiple of one row (column) to another row (column) leaves the value of the determinant unchanged.

13. If $D = (d_{ij})$ is an n x n diagonal matrix, then $\det D = d_{11} d_{22} \cdots d_{nn}$.

14. If $T = (t_{ij})$ is an n x n triangular matrix, then $\det T = t_{11} t_{22} \cdots t_{nn}$.

15. If A and D are square matrices, then $\det \begin{pmatrix} A & B \\ 0 & D \end{pmatrix} = \det A \det D = \det \begin{pmatrix} A & 0 \\ C & D \end{pmatrix}$.

16. A is nonsingular if and only if $\det A \ne 0$.

17. If A is nonsingular then $\det(A^{-1}) = \dfrac{1}{\det A}$.

18. If A and D are nonsingular, then $\det \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det A \det(D - CA^{-1}B) = \det D \det(A - BD^{-1}C)$.

19. The determinant of a Hermitian matrix (§6.3.1) is real.

20. The determinant of a skew-symmetric matrix (§6.3.1) of odd size is zero.

21. The determinant of an orthogonal matrix (§6.3.2) is ±1.

22. The n x n symmetric (or Hermitian) matrix A is positive definite if and only if all its leading principal submatrices $A[1], A[1, 2], \ldots, A[1, 2, \ldots, n]$ have positive determinant.

23. The n x n Vandermonde matrix

$$\begin{pmatrix} 1 & x_1 & \cdots & x_1^{n-1} \\ 1 & x_2 & \cdots & x_2^{n-1} \\ \vdots & \vdots & & \vdots \\ 1 & x_n & \cdots & x_n^{n-1} \end{pmatrix}$$

has determinant $\prod_{i<j} (x_j - x_i)$.



24. If the n x n matrix $A = (a_{ij})$ has diagonal elements $a_{ii} = x$ and off-diagonal elements $a_{ij} = y$, then $\det A = (x - y)^{n-1}(x - y + ny)$.

25. The equation of the straight line through points $(a_1, b_1)$ and $(a_2, b_2)$ is given by

$$\begin{vmatrix} x & y & 1 \\ a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \end{vmatrix} = 0.$$

26. The equation of the circle through points $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$ is given by

$$\begin{vmatrix} x^2 + y^2 & x & y & 1 \\ a_1^2 + b_1^2 & a_1 & b_1 & 1 \\ a_2^2 + b_2^2 & a_2 & b_2 & 1 \\ a_3^2 + b_3^2 & a_3 & b_3 & 1 \end{vmatrix} = 0.$$

27. If the three points $(a_1, b_1)$, $(a_2, b_2)$, $(a_3, b_3)$ are listed in counterclockwise order, then the area of the triangle they form is given by

$$\frac{1}{2} \begin{vmatrix} a_1 & b_1 & 1 \\ a_2 & b_2 & 1 \\ a_3 & b_3 & 1 \end{vmatrix}.$$

28. The parallelepiped $P = \{\, \alpha_1 a_1 + \alpha_2 a_2 + \cdots + \alpha_n a_n \mid 0 \le \alpha_i \le 1 \,\}$ spanned by the vectors $a_1, a_2, \ldots, a_n$ has volume $|\det A|$, where A has columns $a_1, a_2, \ldots, a_n$.

29 . Computation: The determinant is (almost) never computed from the definition 
or from Fact 2. Instead it is calculated using Facts 12 and 14. (See Example 1.) 


Examples: 

1. Determinants can be calculated by using row operations to create a triangular ma- 
trix, and then applying Fact 14: 

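$$\det \begin{pmatrix} -1 & 2 & 1 \\ 0 & 5 & 2 \\ 3 & 4 & 3 \end{pmatrix} = \det \begin{pmatrix} -1 & 2 & 1 \\ 0 & 5 & 2 \\ 0 & 10 & 6 \end{pmatrix} = \det \begin{pmatrix} -1 & 2 & 1 \\ 0 & 5 & 2 \\ 0 & 0 & 2 \end{pmatrix} = -10.$$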

Here the second matrix is obtained from the first by adding 3 times row 1 to row 3; the 
third matrix is obtained from the second by adding —2 times row 2 to row 3. 

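2. Determinants can be calculated by using row and column interchanges to obtain a form with exploitable zeros:

$$\det \begin{pmatrix} 4 & 5 & 1 & 6 \\ 0 & 6 & 0 & 3 \\ 0 & 5 & 0 & 2 \\ 3 & 3 & 2 & 4 \end{pmatrix} = -\det \begin{pmatrix} 4 & 1 & 5 & 6 \\ 0 & 0 & 6 & 3 \\ 0 & 0 & 5 & 2 \\ 3 & 2 & 3 & 4 \end{pmatrix} = \det \begin{pmatrix} 4 & 1 & 5 & 6 \\ 3 & 2 & 3 & 4 \\ 0 & 0 & 5 & 2 \\ 0 & 0 & 6 & 3 \end{pmatrix}.$$

Here the second matrix is obtained from the first by interchanging columns 2 and 3; the third matrix is obtained from the second by interchanging rows 2 and 4. Each interchange changes the sign of the determinant (Fact 10), so the two interchanges together leave it unchanged. The third matrix has block triangular form, with diagonal blocks $A = \begin{pmatrix} 4 & 1 \\ 3 & 2 \end{pmatrix}$ and $D = \begin{pmatrix} 5 & 2 \\ 6 & 3 \end{pmatrix}$. By Fact 15, the original determinant equals det A det D = 5 • 3 = 15.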

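3. The symmetric matrix $A = \begin{pmatrix} 3 & 1 & 0 \\ 1 & 5 & 3 \\ 0 & 3 & 4 \end{pmatrix}$ is positive definite, since its leading principal minors (Fact 22) are positive: $\det(3) = 3 > 0$, $\det \begin{pmatrix} 3 & 1 \\ 1 & 5 \end{pmatrix} = 14 > 0$, and (by Fact 1) $\det A = 3 \det \begin{pmatrix} 5 & 3 \\ 3 & 4 \end{pmatrix} - \det \begin{pmatrix} 1 & 3 \\ 0 & 4 \end{pmatrix} = 3 \cdot 11 - 4 = 29 > 0$.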
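4. The equation of the line through points (1, 3) and (4, 5) can be found using Fact 25:

$$\begin{vmatrix} x & y & 1 \\ 1 & 3 & 1 \\ 4 & 5 & 1 \end{vmatrix} = \begin{vmatrix} x & y & 1 \\ 1 & 3 & 1 \\ 0 & -7 & -3 \end{vmatrix} = \begin{vmatrix} x & y - \frac{7}{3} & 1 \\ 1 & \frac{2}{3} & 1 \\ 0 & 0 & -3 \end{vmatrix} = \left(\tfrac{2}{3}x - y + \tfrac{7}{3}\right)(-3) = 0$$

giving $\frac{2}{3}x - y + \frac{7}{3} = 0$ or $y = \frac{2}{3}x + \frac{7}{3}$.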

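5. By Fact 27, the area of the triangle formed by the points (0, 0), (1, 3), and (4, 5) is

$$\frac{1}{2} \begin{vmatrix} 0 & 0 & 1 \\ 4 & 5 & 1 \\ 1 & 3 & 1 \end{vmatrix} = \frac{1}{2} \begin{vmatrix} 4 & 5 \\ 1 & 3 \end{vmatrix} = \frac{7}{2}.$$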


6. Cayley’s formula: The determinant of the (n - 1) x (n - 1) matrix

$$T_n = \begin{pmatrix} n-1 & -1 & \cdots & -1 \\ -1 & n-1 & \cdots & -1 \\ \vdots & \vdots & \ddots & \vdots \\ -1 & -1 & \cdots & n-1 \end{pmatrix}$$

counts the number of spanning trees of the complete graph on n vertices. (See §9.2.2.) Using Fact 24, $\det T_n = n^{n-2}[n - (n - 1)] = n^{n-2}$.
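The computational recipe of reducing to triangular form with row operations and multiplying the diagonal can be sketched in Python. This is an illustrative implementation only: it assumes list-of-rows storage, uses exact rational arithmetic from the standard `fractions` module to avoid roundoff, and the function name `det` is our own. Row interchanges are tracked because each one flips the sign of the determinant.

```python
from fractions import Fraction


def det(matrix):
    """Determinant via elimination to upper triangular form (exact arithmetic)."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    sign = 1
    for j in range(n):
        # Find a row at or below j with a nonzero entry in column j.
        pivot_row = next((i for i in range(j, n) if a[i][j] != 0), None)
        if pivot_row is None:
            return Fraction(0)          # no pivot available: determinant is 0
        if pivot_row != j:
            a[j], a[pivot_row] = a[pivot_row], a[j]
            sign = -sign                # a row interchange changes the sign
        for i in range(j + 1, n):
            factor = a[i][j] / a[j][j]
            # Adding a multiple of one row to another leaves det unchanged.
            a[i] = [x - factor * y for x, y in zip(a[i], a[j])]
    result = Fraction(sign)
    for j in range(n):
        result *= a[j][j]               # product of the diagonal entries
    return result


print(det([[-1, 2, 1], [0, 5, 2], [3, 4, 3]]))
print(det([[4, 5, 1, 6], [0, 6, 0, 3], [0, 5, 0, 2], [3, 3, 2, 4]]))
print(det([[3, -1, -1], [-1, 3, -1], [-1, -1, 3]]))   # T_4 of Cayley's formula
```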


6.3.5 RANK 

Definition: 

The rank of an m x n matrix A, written rank A , is the size of the largest square 
nonsingular submatrix of A. 

Facts: 

1. rank A = rank A T . 

2. The rank of A equals the maximum number of linearly independent rows or linearly 
independent columns in A. 

3. $\operatorname{rank}(A + B) \le \operatorname{rank} A + \operatorname{rank} B$.

4. $\operatorname{rank} AB \le \min\{\operatorname{rank} A, \operatorname{rank} B\}$.

5. If A is nonsingular then rank AB = rank B and rank CA = rank C.

6. rank A = dim CS(A), where CS(A) is the column space of A and dim V denotes the dimension of the vector space V. (See §6.1.3.)

7. rank A = dim RS(A), where RS(A) is the row space of A. (See §6.1.3.)

8. An n x n matrix A is nonsingular if and only if rank A = n. 

9. Every matrix of rank r can be written as a sum of r matrices of rank 1. 

10. If a and b are nonzero n x 1 vectors, then ab T is an n x n matrix of rank 1. 

11. The rank of a matrix is not always easy to compute. In the absence of severe 
roundoff errors, it can be obtained by counting the number of nonzero rows at the end 
of the Gaussian elimination procedure (§6.4.2). 
13. System of linear equations: Consider the system Ax = b, where A is m x n. Let $A_b = (A : b)$ denote the m x (n + 1) matrix whose (n + 1)st column is the vector b. Then the system Ax = b has

• a unique solution $\Leftrightarrow$ rank A = rank $A_b$ = n;

• infinitely many solutions $\Leftrightarrow$ rank A = rank $A_b$ < n;

• no solution $\Leftrightarrow$ rank A < rank $A_b$.


Examples: 

1. The matrix $A = \begin{pmatrix} 1 & -1 & 2 \\ 3 & 4 & -1 \\ 5 & 2 & 3 \end{pmatrix}$ is singular since det A = 0. However, the submatrix $A[1, 2] = \begin{pmatrix} 1 & -1 \\ 3 & 4 \end{pmatrix}$ has determinant 7 and so is nonsingular, showing that rank A = 2. The matrix A has two linearly independent rows: row 3 = 2 x (row 1) + (row 2). Likewise, it has two linearly independent columns: column 3 = (column 1) - (column 2). This again confirms (by Fact 2) that rank A = 2.


2. Consider the system of equations Ax = b, where A is the matrix in Example 1 and $b = (0, 7, 7)^T$. Since rank A = rank $A_b$ = 2 < 3, this system has infinitely many solutions x. In fact, the set of solutions is given by $\{\, (1 - \alpha, 1 + \alpha, \alpha)^T \mid \alpha \in \mathcal{R} \,\}$.


3. The matrix $A = \begin{pmatrix} 1 & x & x^2 \\ x & x^2 & x^3 \\ x^2 & x^3 & x^4 \end{pmatrix}$ can be expressed as the product $aa^T$ where a is the column vector $(1, x, x^2)^T$. By Fact 10, A has rank 1.
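Counting pivots during elimination gives the rank, and comparing rank A with the rank of the augmented matrix classifies a linear system. The following Python sketch is an illustration under stated assumptions: exact rational arithmetic (so the roundoff caveat does not arise), list-of-rows storage, and hypothetical helper names `rank` and `classify`.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix, found by counting pivots in Gaussian elimination."""
    a = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(a), len(a[0])
    r = 0                                   # number of pivots found so far
    for j in range(cols):
        pivot = next((i for i in range(r, rows) if a[i][j] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, rows):
            factor = a[i][j] / a[r][j]
            a[i] = [x - factor * y for x, y in zip(a[i], a[r])]
        r += 1
        if r == rows:
            break
    return r


def classify(a, b):
    """Classify the system Ax = b by comparing rank A with rank (A : b)."""
    n = len(a[0])
    augmented = [row + [bi] for row, bi in zip(a, b)]
    rank_a, rank_ab = rank(a), rank(augmented)
    if rank_a < rank_ab:
        return "no solution"
    return "unique solution" if rank_a == n else "infinitely many solutions"


A = [[1, -1, 2], [3, 4, -1], [5, 2, 3]]
print(rank(A), classify(A, [0, 7, 7]))
```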


6.3.6 IDENTITIES OF MATRIX ALGEBRA 


Facts: 


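1. Cauchy-Binet formula: If C is m x m and C = AB where A is m x n and B is n x m, then the determinant of C is given by the sum of all products of order m minors of A and the corresponding order m minors of B:

$$\det C = \sum_{1 \le s_1 < s_2 < \cdots < s_m \le n} \begin{vmatrix} a_{1 s_1} & a_{1 s_2} & \cdots & a_{1 s_m} \\ a_{2 s_1} & a_{2 s_2} & \cdots & a_{2 s_m} \\ \vdots & \vdots & & \vdots \\ a_{m s_1} & a_{m s_2} & \cdots & a_{m s_m} \end{vmatrix} \begin{vmatrix} b_{s_1 1} & b_{s_1 2} & \cdots & b_{s_1 m} \\ b_{s_2 1} & b_{s_2 2} & \cdots & b_{s_2 m} \\ \vdots & \vdots & & \vdots \\ b_{s_m 1} & b_{s_m 2} & \cdots & b_{s_m m} \end{vmatrix}$$

• if m = n there is only one possible selection; the Cauchy-Binet formula for this case reduces to det C = det A det B (see Fact 5, §6.3.4);

• if m > n no possible selections exist (the sum is empty), so det C = 0.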


2. Courant-Fischer minimax identity: If the eigenvalues (§6.5) of an n x n Hermitian matrix A are ordered so that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$, then

$$\lambda_k = \max_{\dim V = k} \ \min_{0 \ne x \in V} \frac{x^* A x}{x^* x},$$

where V is a linear subspace of $\mathcal{C}^n$.
3. Hadamard’s inequality: This gives an upper bound for the determinant of an n x n matrix A in terms of the $\ell_2$ norms (§6.4.5) of its rows (or columns):

• in terms of rows: $|\det A|^2 \le \prod_{i=1}^{n} \left( \sum_{j=1}^{n} |a_{ij}|^2 \right) = \prod_{i=1}^{n} \|A(i, :)\|_2^2$;

• in terms of columns: $|\det A|^2 \le \prod_{j=1}^{n} \left( \sum_{i=1}^{n} |a_{ij}|^2 \right) = \prod_{j=1}^{n} \|A(:, j)\|_2^2$.


4. Sherman-Morrison identity: If A is a nonsingular n x n matrix and $u, v \in \mathcal{R}^n$ with $v^T A^{-1} u \ne 1$, then

$$(A - uv^T)^{-1} = A^{-1} + \frac{1}{1 - v^T A^{-1} u} \left( A^{-1} u v^T A^{-1} \right).$$


5. Woodbury identity: If A is nonsingular and $I - V^T A^{-1} U$ is nonsingular, then

$$(A - UV^T)^{-1} = A^{-1} + A^{-1} U (I - V^T A^{-1} U)^{-1} V^T A^{-1}.$$


6. Suppose A is a nonsingular n x n matrix, with S a set of k indices $i_1 < i_2 < \cdots < i_k$ and $\bar{S}$ the set of remaining indices in {1, 2, ..., n}. Then the principal minors of $A^{-1}$ are related to the principal minors of A via

$$\det A^{-1}[\bar{S}] = \frac{1}{\det A} \det A[S].$$


7. Jacobi’s identity: If the n x n system of linear differential equations $\dfrac{dx}{dt} = P(t)x$ has the linearly independent family of solutions $X(:, j)(t)$ for j = 1, 2, ..., n, then the determinant of the (variable) matrix X(t) whose columns are the $X(:, j)(t)$ is given by

$$\det X(t_1) = c \exp\left( \int_{t_0}^{t_1} \operatorname{tr} P(t)\, dt \right),$$

where c is a constant (the value of det X at the lower limit $t_0$) and $\operatorname{tr} P(t) = p_{11}(t) + p_{22}(t) + \cdots + p_{nn}(t)$ is the trace of the matrix P(t).
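The Sherman-Morrison identity lets an already computed inverse be updated after a rank-one change without inverting from scratch. The sketch below checks the identity on a small case; it is an illustration only, assuming exact rational arithmetic, list-of-rows storage, and hypothetical helper names (`inv2` is the 2 x 2 inverse formula $A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$, and `sherman_morrison` applies the update). The matrix A and vectors u, v are made-up test data.

```python
from fractions import Fraction


def inv2(m):
    """Inverse of a 2 x 2 matrix via the ad - bc formula."""
    (a, b), (c, d) = m
    delta = a * d - b * c
    if delta == 0:
        raise ValueError("matrix is singular")
    return [[d / delta, -b / delta], [-c / delta, a / delta]]


def sherman_morrison(a_inv, u, v):
    """Return (A - u v^T)^{-1} given A^{-1}, assuming v^T A^{-1} u != 1."""
    n = len(u)
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]    # A^{-1} u
    vt_a_inv = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]   # v^T A^{-1}
    denom = 1 - sum(v[k] * a_inv_u[k] for k in range(n))                       # 1 - v^T A^{-1} u
    if denom == 0:
        raise ValueError("A - u v^T is singular")
    return [[a_inv[i][j] + a_inv_u[i] * vt_a_inv[j] / denom for j in range(n)]
            for i in range(n)]


A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
u = [Fraction(1), Fraction(0)]
v = [Fraction(0), Fraction(1)]
updated = [[A[i][j] - u[i] * v[j] for j in range(2)] for i in range(2)]   # A - u v^T
print(sherman_morrison(inv2(A), u, v) == inv2(updated))
```

For an n x n matrix the update costs O(n^2) operations, compared with the O(n^3) of a fresh inversion.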


6.4 LINEAR SYSTEMS 


The need to find solutions of linear systems arises in numerous branches of science and 
engineering (physics, biology, chemistry, structural engineering, electrical engineering, 
civil engineering) as well as statistics and applied mathematics. This section discusses 
various techniques for the efficient solution of such systems, especially important when 
these systems are large and sparse. 


6.4.1 BASIC CONCEPTS 

This subsection is concerned with representing and solving a system of m linear equations in n unknowns. Throughout, the focus will be on systems whose data are real numbers. The extension to linear systems with complex data is straightforward.
Algorithm 1: Forward substitution.

input: n x n nonsingular lower triangular matrix L, n x 1 vector b
output: n x 1 vector $x = L^{-1}b$

$x_1 := b_1 / l_{11}$
for i := 2 to n do
  $x_i := \dfrac{1}{l_{ii}} \left( b_i - \sum_{j=1}^{i-1} l_{ij} x_j \right)$
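Forward substitution solves a lower triangular system one unknown at a time, from the top row down. A minimal Python sketch follows, assuming list-of-rows storage, 0-based indices, and exact rational arithmetic; the name `forward_substitution` and the sample system are illustrative choices, not taken from the Handbook.

```python
from fractions import Fraction


def forward_substitution(l, b):
    """Solve L x = b for a nonsingular lower triangular matrix L."""
    n = len(b)
    x = []
    for i in range(n):
        if l[i][i] == 0:
            raise ValueError("zero diagonal entry: L is singular")
        # Subtract the contribution of the unknowns already computed.
        partial = sum(l[i][j] * x[j] for j in range(i))
        x.append((Fraction(b[i]) - partial) / l[i][i])
    return x


L = [[2, 0, 0], [1, 3, 0], [4, -1, 5]]   # hypothetical lower triangular system
b = [2, 7, 17]
print(forward_substitution(L, b))
```

Each unknown costs one pass over the entries to its left, so the whole solve takes O(n^2) operations.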


Definitions: 

A linear equation in unknowns $x_1, x_2, \ldots, x_n$ is an equation of the form $\sum_{j=1}^{n} a_j x_j = b$, where the coefficients $a_j \in \mathcal{R}$ and the right-hand side $b \in \mathcal{R}$. A solution of this equation is any set of values $x_1, x_2, \ldots, x_n$ satisfying the given equation.

A system of linear equations in unknowns $x_1, x_2, \ldots, x_n$ is a collection of m equations $\sum_{j=1}^{n} a_{ij} x_j = b_i$, i = 1, 2, ..., m, where all $a_{ij} \in \mathcal{R}$ and all $b_i \in \mathcal{R}$. A solution of this system is any set of values $x_1, x_2, \ldots, x_n$ satisfying (simultaneously) the m given equations. The coefficient matrix of this system is the m x n matrix $A = (a_{ij})$, and the augmented matrix is the m x (n + 1) matrix $A_b = (A : b)$.

A homogeneous system has right-hand sides $b_i = 0$ for all i = 1, 2, ..., m; otherwise the system is nonhomogeneous.

Back substitution is a simple and efficient iterative procedure for solving an upper triangular linear system Ux = b, one unknown at a time.

Forward substitution is a simple and efficient iterative procedure for solving a lower triangular linear system Lx = b, one unknown at a time.


Facts: 

1. A system of to linear equations in the unknowns Xi, X 2 , ■ ■ ■ , x n can be represented by 
the linear system Ax = b where A is the to x n coefficient matrix, x = (aq, aq, ■ . . , x n ) T 
is the column vector of unknowns, and b = (bi, 62 , • • • , b m ) T is the column vector of 
right-hand sides. 

2 . Given the linear system Ax = 6 , where A is m x n, 

• the system has no solution when rank A < rank A&; 

• the system has a unique solution when rank A = rank Af, = n; 

• the system has infinitely many solutions when rank A = rank Af, < n; in this case 

the set of solutions is an affine subspace of dimension n — rank A (§6.1.3). 

3. If the square matrix $A$ is nonsingular (§6.3.2), then $Ax = b$ has the unique solution
vector $x = A^{-1}b$.

4. If the square matrix $L = (l_{ij})$ is lower triangular, then $Lx = b$ has a unique solution
whenever $l_{ii} \neq 0$ for all $i$. In this case the solution can be found using forward
substitution (Algorithm 1).

5. If the square matrix $U = (u_{ij})$ is upper triangular, then $Ux = b$ has a unique
solution whenever $u_{ii} \neq 0$ for all $i$. In this case the solution can be found using back
substitution (Algorithm 2).
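
Facts 4 and 5 can be illustrated with a short Python sketch. Algorithms 1 and 2 are not reproduced in this subsection, so the two functions below are only an assumed, straightforward rendering of forward and back substitution; the function names are illustrative and the code assumes a square triangular matrix with nonzero diagonal entries.

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower triangular L with nonzero diagonal (assumed)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contributions of the unknowns already computed.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def back_substitution(U, b):
    """Solve U x = b for an upper triangular U with nonzero diagonal (assumed)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Unknowns x[i+1..n-1] are already known at this point.
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x


# Upper triangular system 5x1 - 3x2 + 4x3 = 4, -x2 + 5x3 = 7, 3x3 = 6.
U = [[5, -3, 4], [0, -1, 5], [0, 0, 3]]
print(back_substitution(U, [4, 7, 6]))  # [1.0, 3.0, 2.0]
```

Each unknown costs one inner product with the part of the solution already found, so both routines use about $n^2/2$ multiply-add operations.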



Examples: 

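1. The system of linear equations

$$\begin{aligned} x_1 + 3x_2 + 4x_3 &= 1 \\ 3x_1 + 5x_2 + 0x_3 &= 7 \end{aligned}$$

corresponds to the linear system $Ax = b$, where
$A = \begin{pmatrix} 1 & 3 & 4 \\ 3 & 5 & 0 \end{pmatrix}$ and $b = \begin{pmatrix} 1 \\ 7 \end{pmatrix}$.
Since $\operatorname{rank} A = \operatorname{rank} A_b = 2 < 3$, the system has an infinite number of solutions. In fact the
set of solutions can be expressed as $\{ (4 + 5\alpha, -1 - 3\alpha, \alpha)^T \mid \alpha \in \mathcal{R} \}$. Equivalently, it can
be expressed as the affine subspace $\{ (4, -1, 0)^T + \alpha(5, -3, 1)^T \mid \alpha \in \mathcal{R} \}$ of dimension
$n - \operatorname{rank} A = 3 - 2 = 1$.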


2. The system of linear equations

$$\begin{aligned} 5x_1 - 3x_2 + 4x_3 &= 4 \\ -x_2 + 5x_3 &= 7 \\ 3x_3 &= 6 \end{aligned}$$

has the upper triangular coefficient matrix
$U = \begin{pmatrix} 5 & -3 & 4 \\ 0 & -1 & 5 \\ 0 & 0 & 3 \end{pmatrix}$. Using Algorithm 2, the
unique solution is $x_3 = \frac{6}{3} = 2$, $x_2 = \frac{1}{-1}(7 - 5 \cdot 2) = 3$, $x_1 = \frac{1}{5}(4 - (-3) \cdot 3 - 4 \cdot 2) = 1$.


6.4.2 GAUSSIAN ELIMINATION 

Solving a system of linear equations via Gaussian elimination is one of the most common 
computations performed by scientists and engineers. Gaussian elimination successively 
eliminates variables from the original system, creating a triangular system that is easily 
solved (§6.4.1). 

Note: This subsection deals only with linear systems $Ax = b$, where $A$ is a nonsingular
$n \times n$ real matrix and $b \in \mathcal{R}^n$.

Definitions: 

Gaussian elimination is a method for solving systems of linear equations; at each
step one equation is used to eliminate one variable from the rest of the equations. The
coefficient of the eliminated variable in the eliminating equation is the pivot.

Algorithm 3: Gaussian elimination and back substitution.

input: $n \times n$ nonsingular matrix $A$, $n \times 1$ vector $b$
output: $n \times 1$ vector $x = A^{-1}b$

{Gaussian elimination}
for $j := 1$ to $n - 1$ do
    {$a_{jj}$ is the pivot}
    for $i := j + 1$ to $n$ do
        {eliminate $x_j$ from equation $i$}
        compute multiplier $m := -a_{ij}/a_{jj}$
        add $m \times$ row $j$ to row $i$

{Back substitution}
use Algorithm 2 on resulting upper triangular matrix to obtain the values
$x_n, x_{n-1}, \ldots, x_1$



A flop is a multiply-add operation of the form t = s + ab , especially when performed 
in floating point arithmetic on a digital computer. 

Roundoff errors are the errors associated with storing and computing numbers in 
finite precision arithmetic on a digital computer. 

A numerically stable algorithm is a method whose accuracy is not greatly harmed 
by roundoff errors. 

A numerically unstable algorithm is a method that may return an inaccurate solu- 
tion even when the solution is relatively insensitive to errors in the data. 


Facts: 

1. Gaussian elimination is easily extended to linear systems for which the data A and b 
are complex. Extension to rectangular m x n linear systems is more involved, but not 
difficult [GoVa96]. 

2. In Gaussian elimination, the coefficient of a variable in one of the equations can be 
used as a pivot if and only if its value is nonzero. 

3. In practice, careful choice of pivots is needed to ensure accuracy or improve efficiency 
or both. (See Example 2.) 

4. Assume freedom at each step to choose any available nonzero pivot. Then Gaussian 
elimination succeeds (using exact arithmetic) if and only if A is nonsingular. 

5. Gaussian elimination transforms the initial linear system into a second linear system 
such that 

• the solutions of the two systems are identical; 

• the solution of the second system is easily obtained by back substitution. 

6. Ax = b can be solved by Gaussian elimination and back substitution (Algorithm 3), 
assuming that at each step there is a nonzero pivot on the main diagonal. 

7. Algorithm 3, implemented to take advantage of created 0s, requires $\frac{1}{3}n^3 + O(n^2)$
flops.

8. To solve $Ax = b$ by computing the inverse and then forming the product $A^{-1}b$
requires $n^3 + O(n^2)$ flops.



9. Cramer’s rule: This method for solving $Ax = b$ expresses each entry of the solution
$x = (x_1, x_2, \ldots, x_n)^T$ as the ratio of two determinants:

$$x_1 = \frac{\det A_1}{\det A}, \quad x_2 = \frac{\det A_2}{\det A}, \quad \ldots, \quad x_n = \frac{\det A_n}{\det A},$$

where $A_i$ is obtained from $A$ by substituting column vector $b$ for the $i$th column of $A$.
(Gabriel Cramer, 1704-1752)

10 . Cramer’s rule is of extremely limited use numerically because 

• it requires far more flops than Gaussian elimination and back substitution; 

• it is numerically unstable. 
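
The elimination and back substitution steps of Algorithm 3 can be sketched in Python as follows. This is a minimal illustration, not a production solver: it performs no row exchanges, so it assumes (as in Fact 6) that every pivot on the main diagonal is nonzero; the function name and the augmented-matrix layout are choices made for this sketch.

```python
def gaussian_elimination(A, b):
    """Solve A x = b by elimination without pivoting, then back substitution."""
    n = len(b)
    # Work on an augmented copy (A : b) so the caller's data is untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for j in range(n - 1):
        if M[j][j] == 0.0:
            raise ZeroDivisionError("zero pivot; a row exchange would be needed")
        for i in range(j + 1, n):
            m = -M[i][j] / M[j][j]          # multiplier
            for c in range(j, n + 1):       # add m * row j to row i
                M[i][c] += m * M[j][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):          # back substitution
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


A = [[1, 1, 2, 1], [2, 4, 5, 4], [1, 7, 7, 6], [2, 4, 9, 5]]
print(gaussian_elimination(A, [1, 5, 6, 3]))  # [1.0, 0.0, -1.0, 2.0]
```

The inner loop starts at column `j` rather than column 0, which is one way of taking advantage of the zeros already created, as mentioned in Fact 7.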

Examples: 

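1. The following system is solved by first applying Gaussian elimination:

$$\begin{aligned}
x_1 + x_2 + 2x_3 + x_4 &= 1 \\
2x_1 + 4x_2 + 5x_3 + 4x_4 &= 5 && (-\tfrac{2}{1} \times \text{equation 1}) \\
x_1 + 7x_2 + 7x_3 + 6x_4 &= 6 && (-\tfrac{1}{1} \times \text{equation 1}) \\
2x_1 + 4x_2 + 9x_3 + 5x_4 &= 3 && (-\tfrac{2}{1} \times \text{equation 1})
\end{aligned}$$

$$\begin{aligned}
x_1 + x_2 + 2x_3 + x_4 &= 1 \\
2x_2 + x_3 + 2x_4 &= 3 \\
6x_2 + 5x_3 + 5x_4 &= 5 && (-\tfrac{6}{2} \times \text{equation 2}) \\
2x_2 + 5x_3 + 3x_4 &= 1 && (-\tfrac{2}{2} \times \text{equation 2})
\end{aligned}$$

$$\begin{aligned}
x_1 + x_2 + 2x_3 + x_4 &= 1 \\
2x_2 + x_3 + 2x_4 &= 3 \\
2x_3 - x_4 &= -4 \\
4x_3 + x_4 &= -2 && (-\tfrac{4}{2} \times \text{equation 3})
\end{aligned}$$

$$\begin{aligned}
x_1 + x_2 + 2x_3 + x_4 &= 1 \\
2x_2 + x_3 + 2x_4 &= 3 \\
2x_3 - x_4 &= -4 \\
3x_4 &= 6
\end{aligned}$$

The solution is then obtained by back substitution:

$x_4 = 6/3 = 2$,

$x_3 = [-4 + 1 \cdot 2]/2 = -1$,

$x_2 = [3 - 1 \cdot (-1) - 2 \cdot 2]/2 = 0$,

$x_1 = [1 - 1 \cdot 0 - 2 \cdot (-1) - 1 \cdot 2]/1 = 1$.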

2. Suppose the following system is solved, rounding all results to three significant digits:

$$\begin{aligned}
0.0001x_1 + x_2 &= 1 \\
0.5x_1 + 0.5x_2 &= 1 && (-\tfrac{0.5}{0.0001} \times \text{equation 1})
\end{aligned}$$

$$\begin{aligned}
0.0001x_1 + x_2 &= 1 \\
-5000x_2 &= -5000
\end{aligned}$$

Using back substitution produces $x_2 = 1$ and $x_1 = 0$. However, the correct solution to
this simple linear system is $x_1 = \frac{10000}{9999}$ and $x_2 = \frac{9998}{9999}$, which to three significant digits
becomes $x_1 = 1$ and $x_2 = 1$. Consequently, simply choosing any nonzero pivot can
produce inaccurate results.
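
The effect described in this example can be reproduced with a small Python sketch that rounds every intermediate result to three significant digits. The helper `round3` and the function `solve_2x2_rounded` are hypothetical names introduced only for this illustration; the rounding model (round after every multiplication, addition, and division) is an assumption about how "rounding all results" is applied.

```python
def round3(v):
    """Round v to three significant digits."""
    return float(f"{v:.3g}")


def solve_2x2_rounded(A, b):
    """Eliminate x1 with pivot A[0][0], rounding each result to 3 digits."""
    (a11, a12), (a21, a22) = A
    b1, b2 = b
    m = round3(-a21 / a11)                      # multiplier
    a22 = round3(a22 + round3(m * a12))         # updated coefficient of x2
    b2 = round3(b2 + round3(m * b1))            # updated right-hand side
    x2 = round3(b2 / a22)
    x1 = round3(round3(b1 - round3(a12 * x2)) / a11)
    return x1, x2


# Tiny pivot 0.0001: the computed x1 is badly wrong.
print(solve_2x2_rounded([[0.0001, 1.0], [0.5, 0.5]], [1.0, 1.0]))  # (0.0, 1.0)

# Same system with the two equations exchanged, so the pivot is 0.5.
print(solve_2x2_rounded([[0.5, 0.5], [0.0001, 1.0]], [1.0, 1.0]))  # (1.0, 1.0)
```

Exchanging the equations so that the larger coefficient serves as pivot recovers the correctly rounded answer, which is the motivation for the pivoting strategies of §6.4.6.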


6.4.3 LU DECOMPOSITION 

Gaussian elimination can be formulated as LU decomposition of the coefficient matrix. 

Definitions: 

An LU decomposition of a square matrix $A$ expresses $A = LU$, where $L = (l_{ij})$ is
unit lower triangular and $U = (u_{ij})$ is upper triangular.

A permutation matrix is a square matrix with entries 0 or 1, where the entry 1 occurs 
precisely once in each row and once in each column. 


Facts: 

1. A square matrix has an LU decomposition if and only if every principal submatrix 
(§6.3.1) is nonsingular. 

2. If P is a permutation matrix, then the product PA rearranges the rows of A and 
the product AP rearranges the columns of A. 

3. The matrix A is nonsingular if and only if there exists a permutation matrix P such 
that PA has an LU decomposition. The LU decomposition of PA is unique. 

4. It may be necessary to rearrange the rows of A to avoid a zero pivot. 

5. Assume A has an LU decomposition, and consider Gaussian elimination applied 
to Ax = b with pivots on the main diagonal. The following statements express LU 
decomposition as a reformulation of Gaussian elimination: 

• the entry $u_{ij}$ ($1 \leq i \leq j \leq n$) is the coefficient of $x_j$ in equation $i$ after Gaussian
elimination has been completed;

• to eliminate $x_j$ from equation $i$, $i > j$, Gaussian elimination adds $-l_{ij} \times$ equation
$j$ to equation $i$.

6. If $A$ has an LU decomposition, then the linear system $Ax = b$ can be solved as
follows (see Algorithm 4):

• compute the decomposition $A = LU$;

• solve $Ly = b$; that is, perform forward substitution;

• solve $Ux = y$; that is, perform back substitution.

7. It is inefficient to solve a nontrivial sequence of linear systems $Ax_1 = b_1$, $Ax_2 = b_2$,
. . ., $Ax_p = b_p$ by repeating Gaussian elimination for each system. Only one LU
decomposition is needed, followed by $p$ forward substitution steps and $p$ back substitution
steps.

8. An LU decomposition of an $n \times n$ matrix requires $n^2 + O(n)$ storage locations and
$\frac{1}{3}n^3 + O(n^2)$ flops.

Algorithm 4: LU decomposition with forward and back substitution.

input: $n \times n$ nonsingular matrix $A$, $n \times 1$ vector $b$
output: $n \times 1$ vector $x = A^{-1}b$

{Compute $A = LU$}
for $k := 1$ to $n - 1$ do
    $u_{kk} := a_{kk}$
    for $i := k + 1$ to $n$ do
        $l_{ik} := a_{ik}/u_{kk}$;  $u_{ki} := a_{ki}$
    for $j := k + 1$ to $n$ do
        for $i := k + 1$ to $n$ do
            $a_{ij} := a_{ij} - l_{ik}u_{kj}$
$u_{nn} := a_{nn}$

{Solve $Ly = b$; that is, perform forward substitution}
for $i := 1$ to $n$ do
    $y_i := b_i - \sum_{j=1}^{i-1} l_{ij}y_j$

{Solve $Ux = y$; that is, perform back substitution}
for $i := n$ down to 1 do
    $x_i := \frac{1}{u_{ii}} \bigl( y_i - \sum_{j=i+1}^{n} u_{ij}x_j \bigr)$
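
A Python sketch of the same three stages follows. It is an illustrative rendering under the assumption that no pivot is zero (no row exchanges are attempted); the names `lu_decompose` and `lu_solve` are introduced here and are not part of the Handbook.

```python
def lu_decompose(A):
    """Return (L, U) with L unit lower triangular and U upper triangular, A = L U.

    Assumes every pivot encountered is nonzero; no row exchanges are made.
    """
    n = len(A)
    W = [[float(v) for v in row] for row in A]   # working copy, reduced to U
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i][k] = W[i][k] / W[k][k]          # multiplier l_ik
            W[i][k] = 0.0                        # eliminated entry
            for j in range(k + 1, n):
                W[i][j] -= L[i][k] * W[k][j]
    return L, W


def lu_solve(L, U, b):
    """Solve L U x = b by forward substitution, then back substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):                           # L y = b (unit diagonal)
        y[i] = b[i] - sum(L[i][j] * y[j] for j in range(i))
    x = [0.0] * n
    for i in range(n - 1, -1, -1):               # U x = y
        x[i] = (y[i] - sum(U[i][j] * x[j] for j in range(i + 1, n))) / U[i][i]
    return x


A = [[1, 1, 2, 1], [2, 4, 5, 4], [1, 7, 7, 6], [2, 4, 9, 5]]
L, U = lu_decompose(A)
print(lu_solve(L, U, [1, 5, 6, 3]))  # [1.0, 0.0, -1.0, 2.0]
```

Separating the factorization from the solve reflects Fact 7: `lu_decompose` is called once, and `lu_solve` can then be reused for each additional right-hand side at quadratic cost.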


Examples: 

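1. The following matrix $A$ has no LU decomposition because $a_{11} = 0$ (see Fact 1):

$$A = \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix}.$$

However, rearranging the rows of $A$ (Fact 4) produces

$$PA = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} = LU.$$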


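2. The unique LU decomposition of the matrix $A$ in Example 1, §6.4.2 is

$$\begin{pmatrix} 1 & 1 & 2 & 1 \\ 2 & 4 & 5 & 4 \\ 1 & 7 & 7 & 6 \\ 2 & 4 & 9 & 5 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ 2 & 1 & 0 & 0 \\ 1 & 3 & 1 & 0 \\ 2 & 1 & 2 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 1 & 2 & 1 \\ 0 & 2 & 1 & 2 \\ 0 & 0 & 2 & -1 \\ 0 & 0 & 0 & 3 \end{pmatrix}.$$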



6.4.4 CHOLESKY DECOMPOSITION 

For symmetric positive definite linear systems, Cholesky decomposition (which exploits 
symmetry in the coefficient matrix) is roughly twice as efficient as LU decomposition. 

Definition: 

A Cholesky decomposition of $A$ expresses $A = LL^T$, where $L$ is lower triangular and
every entry on the main diagonal of $L$ is positive.

Facts:

1. A matrix has a Cholesky decomposition if and only if it is symmetric and positive
definite.

2. When $A$ is symmetric and positive definite, the linear system $Ax = b$ can be solved
as follows:

• compute a Cholesky decomposition $A = LL^T$;

• solve $Ly = b$; i.e., perform forward substitution;

• solve $L^Tx = y$; i.e., perform back substitution.

3. A simple symmetric variant of the standard LU decomposition algorithm is used to
compute Cholesky decomposition [GoVa96, St88].

4. Cholesky decomposition requires $\frac{1}{2}n^2 + O(n)$ storage locations and $\frac{1}{6}n^3 + O(n^2)$
flops, in contrast to the $n^2 + O(n)$ storage locations and $\frac{1}{3}n^3 + O(n^2)$ flops required by
LU decomposition.


Example:

1. The matrix $A = \begin{pmatrix} 1 & -1 & 3 \\ -1 & 2 & -1 \\ 3 & -1 & 14 \end{pmatrix}$ is clearly symmetric. It is positive definite since
its principal submatrices $(1)$, $\begin{pmatrix} 1 & -1 \\ -1 & 2 \end{pmatrix}$, and $A$ have positive determinants. (See
§6.3.4 Fact 22.) Matrix $A$ can be written as $A = LL^T$, where $L$ is the lower triangular
matrix $\begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 3 & 2 & 1 \end{pmatrix}$. To solve the linear system $Ax = b$, with $b = (1, 0, 6)^T$, first
solve the lower triangular system $Ly = b$, yielding $y = (1, 1, 1)^T$. Then solve the upper
triangular system $L^Tx = y$, yielding $x = (-3, -1, 1)^T$.
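
The three steps of Fact 2 can be sketched in Python as below. The column-by-column factorization shown is one common way to compute a Cholesky factor and is offered only as an assumed illustration, not as the specific variant cited in Fact 3; the function names are hypothetical. The test matrix is the product $LL^T$ for the factor $L$ of the example above.

```python
import math


def cholesky(A):
    """Return lower triangular L with positive diagonal such that A = L L^T.

    Assumes A is symmetric; raises ValueError if A is not positive definite.
    """
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = A[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def cholesky_solve(A, b):
    """Solve A x = b via A = L L^T, forward substitution, back substitution."""
    L = cholesky(A)
    n = len(b)
    y = [0.0] * n
    for i in range(n):                           # L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):               # L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


A = [[1, -1, 3], [-1, 2, -1], [3, -1, 14]]
print(cholesky(A))                   # [[1.0, 0.0, 0.0], [-1.0, 1.0, 0.0], [3.0, 2.0, 1.0]]
print(cholesky_solve(A, [1, 0, 6]))  # [-3.0, -1.0, 1.0]
```

Only the lower triangle of `A` is read, which is where the storage and work savings over LU decomposition come from.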


6.4.5 CONDITIONING OF LINEAR SYSTEMS 

Errors in the data A and b lead to errors in the solution x. The condition number of A 
can be used to bound relative error in the solution in terms of relative errors in the 
data. 

Definitions: 

A (generalized) vector norm on $\mathcal{R}^n$ is a real-valued function $\| \cdot \|$ satisfying the
following properties for all real scalars $\alpha$ and all vectors $x, y \in \mathcal{R}^n$:

• $\|x\| \geq 0$ with equality if and only if $x = 0$;

• $\|\alpha x\| = |\alpha| \cdot \|x\|$, where $|\alpha|$ denotes the absolute value of $\alpha$;

• $\|x + y\| \leq \|x\| + \|y\|$.

The matrix norm induced by the vector norm $\| \cdot \|$ is defined by $\|A\| = \max_{\|x\| = 1} \|Ax\|$.

The condition number of a nonsingular matrix $A$ is the number $\kappa(A) = \|A\| \, \|A^{-1}\|$.
The larger the condition number of a matrix, the more ill conditioned it is; the smaller
the condition number of a matrix, the more well conditioned it is.

Facts:

1. The definition of a vector norm given here generalizes that of a vector norm derived
from an inner product space (§6.1.4).

2. The matrix norm induced by a vector norm satisfies:

• $\|X\| \geq 0$ with equality if and only if $X = 0$;

• $\|\alpha X\| = |\alpha| \cdot \|X\|$, where $|\alpha|$ denotes the absolute value of $\alpha$;

• $\|X + Y\| \leq \|X\| + \|Y\|$;

• $\|XY\| \leq \|X\| \, \|Y\|$.

3. $\kappa(A) \geq 1$.

4. Consider the linear system $Ax = b \neq 0$, where $A$ is nonsingular. Suppose that
changing from $A$ to $A + \Delta A$ and $b$ to $b + \Delta b$ changes the solution from $x$ to $x + \Delta x$. If
$\|\Delta A\| \, \|A^{-1}\| < 1$, the relative error in $x$ can be bounded in terms of relative errors in
the data:

$$\frac{\|\Delta x\|}{\|x\|} \leq \left( \frac{\kappa(A)}{1 - \|\Delta A\| \, \|A^{-1}\|} \right) \left( \frac{\|\Delta b\|}{\|b\|} + \frac{\|\Delta A\|}{\|A\|} \right).$$


5. The following are consequences of Fact 4:

• for an ill-conditioned linear system $Ax = b$, some small errors in $A$ or $b$ can
potentially be amplified into large errors in $x$;

• for a well-conditioned linear system $Ax = b$, where $1 - \|\Delta A\| \, \|A^{-1}\|$ is not
approximately zero, all small errors in $A$ or $b$ result in no more than modest
errors in $x$.


6. Assume $A$ is nonsingular, let $Ax = b \neq 0$, and view $\hat{x}$ as an approximation to the
solution $x$. Then the residual $r = A\hat{x} - b$ and the error $\hat{x} - x$ satisfy:

$$\frac{\|\hat{x} - x\|}{\|x\|} \leq \kappa(A) \, \frac{\|r\|}{\|b\|}.$$




7. Whenever $A$ is ill conditioned, a small relative residual $\|r\|/\|b\|$ may not imply a
small relative error $\|\hat{x} - x\|/\|x\|$.


Examples: 

1. The standard Euclidean norm (§6.1.4) on $\mathcal{R}^n$ defined by $\|x\|_2 = \bigl( \sum_{i=1}^{n} x_i^2 \bigr)^{1/2}$ is a
(generalized) vector norm.

2. The $l_1$ norm on $\mathcal{R}^n$ defined by $\|x\|_1 = \sum_{i=1}^{n} |x_i|$ is a (generalized) vector norm.

3. In coding theory (§14.1), the Hamming distance between two binary codewords $x, y \in \{0, 1\}^n$
is just $\|x - y\|_1$.

4. The $l_\infty$ norm on $\mathcal{R}^n$ defined by $\|x\|_\infty = \max_{1 \leq i \leq n} |x_i|$ is a (generalized) vector norm.

5. The matrix norm induced by $\|x\|_1$ is given by $\|A\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{n} |a_{ij}|$.

6. The matrix norm induced by $\|x\|_\infty$ is given by $\|A\|_\infty = \max_{1 \leq i \leq n} \sum_{j=1}^{n} |a_{ij}|$.

7. The matrix norm induced by $\|x\|_2$, also called the spectral norm, is given by $\|A\|_2 =
\max\{ \sqrt{\lambda} \mid \lambda \text{ an eigenvalue of } A^TA \}$.

8. Consider the linear system $Ax = b$, where $A = \begin{pmatrix} 1 & 2 \\ 2.001 & 4 \end{pmatrix}$ and $b = \begin{pmatrix} 2 \\ 4 \end{pmatrix}$. Then
$\|A\|_\infty = 6.001$ and $\|A^{-1}\|_\infty = 3000$, so $\kappa(A) = 18{,}003$. The solution of $Ax = b$ is
$x = (0, 1)^T$, whereas the solution of the slightly perturbed system with $b = \begin{pmatrix} 2 \\ 4.001 \end{pmatrix}$
is $x = (1, 0.5)^T$. Even though the change in the right-hand side is small, the large
condition number allows for radical changes in the solution vector, as seen here.
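
The quantities in this example can be checked numerically. The Python sketch below assumes the matrix $A$ with rows $(1, 2)$ and $(2.001, 4)$, which reproduces the norms quoted above, and uses the explicit $2 \times 2$ inverse purely for illustration (forming an inverse is not how condition numbers are estimated in practice). All function names are introduced for this sketch.

```python
def norm_inf(M):
    """Induced infinity norm: the maximum absolute row sum."""
    return max(sum(abs(v) for v in row) for row in M)


def inverse_2x2(M):
    """Explicit inverse of a nonsingular 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def solve_2x2(M, rhs):
    """Solve M x = rhs by multiplying with the explicit inverse."""
    inv = inverse_2x2(M)
    return [inv[0][0] * rhs[0] + inv[0][1] * rhs[1],
            inv[1][0] * rhs[0] + inv[1][1] * rhs[1]]


A = [[1.0, 2.0], [2.001, 4.0]]
kappa = norm_inf(A) * norm_inf(inverse_2x2(A))
print(round(kappa, 3))                 # approximately 18003
print(solve_2x2(A, [2.0, 4.0]))        # approximately [0, 1]
print(solve_2x2(A, [2.0, 4.001]))      # approximately [1, 0.5]
```

A change of 0.001 in one entry of the right-hand side moves the solution by an amount comparable to the solution itself, which is consistent with a condition number of roughly $1.8 \times 10^4$.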


6.4.6 PIVOTING FOR STABILITY 


Gaussian elimination can be numerically unstable. Numerical stability can be vastly 
improved by the addition of pivoting strategies that select large pivots. 


Definitions: 

Let $a_{ij}^{(k)}$ denote the $ij$-entry of the current matrix after step $k$ of Gaussian elimination
(or LU decomposition). The growth factor is defined by

$$\frac{\max_{i,j,k} |a_{ij}^{(k)}|}{\max_{i,j} |a_{ij}|}.$$

Partial pivoting is a solution strategy which at step $k$ of Gaussian elimination
exchanges row $k$ with the row $i \geq k$ having the entry of largest magnitude in column $k$.

Complete pivoting is a solution strategy which at step $k$ of Gaussian elimination
exchanges row $k$ and column $k$ with, respectively, the row $i \geq k$ and the column $j \geq k$
containing the entry of largest magnitude.


Facts: 

1. For general coefficient matrices, Gaussian elimination (that is, LU decomposition) 
without pivoting is numerically unstable. 

2 . To improve the numerical stability of Gaussian elimination, it suffices to introduce 
a pivoting strategy that keeps the growth factor small. 

3. For Gaussian elimination with complete pivoting the growth factor is bounded above
by

$$n^{1/2} \left( 2^1 \, 3^{1/2} \, 4^{1/3} \cdots n^{1/(n-1)} \right)^{1/2},$$

which is a relatively slow-growing function of $n$; hence, Gaussian elimination with
complete pivoting is numerically stable.

4. For Gaussian elimination with partial pivoting, the growth factor is bounded above
by $2^{n-1}$, and moreover there are contrived examples for which the growth factor is $2^{n-1}$.
Hence, Gaussian elimination with partial pivoting can be numerically unstable.

5. In practice, partial pivoting is preferred over complete pivoting for the following two
reasons:

• despite contrived examples having an exponential growth factor, partial pivoting
limits the growth factor in practice almost as well as complete pivoting;

• partial pivoting is significantly more efficient than complete pivoting; it compares
$\frac{1}{2}n^2 + O(n)$ pairs of potential pivots, while complete pivoting compares $\frac{1}{3}n^3 +
O(n^2)$ pairs.

Example:

1. LU decomposition applied to the following matrix shows that partial pivoting can
produce a growth factor of $2^{n-1}$ (see Fact 4). Observe that $\max_{i,j} |a_{ij}| = 1$ and
$\max_{i,j,k} |a_{ij}^{(k)}| = u_{nn} = 2^{n-1}$; hence the growth factor is $2^{n-1}$:

$$\begin{pmatrix}
1 & 0 & 0 & \cdots & 0 & 1 \\
-1 & 1 & 0 & \cdots & 0 & 1 \\
-1 & -1 & 1 & \cdots & 0 & 1 \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
-1 & -1 & -1 & \cdots & 1 & 1 \\
-1 & -1 & -1 & \cdots & -1 & 1
\end{pmatrix}$$
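
The growth factor of this example can be observed directly. The following Python sketch runs Gaussian elimination with partial pivoting while tracking the largest entry seen; it assumes a nonsingular input, breaks ties in favor of the current row, and uses hypothetical function names chosen for this illustration.

```python
def growth_factor_partial_pivoting(A):
    """Growth factor of Gaussian elimination with partial pivoting on A."""
    n = len(A)
    W = [[float(v) for v in row] for row in A]
    max_initial = max(abs(v) for row in W for v in row)
    max_seen = max_initial
    for k in range(n - 1):
        # Partial pivoting: largest entry in column k among rows k..n-1
        # (the first such row wins a tie, so row k is kept when possible).
        p = max(range(k, n), key=lambda i: abs(W[i][k]))
        W[k], W[p] = W[p], W[k]
        for i in range(k + 1, n):
            m = W[i][k] / W[k][k]
            for j in range(k, n):
                W[i][j] -= m * W[k][j]
        max_seen = max(max_seen, max(abs(v) for row in W for v in row))
    return max_seen / max_initial


def worst_case_matrix(n):
    """1 on the diagonal and in the last column, -1 below the diagonal."""
    return [[1 if (i == j or j == n - 1) else (-1 if i > j else 0)
             for j in range(n)] for i in range(n)]


print(growth_factor_partial_pivoting(worst_case_matrix(5)))   # 16.0
print(growth_factor_partial_pivoting(worst_case_matrix(10)))  # 512.0
```

Every candidate pivot in this matrix has magnitude 1, so partial pivoting never exchanges rows, and the entries of the last column double at each step.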


6.4.7 PIVOTING TO PRESERVE SPARSITY 

Many, if not most, linear systems that arise in practice have relatively few nonzero 
entries in the coefficient matrix. Some pivoting strategies aim to preserve many zero 
entries in the triangular factors; the LU decomposition algorithm can then save time 
and space by leaving zero entries out of the computation. 

Definitions: 

A matrix is sparse if it has relatively few nonzero entries. The number of nonzero 
entries of matrix A is denoted |A|. The ith row of A is denoted A(i, :) and the jth 
column of A is denoted A(:,j). (See §6.3.1.) 

Fill refers to nonzero entries in the triangular factors whose corresponding positions in 
the coefficient matrix are occupied by zeros. 

The upper bandwidth and lower bandwidth of a matrix $A$ are given respectively
by $\mathrm{ub}(A) = \max\{ (j - i) \mid a_{ij} \neq 0, \; i < j \}$, $\mathrm{lb}(A) = \max\{ (i - j) \mid a_{ij} \neq 0, \; i > j \}$.

A banded LU decomposition algorithm stores and computes all entries of L and U 
within the band defined by lb (A) and ub(A). 

A general sparse LU decomposition algorithm stores and computes only the nonzero 
entries in the triangular factors, irrespective of the banded structure. 

The Markowitz pivoting strategy for Gaussian elimination chooses at step $k$ from
among all available pivots one that minimizes the product $(|L(:, k)| - 1)(|U(k, :)| - 1)$.

The minimum degree algorithm is a restricted version of the Markowitz pivoting
strategy; it assumes (and preserves) symmetry in the coefficient matrix. At step $k$
of Gaussian elimination, this algorithm chooses from among the entries on the main
diagonal a pivot that minimizes $|L(:, k)|$.

Note : The realistic “no-cancellation” assumption will be made throughout. Namely, 
once an entry becomes nonzero during a triangular decomposition, it will be nonzero 
upon termination. 
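
As a small illustration of the bandwidth definitions, the Python sketch below computes the upper and lower bandwidths of a matrix stored as a list of rows. The function name is hypothetical, and returning 0 when there are no nonzero entries strictly above (or below) the main diagonal is a convention assumed here, since the definition takes a maximum over an empty set in that case.

```python
def bandwidths(A):
    """Return (ub, lb): the upper and lower bandwidths of the matrix A."""
    n = len(A)
    ub = max((j - i for i in range(n) for j in range(len(A[i]))
              if A[i][j] != 0 and i < j), default=0)
    lb = max((i - j for i in range(n) for j in range(len(A[i]))
              if A[i][j] != 0 and i > j), default=0)
    return ub, lb


tridiagonal = [[1, 2, 0], [3, 4, 5], [0, 6, 7]]
print(bandwidths(tridiagonal))                            # (1, 1)
print(bandwidths([[1, 0, 9], [0, 1, 0], [0, 0, 1]]))      # (2, 0)
```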

Facts: 

1. The amount of fill in triangular factors often varies greatly with the choice of pivots. 

2. Under the no-cancellation assumption, bandwidth reduction and fill reduction be- 
come combinatorial optimization problems. 



3. The following problems are provably intractable (i.e., NP-hard; see §16.5):

• for a symmetric matrix $A$, find a permutation matrix $P$ that minimizes the
bandwidth $\mathrm{lb}(PAP^T)$;

• for a nonsingular matrix $A$, find permutation matrices $P$ and $Q$ such that the
LU decomposition $PAQ = LU$ exists and $|L| + |U|$ is minimum;

• for a symmetric positive definite matrix $A$, find a permutation matrix $P$ that
minimizes $|L|$, where $L$ is the Cholesky factor of $PAP^T$.

4. In view of Fact 3, various heuristics are used to reduce bandwidth or to reduce fill.

5. Assume that $A$ has an LU decomposition. Then $\mathrm{lb}(L) = \mathrm{lb}(A)$ and $\mathrm{ub}(U) = \mathrm{ub}(A)$.

6. The chief advantage of a banded LU decomposition algorithm over a general sparse 
LU decomposition algorithm is its simplicity. The same advantage holds for profile and 
skyline methods, both of which are generalizations of the banded approach [GeLi81]. 

7. For most problems encountered in practice, a banded LU decomposition algorithm, 
even if A has been permuted so that lb(A) and ub(A) are minimum, requires much more 
space and work than a general sparse LU decomposition algorithm coupled with the 
Markowitz pivoting strategy. The same comment applies to profile and skyline methods. 

8. Let $A$ be a symmetric positive definite matrix, and let $P$ be a permutation matrix
with the same number of rows and columns. Then:

• the Cholesky decomposition of $PAP^T$ exists and is numerically stable;

• the undirected graph (§8.1) $G$ of the Cholesky factor of $PAP^T$ is a chordal graph
and $P$ defines a perfect elimination ordering of $G$ [GeLi81].

9. General sparse Cholesky decomposition can be handled in a clean, modular fashion:

• using only the positions of nonzeros in $A$ as input, compute a permutation $P$ to
reduce fill in the Cholesky factor of $PAP^T$ (using, for example, the minimum
degree algorithm);

• construct data structures to contain the nonzeros of the Cholesky factor;

• after putting the nonzero entries of $PAP^T$ into the data structures, compute the
Cholesky factor of $PAP^T$ in the provided data structures;

• perform forward and back substitutions to solve the linear system.

10 . For symmetric positive definite matrices arising from two-dimensional and three- 
dimensional partial differential equations, the nested dissection algorithm often com- 
putes a more effective fill-reducing permutation than does the minimum degree algo- 
rithm [GeLi81]. 

11 . The interplay between pivoting for stability and pivoting for sparsity complicates 
general sparse LU factorization. The best approach is not yet certain. 

12. A number of robust and well-tested software packages are available for solving 
linear systems, including: 

• LINPACK: a collection of Fortran routines for relatively small dense systems; see
http://www.netlib.org

• LAPACK/CLAPACK: supersedes LINPACK, contains Fortran and C routines
for dense and banded problems, ideal for shared-memory vector and parallel
processors; see
http://www.netlib.org

• NAG: Fortran and C libraries for dense and sparse systems; see
http://www.nag.com

• IMSL: Fortran and C libraries for dense and sparse systems; see
http://www.vni.com/products/imsl

• MATLAB: high-level language for dense and sparse systems; see
http://www.mathworks.com


Examples: 

1. For any “arrowhead” matrix there is a pivot sequence that completely fills the matrix
and another that creates no fill, making it the canonical example used to illustrate Fact 1.
The following is a 4 x 4 arrowhead matrix that fills in completely. (* occupies a position
that is nonzero in A, • is a fill entry in L or U, and a space is a zero.)

    Nonzero pattern of A:        Nonzero pattern of L and U:

        * * * *                      * * * *
        * *                          * * • •
        *   *                        * • * •
        *     *                      * • • *

Reversing the pivot sequence, however, results in no fill:

    Nonzero pattern of A:        Nonzero pattern of L and U:

        *     *                      *     *
          *   *                        *   *
            * *                          * *
        * * * *                      * * * *
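
The dependence of fill on the pivot sequence can be checked symbolically, using only the positions of the nonzeros and the no-cancellation assumption. In the Python sketch below, `count_fill` and `arrowhead` are hypothetical helper names; pivots are taken on the main diagonal in the order given, and indices start at 0.

```python
def count_fill(pattern, order):
    """Count fill entries created by diagonal pivots taken in the given order.

    pattern: set of (row, column) positions that are nonzero in A.
    order:   sequence of indices; index k means pivot on entry (k, k).
    """
    nz = set(pattern)
    fill = 0
    for step, k in enumerate(order):
        later = order[step + 1:]
        rows = [i for i in later if (i, k) in nz]   # equations containing x_k
        cols = [j for j in later if (k, j) in nz]   # unknowns in the pivot row
        for i in rows:
            for j in cols:
                if (i, j) not in nz:                # a zero becomes nonzero
                    nz.add((i, j))
                    fill += 1
    return fill


def arrowhead(n):
    """Nonzero positions: the diagonal, the first row, and the first column."""
    return ({(i, i) for i in range(n)}
            | {(0, j) for j in range(n)}
            | {(i, 0) for i in range(n)})


print(count_fill(arrowhead(4), [0, 1, 2, 3]))  # 6
print(count_fill(arrowhead(4), [3, 2, 1, 0]))  # 0
```

Pivoting first on the dense row and column couples every remaining pair of unknowns, whereas pivoting on it last leaves each earlier pivot row with a single off-diagonal entry, so no new nonzeros appear.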




2. The following table illustrates how Fact 7 typically manifests itself in practice. The 
four problems arise in finite element modeling of actual structures. The table records 
data for two distinct methods: 

• a profile-reducing permutation from the reverse Cuthill-McKee algorithm [GeLi81] 

in tandem with a profile factorization algorithm; 

• a fill-reducing permutation from the minimum degree algorithm [GeLi81] in tan- 

dem with a general sparse factorization algorithm. 

Recorded for each method are the number of nonzero entries in the Cholesky factor 
(expressed in millions) and the number of flops needed to compute the factor (expressed 
in millions). 


                                          |L| (x 10^-6)            No. flops (x 10^-6)
                                          profile      general     profile      general
problem                 n        |A|      reduction    sparse      reduction    sparse

coliseum                1,806    63,454   0.190        0.112       11.803       4.952
winter sports arena     3,562    159,910  0.538        0.279       44.245       16.352
nuclear power station   11,948   149,090  5.908        0.663       2,135.163    70.779
76 story skyscraper     15,439   252,241  2.637        1.417       232.791      142.567


6.5 EIGENANALYSIS


Identifying the eigenvalues and eigenvectors of a matrix facilitates the study of com- 
plicated systems and the analysis of their behavior over time. A basis consisting of 
eigenvectors yields a particularly simple representation of a linear transformation (§6.2). 
Eigenvalues can also provide useful information about discrete structures (§8.10.1). 


6.5.1 EIGENVALUES AND CHARACTERISTIC POLYNOMIAL 


Definitions: 

A complex number $\lambda$ is an eigenvalue of the $n \times n$ complex matrix $A$ if there exists a
nonzero vector $x \in \mathcal{C}^n$ (an eigenvector of $A$ corresponding to $\lambda$) such that $Ax = \lambda x$.

The characteristic polynomial of the square matrix $A$ is the polynomial $p_A(\lambda) =
\det(\lambda I - A)$.

The characteristic equation of $A$ is the equation $p_A(\lambda) = 0$.

A nilpotent matrix is a square matrix $A$ such that $A^k = 0$ for some positive integer $k$.
An idempotent matrix is a square matrix $A$ such that $A^2 = A$.

Let $S_k(A)$ denote the sum of all order $k$ principal minors of the matrix $A$.


Facts: 


1. The characteristic polynomial $p_A(\lambda)$ of an $n \times n$ matrix $A$ is a monic polynomial of
degree $n$ in $\lambda$.

2. The coefficient of $\lambda^{n-1}$ in $p_A(\lambda)$ is $-\operatorname{tr} A$.

3. The constant term in $p_A(\lambda)$ is $(-1)^n \det A$.

4. $p_A(\lambda) = \sum_{k=0}^{n} (-1)^k S_k(A) \lambda^{n-k}$.

5. Similar matrices (§6.2.4) have the same characteristic polynomial.

6. The roots of the characteristic equation are the eigenvalues of $A$.

7. Cayley-Hamilton theorem: If $p_A(\cdot)$ is the characteristic polynomial of $A$ then $p_A(A)$
is the zero matrix.

8. An $n \times n$ matrix has $n$ (not necessarily distinct) eigenvalues.

9. The matrix $A$ is singular if and only if 0 is an eigenvalue of $A$.

10. The characteristic equation of $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is $p_A(\lambda) = \lambda^2 - (a + d)\lambda + (ad - bc) = 0$.

11. The eigenvalues of $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ are given by $\dfrac{a + d \pm \sqrt{(a - d)^2 + 4bc}}{2}$.

12. If the $n \times n$ matrix $A$ has eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ then

• $\sum_{i=1}^{n} \lambda_i = \operatorname{tr} A$;

• $\prod_{i=1}^{n} \lambda_i = \det A$;

• the $k$-th elementary symmetric function $\sum_{i_1 < \cdots < i_k} \lambda_{i_1} \cdots \lambda_{i_k}$ equals $S_k(A)$.

13. The eigenvalues are continuous functions of the entries of a matrix. More precisely,
given an $n \times n$ matrix $A$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ and $\epsilon > 0$, there exists $\delta > 0$
such that for any $n \times n$ matrix $B$, with eigenvalues $\mu_1, \mu_2, \ldots, \mu_n$ and satisfying
$\max_{i,j} |a_{ij} - b_{ij}| < \delta$, there exists a permutation $\tau$ of $1, 2, \ldots, n$ such that $|\lambda_i - \mu_{\tau(i)}| < \epsilon$,
$i = 1, 2, \ldots, n$.

14. The following table gives the eigenvalues of certain specialized matrices $A$, whose
eigenvalues are $\lambda_1, \lambda_2, \ldots, \lambda_n$. In this table $k$ is any positive integer.

matrix                                        eigenvalues

diagonal matrix                               diagonal elements
upper (or lower) triangular matrix            diagonal elements
$A^T$                                         eigenvalues of $A$
$A^*$                                         complex conjugates of the eigenvalues of $A$
$A^k$                                         $\lambda_1^k, \ldots, \lambda_n^k$
$A^{-k}$, $A$ nonsingular                     $\lambda_1^{-k}, \ldots, \lambda_n^{-k}$
$q(A)$, where $q(\cdot)$ is a polynomial      $q(\lambda_1), \ldots, q(\lambda_n)$
$SAS^{-1}$, $S$ nonsingular                   eigenvalues of $A$
$AB$, where $A$ is $m \times n$,              eigenvalues of $BA$; and 0 ($m - n$ times)
  $B$ is $n \times m$, $m \geq n$
$(a - b)I_n + bJ_n$, where $J_n$ is the       $a + (n - 1)b$; and $a - b$ ($n - 1$ times)
  $n \times n$ matrix of all 1s
$A$ $n \times n$ nilpotent                    0 ($n$ times)
$A$ $n \times n$ idempotent of rank $r$       1 ($r$ times); and 0 ($n - r$ times)


Examples: 

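1. The characteristic polynomial for $A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}$ is
$p_A(\lambda) = \begin{vmatrix} \lambda - 1 & -4 \\ -2 & \lambda - 3 \end{vmatrix} = \lambda^2 - 4\lambda - 5 = (\lambda + 1)(\lambda - 5)$,
so the eigenvalues are $\lambda = -1$ and $\lambda = 5$. The vector $x = (2, -1)^T$
is an eigenvector for $\lambda = -1$ since $Ax = (-2, 1)^T = -x$. The vector $x = (1, 1)^T$ is an
eigenvector for $\lambda = 5$ since $Ax = (5, 5)^T = 5x$.

2. For the matrix in Example 1,

$$p_A(A) = A^2 - 4A - 5I = \begin{pmatrix} 9 & 16 \\ 8 & 17 \end{pmatrix} - \begin{pmatrix} 4 & 16 \\ 8 & 12 \end{pmatrix} - \begin{pmatrix} 5 & 0 \\ 0 & 5 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix},$$

as required by Fact 7.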

3. The characteristic polynomial of the matrix $A = \begin{pmatrix} 3 & 0 & 2 \\ 4 & 1 & 4 \\ 2 & 0 & 3 \end{pmatrix}$ can be calculated by
using Facts 1-4. Since $\operatorname{tr} A = 7$, $\det A = 5$, and
$S_2(A) = \begin{vmatrix} 3 & 0 \\ 4 & 1 \end{vmatrix} + \begin{vmatrix} 3 & 2 \\ 2 & 3 \end{vmatrix} + \begin{vmatrix} 1 & 4 \\ 0 & 3 \end{vmatrix} = 11$,
it follows that $p_A(\lambda) = \lambda^3 - 7\lambda^2 + 11\lambda - 5$. Thus $p_A(5) = 0$, showing that $\lambda = 5$ is
an eigenvalue of $A$. An eigenvector corresponding to $\lambda = 5$ is $x = (1, 2, 1)^T$ since
$Ax = (5, 10, 5)^T = 5x$.

4. The matrix $A$ in Example 3 is nonsingular since $p_A(0) \neq 0$, so 0 is not an eigenvalue
of $A$ (Fact 9). The inverse of $A$ can be calculated using the Cayley-Hamilton theorem:
$A^3 - 7A^2 + 11A - 5I = 0$, so $5I = A^3 - 7A^2 + 11A = A(A^2 - 7A + 11I)$ and $I =
A[\frac{1}{5}(A^2 - 7A + 11I)]$. Consequently,
$A^{-1} = \frac{1}{5}(A^2 - 7A + 11I) = \frac{1}{5} \begin{pmatrix} 3 & 0 & -2 \\ -4 & 5 & -4 \\ -2 & 0 & 3 \end{pmatrix}$.
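
The calculations in Examples 3 and 4 can be reproduced with exact rational arithmetic. The Python sketch below builds the characteristic polynomial from sums of principal minors (Fact 4) and then inverts the matrix through the Cayley-Hamilton theorem. It is meant only as an illustration for small matrices: enumerating all principal minors is exponential in $n$, and the function names are introduced here rather than taken from the Handbook.

```python
from fractions import Fraction
from itertools import combinations


def det(M):
    """Determinant by elimination over exact fractions."""
    M = [[Fraction(v) for v in row] for row in M]
    n = len(M)
    d = Fraction(1)
    for k in range(n):
        p = next((i for i in range(k, n) if M[i][k] != 0), None)
        if p is None:
            return Fraction(0)
        if p != k:                       # a row exchange flips the sign
            M[k], M[p] = M[p], M[k]
            d = -d
        d *= M[k][k]
        for i in range(k + 1, n):
            m = M[i][k] / M[k][k]
            for j in range(k, n):
                M[i][j] -= m * M[k][j]
    return d


def char_poly(A):
    """Coefficients c_0..c_n with p_A(x) = sum c_k x^(n-k), c_k = (-1)^k S_k(A)."""
    n = len(A)
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        S_k = sum(det([[A[i][j] for j in idx] for i in idx])
                  for idx in combinations(range(n), k))
        coeffs.append((-1) ** k * S_k)
    return coeffs


def inverse_via_cayley_hamilton(A):
    """A^{-1} = -(A^{n-1} + c_1 A^{n-2} + ... + c_{n-1} I) / c_n."""
    n = len(A)
    c = char_poly(A)
    if c[n] == 0:
        raise ValueError("matrix is singular (0 is an eigenvalue)")
    A = [[Fraction(v) for v in row] for row in A]
    B = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n):                # Horner-style accumulation
        B = [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)]
             for i in range(n)]
        for i in range(n):
            B[i][i] += c[k]
    return [[-v / c[n] for v in row] for row in B]


A = [[3, 0, 2], [4, 1, 4], [2, 0, 3]]
print([int(v) for v in char_poly(A)])    # [1, -7, 11, -5]
print(inverse_via_cayley_hamilton(A)[0])
# [Fraction(3, 5), Fraction(0, 1), Fraction(-2, 5)]
```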



6.5.2 EIGENVECTORS AND DIAGONALIZATION 

Definitions: 

Let $\lambda$ be an eigenvalue of the $n \times n$ (complex) matrix $A$. The algebraic multiplicity
of $\lambda$ is its multiplicity as a root of the characteristic polynomial.

The eigenspace of $A$ corresponding to $\lambda$ is the vector space $\{ x \in \mathcal{C}^n \mid Ax = \lambda x \}$.

The geometric multiplicity of $\lambda$ is the dimension of the eigenspace of $A$ corresponding
to $\lambda$.

The square matrix $A$ is diagonalizable if there exists a nonsingular matrix $P$ such
that $P^{-1}AP$ is a diagonal matrix.

The minimal polynomial of the square matrix $A$ is the monic polynomial $q(\cdot)$ of
minimum degree such that $q(A) = 0$.

The square matrix $A$ is normal if $AA^* = A^*A$.

The singular values of an $n \times n$ matrix $A$ are the (positive) square roots of the
eigenvalues of $AA^*$, written $\sigma_1(A) \leq \sigma_2(A) \leq \cdots \leq \sigma_n(A)$.

A row stochastic matrix is a matrix with all entries nonnegative and row sums 1. 

Facts: 

1. The eigenspace corresponding to $\lambda$ is a subspace of the vector space $\mathcal{C}^n$. Specifically,
it is the null space (§6.1.2) of the matrix $A - \lambda I$.

2. Eigenvectors corresponding to distinct eigenvalues are linearly independent.

3. If $\lambda, \mu$ are distinct eigenvalues of $A$ and if $Ax = \lambda x$ and $A^*y = \bar{\mu}y$, then $x, y$ are
orthogonal.

4. The algebraic multiplicity is never less than the geometric multiplicity, but some- 
times it is greater. (See Example 3.) 

5. The minimal polynomial is unique. 

6. If A can be diagonalized to a diagonal matrix D, then the eigenvalues of A appear 
along the diagonal of D. 

7. The following conditions are equivalent for an n x n matrix A: 

• A is diagonalizable; 

• A has n linearly independent eigenvectors; 

• the minimal polynomial of A has distinct linear factors; 

• the algebraic multiplicity of each eigenvalue of A equals its geometric multiplicity. 

8. If the n x n matrix A has n distinct eigenvalues then A is diagonalizable. 

9. If the $n \times n$ matrix $A$ has $n$ linearly independent eigenvectors $v_1, v_2, \ldots, v_n$ then $A$
is diagonalizable using the matrix $P$ whose columns are the vectors $v_1, v_2, \ldots, v_n$.

10. Hermitian, skew-Hermitian and unitary matrices are normal matrices.

11. Spectral theorem for normal matrices: If $A$ is an $n \times n$ normal matrix, then it
can be diagonalized by a unitary matrix. That is, there exists an $n \times n$ unitary matrix
$U$ such that $U^*AU = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$, the diagonal matrix with the eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_n$ of $A$ along its diagonal.

12. If $A$ is normal, then it has a spectral decomposition $A = \sum_{i=1}^{n} \lambda_i u_i u_i^*$, where
$\{u_1, u_2, \ldots, u_n\}$ is an orthonormal basis for $\mathcal{C}^n$.

13. Diagonalization results for special types of normal matrices are given in the
following table:

matrix A               eigenvalues          diagonalization result

Hermitian              real                 Fact 11
real symmetric         real                 there exists a real orthogonal $P$ such
                                            that $P^TAP$ is diagonal
skew-Hermitian         purely imaginary     Fact 11
real skew-symmetric    purely imaginary     there exists a real orthogonal $Q$ such
                                            that $Q^TAQ$ is a direct sum of matrices,
                                            each of which is a 2 x 2 real
                                            skew-symmetric or null matrix
unitary                all with modulus 1   Fact 11

14. If A, B are normal and commute, they can be simultaneously diagonalized. Namely, 
there exists a unitary U such that U* AU and U*BU are both diagonal. 

15. For any square matrix A, the rank of A is never less than the number of nonzero 
eigenvalues (counting multiplicities) of A. 

16. If A is normal then its rank equals the number of nonzero eigenvalues. 

17. Schur's triangularization theorem: If A is a square matrix, then there exists a 
unitary U such that U* AU is upper triangular with the eigenvalues of A on its diagonal. 


18. If A, B are square matrices that commute then there exists a unitary U such that 
U* AU and U*BU are both upper triangular. 


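19. Jordan canonical form: Let A be an n x n matrix with distinct eigenvalues
$\lambda_1, \lambda_2, \ldots, \lambda_k$ having (algebraic) multiplicities $r_1, r_2, \ldots, r_k$ respectively. Then there
exists a nonsingular matrix P such that $P^{-1}AP = \mathrm{diag}(\Lambda_1, \Lambda_2, \ldots, \Lambda_k)$, where

\[
\Lambda_i = \begin{pmatrix}
\lambda_i & * & 0 & \cdots & 0 & 0 \\
0 & \lambda_i & * & \cdots & 0 & 0 \\
\vdots & & \ddots & \ddots & & \vdots \\
0 & 0 & 0 & \cdots & \lambda_i & * \\
0 & 0 & 0 & \cdots & 0 & \lambda_i
\end{pmatrix}
\]

is an $r_i \times r_i$ matrix and each * is either 0 or 1. Furthermore, the number of 1s is $r_i$
minus the geometric multiplicity of $\lambda_i$.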


20. The rank of a square matrix equals the number of nonzero singular values. 


21. If A is a square matrix and if U and V are unitary, then A and UAV have the 
same singular values. 
22. Singular value decomposition: If A is an n x n matrix then there exist n x n
unitary matrices U, V such that UAV is diagonal with $\sigma_1(A), \sigma_2(A), \ldots, \sigma_n(A)$ on the
diagonal.

23. QR factorization: If A is an n x n matrix then there exists a unitary matrix Q 
and an upper triangular matrix R such that A = QR. 

24. The QR factorization of a matrix can be calculated using Gram-Schmidt orthogo- 
nalization (§6.1.4). 


Examples: 

1. Let x, y be vectors of size n x 1 and let $A = xy^T$. Then the eigenvalues of A are
given by (see §6.5.1, Table 1) $y^T x$ and 0, the latter with multiplicity n - 1.

2. The matrix of §6.5.1 Example 3 has the characteristic polynomial
$p_A(\lambda) = \lambda^3 - 7\lambda^2 + 11\lambda - 5 = (\lambda - 1)^2(\lambda - 5)$. The eigenvalues are $\lambda = 1$ with algebraic multiplicity 2
and $\lambda = 5$ with algebraic multiplicity 1.

For $\lambda = 1$ the eigenspace is the null space of

\[
A - \lambda I = \begin{pmatrix} 2 & 0 & 2 \\ 4 & 0 & 4 \\ 2 & 0 & 2 \end{pmatrix}.
\]

It consists of all vectors of the form $(a, b, -a)^T$ and so is spanned by the linearly independent
eigenvectors $(1, 0, -1)^T$ and $(0, 1, 0)^T$. Thus the geometric multiplicity of $\lambda = 1$ is 2,
the same as its algebraic multiplicity.

The eigenvalue $\lambda = 5$ has the eigenvector $(1, 2, 1)^T$ (see §6.5.1 Example 3), linearly
independent of the previous two eigenvectors (Fact 2). If

\[
P = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ -1 & 0 & 1 \end{pmatrix}
\]

is the matrix containing these eigenvectors then $P^{-1}AP = \mathrm{diag}(1, 1, 5)$, thereby diagonalizing
A (Fact 9).
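The diagonalization in Example 2 can be checked with a few lines of code. The sketch below is an illustration, not part of the handbook: it assumes that A is recovered by adding the identity to the displayed matrix A - I, and it verifies AP = PD, which is equivalent to $P^{-1}AP = D$ because P is nonsingular. The helper name `matmul` is made up for this example.

```python
# Sketch: check AP = PD for Example 2 (equivalent to P^{-1} A P = D).
# Assumption: A is obtained by adding the identity to the displayed A - I.

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[3, 0, 2],
     [4, 1, 4],
     [2, 0, 3]]
# Columns of P are the eigenvectors (1,0,-1), (0,1,0), (1,2,1).
P = [[1, 0, 1],
     [0, 1, 2],
     [-1, 0, 1]]
D = [[1, 0, 0],
     [0, 1, 0],
     [0, 0, 5]]

AP = matmul(A, P)
PD = matmul(P, D)
print(AP == PD)
```

Working with AP = PD avoids computing an inverse, so the check stays in exact integer arithmetic.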


3. By using Maple, the characteristic polynomial of the matrix

\[
A = \begin{pmatrix}
7 & 2 & 4 & 0 & 3 \\
0 & 6 & 0 & 0 & 0 \\
0 & -2 & 4 & 0 & 0 \\
3 & 2 & 4 & 4 & 3 \\
3 & 0 & 2 & 0 & 7
\end{pmatrix}
\]

is found to be $\lambda^5 - 28\lambda^4 + 300\lambda^3 - 1552\lambda^2 + 3904\lambda - 3840 = (\lambda - 4)^3(\lambda - 6)(\lambda - 10)$, so
the eigenvalue $\lambda = 4$ of A has algebraic multiplicity 3. The eigenspace for $\lambda = 4$ is the
null space of

\[
A - 4I = \begin{pmatrix}
3 & 2 & 4 & 0 & 3 \\
0 & 2 & 0 & 0 & 0 \\
0 & -2 & 0 & 0 & 0 \\
3 & 2 & 4 & 0 & 3 \\
3 & 0 & 2 & 0 & 3
\end{pmatrix},
\]

which is spanned by $(1, 0, 0, 0, -1)^T$ and $(0, 0, 0, 1, 0)^T$. So $\lambda = 4$ has geometric
multiplicity 2. By Fact 7, A is not diagonalizable. The minimal polynomial of A is
$(\lambda - 4)^2(\lambda - 6)(\lambda - 10)$, which has the repeated linear factor $\lambda - 4$.


4. The conclusion of Fact 16 need not hold if A is not normal. For example, the matrix
$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$ has rank 1 but has no nonzero eigenvalues.
5. Matrix powers: Let A be the 2 x 2 row stochastic matrix with eigenvalues $\lambda = 1$ and $\lambda = a$,
where $-1 < a < 0$, and corresponding eigenvectors $(1, 1)^T$ and $(2, -3)^T$. The powers $A^n$ of such
matrices are important in the analysis of Markov chains (§7.7). Thus $P^{-1}AP = D = \mathrm{diag}(1, a)$,
where $P = \begin{pmatrix} 1 & 2 \\ 1 & -3 \end{pmatrix}$. Consequently
$A = PDP^{-1}$, $A^2 = PDP^{-1}PDP^{-1} = PD^2P^{-1}$, and in general $A^n = PD^nP^{-1}$.
Since $D^n = \mathrm{diag}(1^n, a^n) = \mathrm{diag}(1, a^n)$, the nth power of A can be computed as

\[
A^n = \frac{1}{5}\begin{pmatrix} 3 + 2a^n & 2 - 2a^n \\ 3 - 3a^n & 2 + 3a^n \end{pmatrix}.
\]

Since $|a| < 1$, $A^n \to \frac{1}{5}\begin{pmatrix} 3 & 2 \\ 3 & 2 \end{pmatrix}$ as $n \to \infty$.


6.5.3 LOCALIZATION 

Since analytic computation of eigenvalues can be complicated, there are several simple 
methods available for (geometrically) estimating the eigenvalues of a matrix. These 
methods can be informative in cases when only the approximate location of eigenvalues 
is needed. 

Definitions: 

The spectral radius of A, $\rho(A)$, is the maximum modulus of an eigenvalue of A.

Let A be an n x n matrix and let $\alpha_i = \sum_{j \ne i} |a_{ij}|$, i = 1, 2, ..., n.

The Gersgorin discs associated with A are the discs

\[ \{\, z \in \mathbb{C} : |z - a_{ii}| \le \alpha_i \,\}, \quad i = 1, 2, \ldots, n. \]

The ovals of Cassini associated with A are the regions

\[ \{\, z \in \mathbb{C} : |z - a_{ii}|\,|z - a_{jj}| \le \alpha_i \alpha_j \,\}, \quad i \ne j. \]

A strictly diagonally dominant matrix is a square matrix A satisfying $|a_{ii}| > \alpha_i$ for
i = 1, 2, ..., n.

Facts: 

1. $\rho(A)$ is the radius of the smallest disc, centered at the origin of the complex plane,
enclosing all of the eigenvalues of A.

2. $\rho(A) \le \min \{\, \max_i \sum_j |a_{ij}|, \; \max_j \sum_i |a_{ij}| \,\}$.

3. The spectral radius of a row stochastic matrix is 1. 

4. All the eigenvalues of A are contained in the union of the associated Gersgorin discs. 

5. A connected region formed by precisely $k \le n$ Gersgorin discs contains exactly k
eigenvalues of A.

6. All the eigenvalues of A are contained in the union of the $n(n-1)/2$ ovals of Cassini
associated with A.

Examples:

1. By Fact 2, the spectral radius of the symmetric matrix 



V 1 -1 1 8/ 


is bounded by the maximum absolute row (column) sum 13. Since the eigenvalues of 
a real symmetric matrix are real, the spectral radius bound gives the interval [—13, 13] 
enclosing all eigenvalues. The Gersgorin discs are the intervals 

8 ± 4 = [4, 12], -8 ± 5 = [-13, -3], 7 ± 4 = [3, 11], and 8 ± 3 = [5, 11].

The second interval is disjoint from the others, so one eigenvalue is localized in the 
interval [—13, —3] while the other three are in the interval [3, 12]. The actual eigenvalues 
of A are (approximately) —8.51,6.31,7.03,10.2, consistent with the above intervals. 
Also, 0 is not in any of the four Gersgorin discs so 0 is not an eigenvalue and A is 
nonsingular. Since the eigenvalues of $A^{-1}$ are the reciprocals of the eigenvalues of
A (Table 1, §6.5.1), it follows that the eigenvalues of the symmetric matrix $A^{-1}$ are
localized to the intervals $[-\frac{1}{3}, -\frac{1}{13}]$ and $[\frac{1}{12}, \frac{1}{3}]$.

2. Using Fact 4, the eigenvalues of the matrix

\[
A = \begin{pmatrix} 2 & 1 & 1 \\ 0 & 6 & 2 \\ 1 & -1 & 8 \end{pmatrix}
\]

are located in the union of the discs

\[ D_1 = \{ z : |z - 2| \le 2 \}, \quad D_2 = \{ z : |z - 6| \le 2 \}, \quad D_3 = \{ z : |z - 8| \le 2 \}. \]

Since A and $A^T$ have the same eigenvalues, an alternative set of discs can be formed
based on the absolute column sums of A: namely

\[ D_1 = \{ z : |z - 2| \le 1 \}, \quad D_2 = \{ z : |z - 6| \le 2 \}, \quad D_3 = \{ z : |z - 8| \le 3 \}. \]

Here $D_1$ is disjoint from both $D_2$ and $D_3$, and so one eigenvalue of A is localized to $D_1$,
and the other two to $D_2 \cup D_3$. In fact, the eigenvalues of A are 2.24 and 6.88 ± 0.91i,
approximately.

3. The 2 x 2 row stochastic matrix A of §6.5.2 Example 5 has Gersgorin discs
$D_1 = \{ z : |z - a_{11}| \le a_{12} \}$ and $D_2 = \{ z : |z - a_{22}| \le a_{21} \}$.
Since $D_1 \subseteq D_2$, all eigenvalues must lie in $D_2$. As seen in §6.5.2 Example 5, the
eigenvalues of A are 1 and a negative number of modulus less than 1.

4. Suppose A is strictly diagonally dominant. Then no Gersgorin disc for A contains 0,
so 0 is not an eigenvalue and A must be nonsingular. If in addition the diagonal entries
of A are real and positive, then all Gersgorin discs for A lie in the open right half-plane,
so all the eigenvalues must have positive real part.

5. If the n x n matrix A satisfies $|a_{ii}|\,|a_{jj}| > \alpha_i \alpha_j$ for all $i \ne j$ then A must be nonsingular,
since by Fact 6 zero is not an eigenvalue of A. The matrix of Example 2 satisfies this
condition since $|a_{ii}|\,|a_{jj}| \ge 12 > 4 = \alpha_i \alpha_j$ for all $i \ne j$, and so it must be nonsingular.
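The disc computations of Example 2 are easy to automate. The following is a minimal sketch under stated assumptions: the function names `gersgorin_discs`, `transpose`, and `disjoint` are hypothetical, and each disc is represented as a (center, radius) pair. Column-based discs are obtained by applying the row computation to the transpose.

```python
# Sketch: Gersgorin discs from rows (Fact 4) and from columns (via A^T).

def gersgorin_discs(A):
    """Return one (center, radius) pair per row of the square matrix A."""
    n = len(A)
    return [(A[i][i], sum(abs(A[i][j]) for j in range(n) if j != i))
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def disjoint(d1, d2):
    """Closed discs are disjoint when the centers are farther apart than r1 + r2."""
    return abs(d1[0] - d2[0]) > d1[1] + d2[1]

A = [[2, 1, 1],
     [0, 6, 2],
     [1, -1, 8]]

row_discs = gersgorin_discs(A)
col_discs = gersgorin_discs(transpose(A))
print(row_discs)
print(col_discs)
print(disjoint(col_discs[0], col_discs[1]), disjoint(col_discs[0], col_discs[2]))
```

The row discs centered at 2 and 6 touch, which is why the example switches to the column discs to isolate one eigenvalue.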


6.5.4 COMPUTATION OF EIGENVALUES 

The eigenvalues of a matrix can be obtained, in theory, by forming the characteristic 
equation and finding its roots. Since this is not a practical solution method for problems 
of realistic size, a variety of iterative techniques have been developed. 
Algorithm 1: Power method.

input: n x n nonsingular matrix A

output: approximations x_k to an eigenvector of A

{Initialization}

choose any vector x_0 ∈ C^n with ||x_0|| = 1

{Iterative step}
for k := 1 to . . . do

    x_k := A x_{k-1} / ||A x_{k-1}||


Algorithm 2: QR method.

input: n x n matrix A
output: n x n matrices A_k

{Initialization}

A := Q_0 R_0 (a QR factorization of A)
{Iterative step}

for k := 1 to . . . do

    A_k := R_{k-1} Q_{k-1}

    obtain a QR factorization A_k = Q_k R_k


Definitions: 

A dominant eigenvalue of a matrix is an eigenvalue with the maximum modulus. 

Let $U(\theta; i, j)$ be the n x n matrix obtained by replacing the 2 x 2 principal submatrix
of the identity matrix, corresponding to rows i and j, with the rotation matrix

\[ \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}. \]


Facts: 

1. Power method : The power method (Algorithm 1) is a simple technique for finding 
the dominant eigenvalue and an associated eigenvector of a nonsingular matrix A having 
a unique dominant eigenvalue. 

2. In Algorithm 1, the kth estimate is $x_k = \dfrac{A^k x_0}{\|A^k x_0\|}$.

3. The sequence $x_k$ converges to an eigenvector of A.

4. The sequence $\|Ax_k\|$ approaches the dominant eigenvalue.

5. The power method is best suited for large sparse matrices. 

6. The rate of convergence of the power method is dictated by the ratio of the largest 
to the second largest (in modulus) eigenvalue of A. The larger this ratio (the more 
separated these two eigenvalues in modulus), the faster the convergence of the method. 

7. QR method: This method (Algorithm 2) calculates the eigenvalues of a given n x n
matrix A.

Algorithm 3: Jacobi method.

input: n x n real symmetric matrix A
output: n x n matrices A_k

{Initialization}

A_1 = (a_ij^(1)) := A
{Iterative step}

for k := 1 to . . . do

    choose r, s (r < s) with |a_rs^(k)| as large as possible

    define θ by cot 2θ = (a_ss^(k) - a_rr^(k)) / (2 a_rs^(k))

    A_{k+1} = (a_ij^(k+1)) := U(θ; r, s)^T A_k U(θ; r, s)


8. The QR factorization in Algorithm 2 produces a unitary matrix $Q_k$ and an upper
triangular matrix $R_k$. (See §6.5.2 Fact 23.)

9. Under certain conditions (for example, if the eigenvalues of A have distinct moduli)
the sequence $A_k$ in Algorithm 2 converges to an upper triangular matrix whose diagonal
entries are the eigenvalues of A.

10. If A is real then its QR factors are real and can be calculated using real arithmetic.
In this case, if A has nonreal eigenvalues then under certain conditions, the limiting
matrix is block triangular with 1 x 1 and 2 x 2 diagonal blocks.

11. The QR method is not well suited for large sparse matrices since the factors Q, R
can quickly fill with nonzeros.

12. Often as a preparatory step for the QR method the matrix is first reduced to 
Hessenberg form (upper triangular form in which there may be one nonzero diagonal 
below the main diagonal) by using Householder transformations [Da95]. 

13. The convergence of the QR method can be very slow if the matrix has two eigen- 
values that are close in moduli. 

14. More effective versions of the QR method are available which make use of certain 
shift strategies [GoVa96] . 

15. Jacobi method : This method (Algorithm 3) finds the eigenvalues of a real sym- 
metric n x n matrix A having at least one nonzero off-diagonal entry. 

16. The sequence $A_k$ in Algorithm 3 converges to a real diagonal matrix with the
eigenvalues of A on the diagonal.

17. The orthogonal matrix $U(\theta; r, s)$ represents a (clockwise) plane rotation by the
angle $\theta$.

18. The Jacobi method is particularly appropriate when A is nearly diagonal, although 
in general the QR method exhibits faster convergence. 

19. A variant of the Jacobi method, the serial Jacobi method, uses plane rotation pairs
cyclically: for example, (1, 2), (1, 3), ..., (1, n), (2, 3), ..., (2, n), ..., (n - 1, n).

20. For further information on numerical computation of eigenvalues, see [GoVa96, 
Da95]. 
21. A number of robust and well-tested software packages are available for carrying
out eigensystem analysis, including:

• EISPACK: a collection of Fortran routines for analyzing eigenvalues and
eigenvectors of several classes of matrices; see
http://www.netlib.org

• LAPACK/CLAPACK: supersedes EISPACK, contains Fortran and C routines
for dense and banded problems, ideal for shared-memory vector and parallel
processors; see
http://www.netlib.org

• NAG: Fortran and C libraries for eigenanalysis of dense and sparse matrices; see
http://www.nag.com

• IMSL: contains Fortran and C libraries for eigenanalysis of dense and banded
problems; see
http://www.vni.com/products/imsl

• MATLAB: high-level language for eigenanalysis of dense and sparse matrices,
calculation of characteristic polynomials; see
http://www.mathworks.com


Examples: 

1. The power method, when applied to the matrix in Example 3 of §6.5.1, produces
the following sequence of vectors $x_k$ and scalars $\|Ax_k\|$:

k      x_k                           ||A x_k||

0      (1, 0, 0)^T                   5.385
1      (0.557, 0.743, 0.371)^T       5.537
2      (0.436, 0.805, 0.403)^T       5.107
3      (0.414, 0.814, 0.407)^T       5.021
4      (0.409, 0.816, 0.408)^T       5.004
5      (0.409, 0.816, 0.408)^T       5.001

The scalars $\|Ax_k\|$ approach the dominant eigenvalue 5 and the vectors $x_k$ approach a
multiple of the eigenvector $(1, 2, 1)^T$. (See Example 3, §6.5.1.)
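The table in Example 1 can be reproduced with a direct transcription of Algorithm 1. This is a minimal sketch, not a production routine: the function name `power_method` is hypothetical, the Euclidean norm is assumed, and the matrix is assumed to be the one whose A - I is displayed in §6.5.2 Example 2.

```python
import math

def power_method(A, x0, steps):
    """Run Algorithm 1; return the iterates x_0..x_steps and the scalars ||A x_k||."""
    def apply(v):
        return [sum(a * vi for a, vi in zip(row, v)) for row in A]

    def norm(v):
        return math.sqrt(sum(vi * vi for vi in v))

    xs, norms = [list(x0)], []
    for _ in range(steps):
        y = apply(xs[-1])
        r = norm(y)
        norms.append(r)                      # ||A x_{k-1}||
        xs.append([yi / r for yi in y])      # x_k = A x_{k-1} / ||A x_{k-1}||
    norms.append(norm(apply(xs[-1])))        # ||A x_k|| for the final iterate
    return xs, norms

# Assumed matrix (identity added to the displayed A - I of Section 6.5.2, Example 2).
A = [[3.0, 0.0, 2.0],
     [4.0, 1.0, 4.0],
     [2.0, 0.0, 3.0]]

xs, norms = power_method(A, [1.0, 0.0, 0.0], 5)
for k, (x, r) in enumerate(zip(xs, norms)):
    print(k, [round(c, 3) for c in x], round(r, 3))
```

Only matrix-vector products are needed, which is why the method suits large sparse matrices (Fact 5).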


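2. The eigenvalues of the matrix $A = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix}$ can be approximated using the QR
method. $A = Q_0R_0$ with

\[
Q_0 = \begin{pmatrix} -0.447 & -0.894 \\ -0.894 & 0.447 \end{pmatrix}
\quad \text{and} \quad
R_0 = \begin{pmatrix} -2.236 & -4.472 \\ 0 & -2.236 \end{pmatrix}.
\]

Then $A_1 = R_0Q_0 = \begin{pmatrix} 5 & 0 \\ 2 & -1 \end{pmatrix}$. Continuing this process produces

\[
A_2 = \begin{pmatrix} 4.862 & 2.345 \\ 0.345 & -0.862 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 5.023 & -1.927 \\ 0.073 & -1.023 \end{pmatrix}, \quad
A_4 = \begin{pmatrix} 4.995 & 2.014 \\ 0.014 & -0.995 \end{pmatrix}.
\]

The sequence $A_k$ approaches an upper triangular matrix with the eigenvalues 5 and -1
on its diagonal.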
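The iteration of Example 2 can be sketched as follows. This is an illustration under stated assumptions: the QR factorization is computed by classical Gram-Schmidt (§6.5.2 Fact 24), which normalizes R to have a positive diagonal, so the signs of off-diagonal entries may differ from those printed above while the diagonal entries agree. The function names are hypothetical.

```python
import math

def qr_gram_schmidt(A):
    """QR factorization of a square nonsingular matrix by classical Gram-Schmidt."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]
    q_cols, R = [], [[0.0] * n for _ in range(n)]
    for j, a in enumerate(cols):
        u = list(a)
        for i, q in enumerate(q_cols):
            R[i][j] = sum(qk * ak for qk, ak in zip(q, a))
            u = [uk - R[i][j] * qk for uk, qk in zip(u, q)]
        R[j][j] = math.sqrt(sum(uk * uk for uk in u))
        q_cols.append([uk / R[j][j] for uk in u])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def qr_method(A, steps):
    """Return [A_1, ..., A_steps] as produced by Algorithm 2."""
    Q, R = qr_gram_schmidt(A)
    iterates = []
    for _ in range(steps):
        Ak = matmul(R, Q)                 # A_k := R_{k-1} Q_{k-1}
        iterates.append(Ak)
        Q, R = qr_gram_schmidt(Ak)        # A_k = Q_k R_k
    return iterates

A = [[1.0, 4.0], [2.0, 3.0]]
iterates = qr_method(A, 4)
for k, Ak in enumerate(iterates, start=1):
    print(k, [[round(v, 3) for v in row] for row in Ak])
```

Gram-Schmidt is used here only because it is short; practical codes use Householder transformations and shifts (Facts 12 and 14).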

3. The eigenvalues of the matrix $A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 4 & -3 \\ 1 & -3 & 4 \end{pmatrix}$ can be approximated using the
Jacobi method. The largest off-diagonal $|a_{rs}|$ of $A_1 = A$ occurs for r = 2, s = 3, giving
$\theta = \frac{\pi}{4} = 0.7854$. Applying the matrix

\[
U(\theta; 2, 3) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0.7071 & 0.7071 \\ 0 & -0.7071 & 0.7071 \end{pmatrix}
\]

produces

\[
A_2 = U(\theta; 2, 3)^T A_1 U(\theta; 2, 3) = \begin{pmatrix} 0 & 0 & 1.4142 \\ 0 & 7 & 0 \\ 1.4142 & 0 & 1 \end{pmatrix}.
\]

The largest magnitude off-diagonal entry of $A_2$ is $|a_{13}|$, giving $\theta = 0.6155$,

\[
U(\theta; 1, 3) = \begin{pmatrix} 0.8165 & 0 & 0.5774 \\ 0 & 1 & 0 \\ -0.5774 & 0 & 0.8165 \end{pmatrix}
\quad \text{and} \quad
A_3 = U(\theta; 1, 3)^T A_2 U(\theta; 1, 3) = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 2 \end{pmatrix}.
\]

So the eigenvalues of A are -1, 2, 7.
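The two rotations of Example 3 can be replayed in code. The sketch below makes one assumption explicit: with the rotation matrix $U(\theta; r, s)$ as defined in this section, the (r, s) entry of $U^T A U$ vanishes when $\cot 2\theta = (a_{ss} - a_{rr})/(2a_{rs})$, and $2\theta$ is taken in $(0, \pi)$, which reproduces the angles 0.7854 and 0.6155 quoted above. The function name `jacobi_step` is hypothetical.

```python
import math

def jacobi_step(A):
    """One Jacobi rotation; returns (theta, r, s, U^T A U) with 0-based r < s."""
    n = len(A)
    # Pick the off-diagonal entry of largest magnitude.
    r, s = max(((i, j) for i in range(n) for j in range(i + 1, n)),
               key=lambda p: abs(A[p[0]][p[1]]))
    # cot(2 theta) = (a_ss - a_rr) / (2 a_rs), with 2 theta taken in (0, pi).
    cot2 = (A[s][s] - A[r][r]) / (2.0 * A[r][s])
    theta = 0.5 * math.atan2(1.0, cot2)
    c, sn = math.cos(theta), math.sin(theta)
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    U[r][r], U[r][s], U[s][r], U[s][s] = c, sn, -sn, c

    def mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n))
                 for j in range(n)] for i in range(n)]

    UT = [list(col) for col in zip(*U)]
    return theta, r, s, mul(mul(UT, A), U)

A1 = [[0.0, 1.0, 1.0],
      [1.0, 4.0, -3.0],
      [1.0, -3.0, 4.0]]
t1, r1, s1, A2 = jacobi_step(A1)
t2, r2, s2, A3 = jacobi_step(A2)
print(round(t1, 4), (r1 + 1, s1 + 1))
print(round(t2, 4), (r2 + 1, s2 + 1))
print([round(A3[i][i], 6) for i in range(3)])
```

For this matrix two rotations already diagonalize A; in general the off-diagonal entries only tend to zero (Fact 16).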


6.5.5 SPECIAL CLASSES 

This section discusses eigenvalues and eigenvectors of specially structured matrices, such 
as Hermitian, positive definite, nonnegative, totally positive, and circulant matrices. 


Definitions: 

If $x, y \in \mathbb{R}^n$, then x majorizes y if $\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} y_i$ and for k = 1, 2, ..., n - 1 the
sum of the k largest components of x is at least as large as the sum of the k largest
components of y. A similar definition holds for infinite sequences with finitely many
nonzero terms.

A Hermitian n x n matrix A is positive definite if $x^*Ax > 0$ for all nonzero $x \in \mathbb{C}^n$.
It is positive semidefinite if $x^*Ax \ge 0$ for all $x \in \mathbb{C}^n$.

If A and B are n x n Hermitian matrices then A dominates B in Löwner order if
A - B is positive semidefinite, written $A \succeq B$.

A matrix is nonnegative [positive] if each of its entries is nonnegative [positive].

The n x n matrix A is reducible if either it is the 1 x 1 zero matrix or there exists a
permutation matrix P such that $PAP^T$ is of the form $\begin{pmatrix} B & 0 \\ C & D \end{pmatrix}$, where B and D are
square. A matrix is irreducible if it is not reducible.


A strictly totally positive matrix has all of its minors positive. 


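A circulant matrix has the form

\[
\begin{pmatrix}
a_0 & a_1 & a_2 & \cdots & a_n \\
a_n & a_0 & a_1 & \cdots & a_{n-1} \\
a_{n-1} & a_n & a_0 & \cdots & a_{n-2} \\
\vdots & \vdots & \vdots & & \vdots \\
a_1 & a_2 & a_3 & \cdots & a_0
\end{pmatrix}.
\]

Notation: Let $\lambda_1(A) \le \lambda_2(A) \le \cdots \le \lambda_n(A)$ be the eigenvalues of an n x n Hermitian
matrix A.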


Facts: 

1. Cauchy interlacing theorem: Let A be an n x n Hermitian matrix and let B be a
principal submatrix of A of order n - 1. Then

\[ \lambda_i(A) \le \lambda_i(B) \le \lambda_{i+1}(A), \quad i = 1, 2, \ldots, n - 1. \]

2. Weyl's theorem: Let A, B be n x n Hermitian matrices and let j, k be integers
satisfying $1 \le j, k \le n$.

• If $j + k \ge n + 1$, then $\lambda_{j+k-n}(A + B) \le \lambda_j(A) + \lambda_k(B)$;

• If $j + k \le n + 1$, then $\lambda_j(A) + \lambda_k(B) \le \lambda_{j+k-1}(A + B)$.

3. Interpretations of the kth smallest eigenvalue of a Hermitian matrix are given in the
following table:

eigenvalue                  variational characterization

$\lambda_1(A)$                    min($x^*Ax$), minimum over all unit vectors x

$\lambda_n(A)$                    max($x^*Ax$), maximum over all unit vectors x

$\lambda_k(A)$,                   min($x^*Ax$), minimum over all unit vectors x orthogonal
k = 2, ..., n               to the eigenspaces of $\lambda_1, \ldots, \lambda_{k-1}$

$\lambda_{n-k}(A)$,               max($x^*Ax$), maximum over all unit vectors x orthogonal
k = 1, ..., n - 1           to the eigenspaces of $\lambda_{n-k+1}, \ldots, \lambda_n$


4. Schur's majorization theorem: If A is an n x n Hermitian matrix, then $(\lambda_1(A), \lambda_2(A),
\ldots, \lambda_n(A))$ majorizes $(a_{11}, a_{22}, \ldots, a_{nn})$. Specifically, if $a_{11} \ge a_{22} \ge \cdots \ge a_{nn}$ then

\[ \sum_{i=1}^{k} \lambda_{n-i+1}(A) \ge \sum_{i=1}^{k} a_{ii}, \quad k = 1, 2, \ldots, n. \]

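5. Hoffman-Wielandt theorem: If A, B are n x n Hermitian matrices, then

\[ \sum_{i=1}^{n} \bigl(\lambda_i(A) - \lambda_i(B)\bigr)^2 \le \|A - B\|_F^2, \]

where $\|\cdot\|_F$ denotes the Frobenius norm.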

6. Sylvester's law of inertia: If A is an n x n Hermitian matrix and if X is a nonsingular
n x n matrix, then A and $X^*AX$ have the same number of positive eigenvalues as well
as the same number of negative eigenvalues.

7. A Hermitian matrix is positive definite (positive semidefinite) if and only if all its 
eigenvalues are positive (nonnegative). 

8. If A, B are n x n positive semidefinite matrices and $A \succeq B$, then $\lambda_i(A) \ge \lambda_i(B)$,
i = 1, 2, ..., n.

9. If A, B are n x n positive semidefinite matrices, then $\lambda_{i+j-n}(AB) \le \lambda_i(A)\lambda_j(B)$
holds for $1 \le i, j \le n$ and $i + j \ge n + 1$.

10. If A, B are n x n positive semidefinite matrices, then

\[ \prod_{i=1}^{k} \lambda_i(AB) \ge \prod_{i=1}^{k} \lambda_i(A)\lambda_i(B), \quad k = 1, 2, \ldots, n. \]

11. Kantorovich inequality: If A is an n x n positive definite matrix and if $x \in \mathbb{C}^n$ is
a unit vector, then

\[ (x^*Ax)(x^*A^{-1}x) \le \frac{(\lambda_1(A) + \lambda_n(A))^2}{4\lambda_1(A)\lambda_n(A)}. \]

12. Perron-Frobenius theorem: If A is an irreducible nonnegative square matrix, then 
the spectral radius of A (the Perron root of A) is an eigenvalue of A with algebraic 
multiplicity 1 and it has an associated positive eigenvector. If A is positive then the 
spectral radius exceeds the modulus of any other eigenvalue. 
13. If A is a nonnegative square matrix, then the spectral radius of A is an eigenvalue
of A and it has an associated nonnegative eigenvector.

14. Let A be an n x n strictly totally positive matrix. Then the eigenvalues of A are
distinct and positive: $\lambda_1(A) < \lambda_2(A) < \cdots < \lambda_n(A)$. The real eigenvector corresponding
to $\lambda_{n-k}$ has exactly k variations in sign.

15. If A is an n x n strictly totally positive matrix, then $(\lambda_1(A), \lambda_2(A), \ldots, \lambda_n(A))$
majorizes $(a_{11}, a_{22}, \ldots, a_{nn})$.

16. An (n + 1) x (n + 1) circulant matrix has eigenvalues $\lambda_j = a_0 + a_1\omega^{j} + a_2\omega^{2j} +
\cdots + a_n\omega^{nj}$, j = 0, 1, ..., n, with $(1, \omega^{j}, \omega^{2j}, \ldots, \omega^{nj})$, j = 0, 1, ..., n, the corresponding
eigenvectors, where $\omega = e^{2\pi i/(n+1)}$.
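The circulant eigenvalue formula can be verified numerically. The following sketch assumes the row layout of the circulant definition in this section (each row is the previous row shifted one place to the right); the function names and the sample first row are hypothetical choices for illustration.

```python
import cmath

def circulant(a):
    """Circulant matrix with first row a_0, ..., a_n; each row shifts the previous one right."""
    m = len(a)
    return [[a[(j - i) % m] for j in range(m)] for i in range(m)]

def circulant_eigenpairs(a):
    """Eigenvalue/eigenvector pairs from the formula, with omega = exp(2 pi i / (n + 1))."""
    m = len(a)
    omega = cmath.exp(2j * cmath.pi / m)
    pairs = []
    for j in range(m):
        lam = sum(a[k] * omega ** (k * j) for k in range(m))
        vec = [omega ** (k * j) for k in range(m)]
        pairs.append((lam, vec))
    return pairs

a = [1, 2, 3, 4]                      # illustrative first row (n = 3)
C = circulant(a)
residuals = []
for lam, v in circulant_eigenpairs(a):
    Cv = [sum(C[i][k] * v[k] for k in range(len(a))) for i in range(len(a))]
    residuals.append(max(abs(Cv[i] - lam * v[i]) for i in range(len(a))))
print(max(residuals))                 # close to zero if C v = lambda v holds
```

The eigenvectors do not depend on the entries of the matrix, so all circulant matrices of a given order are diagonalized by the same matrix.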


Examples: 


1. The matrix $A = \begin{pmatrix} 0 & 1 & 1 \\ 1 & 4 & -3 \\ 1 & -3 & 4 \end{pmatrix}$ has eigenvalues -1, 2, and 7 (§6.5.4, Example 3).
The principal submatrix $A[1, 2] = \begin{pmatrix} 0 & 1 \\ 1 & 4 \end{pmatrix}$ has eigenvalues $2 \pm \sqrt{5}$, which are approximately
equal to -0.2361 and 4.2361. As required by Fact 1, these latter two eigenvalues
interlace those of A: $-1 \le -0.2361 \le 2 \le 4.2361 \le 7$. Similarly, the principal submatrix
$A[2, 3] = \begin{pmatrix} 4 & -3 \\ -3 & 4 \end{pmatrix}$ has eigenvalues 1 and 7, which interlace those of A.

2. The matrix in Example 1 has the eigenvalue sequence (-1, 2, 7). This sequence
majorizes (see Fact 4) the sequence (0, 4, 4) of diagonal elements: $7 \ge 4$, $7 + 2 \ge 4 + 4$,
and $7 + 2 - 1 = 4 + 4 + 0$.

3. The irreducible matrix A in §6.5.3 Example 3 is positive with eigenvalues 1 and a
negative number of modulus less than 1. Thus $\rho(A) = 1$ and it exceeds the modulus of any
other eigenvalue. As required by Fact 12, there is a positive eigenvector associated with
$\lambda = 1$, namely $(1, 1)^T$.

4. The matrix $A = \begin{pmatrix} 2 & 0 & 3 \\ 1 & 4 & 5 \\ 2 & 0 & 1 \end{pmatrix}$ is nonnegative with eigenvalues -1 (algebraic
multiplicity 1) and 4 (algebraic multiplicity 2). So the spectral radius is 4 and (see Fact 13)
$\lambda = 4$ must be an eigenvalue. In addition, there is a nonnegative eigenvector associated
with $\lambda = 4$, namely $(0, 1, 0)^T$.


6.6 COMBINATORIAL MATRIX THEORY 


Matrices and graphs represent two different ways of viewing certain discrete structures. 
At times a matrix perspective can lend insight into graphical or combinatorial structures. 
At other times the graph associated with a matrix can provide useful information about 
matrix properties. 
6.6.1 MATRICES OF ZEROS AND ONES


Definitions: 

A 0-1 matrix is a matrix with each entry either 0 or 1. 

The term rank of a 0-1 matrix is the maximum number of 1s such that no two are in
the same row or column.

An n x n 0-1 matrix is partly decomposable if it has a k x (n - k) zero submatrix
for some $1 \le k \le n - 1$; otherwise A is fully indecomposable.

Let $\{x_n\}$ be a sequence of nonnegative integers with finitely many nonzero terms. The
conjugate sequence of $\{x_n\}$ is the sequence $\{z_n\}$ in which $z_n$, $n \ge 1$, is the number
of terms in $\{x_n\}$ that are not less than n.

Facts: 

1. König's theorem: The term rank of a 0-1 matrix equals the minimum number of
rows and columns required to cover all 1s in the matrix.

2. Frobenius-König theorem: Let A be an n x n 0-1 matrix. Then the term rank of A
is less than n if and only if A has a zero submatrix of size r x s with r + s = n + 1.

3. Let A be an n x n 0-1 matrix each of whose row sums and column sums is k. Then A 
can be expressed as a sum of k permutation matrices (§6.4.3). 

4. Let A be a square 0-1 matrix and let B be the matrix obtained from A by replacing 
each 0 entry on the main diagonal of A by 1. Then A is irreducible (§6.5.5) if and only 
if B is fully indecomposable. 

5. Let A, B be n x n fully indecomposable matrices. Then the matrix obtained by 
replacing every nonzero entry in AB by 1 is fully indecomposable. 

6. Gale-Ryser theorem: Let $x_1, x_2, \ldots, x_m$; $y_1, y_2, \ldots, y_n$ be nonnegative integers and
let $\{z_n\}$ be the conjugate sequence of $x_1, x_2, \ldots, x_m, 0, 0, \ldots$. There exists an m x n 0-1
matrix with row sums $x_1, x_2, \ldots, x_m$ and column sums $y_1, y_2, \ldots, y_n$ if and only if $\{z_n\}$
majorizes $y_1, y_2, \ldots, y_n, 0, 0, \ldots$.


Examples: 


1. The following matrix contains a 2 x 4 zero submatrix, occurring in rows 1, 3 and
columns 1, 2, 4, 5. By Fact 2, this means that the matrix must have term rank less
than 5. In fact, the matrix has term rank 3. Namely, the starred entries represent a
set of 3 entries, no two of which are in the same row or column, and 3 is the largest
number with this property. Rows 2, 4 and column 3 cover all the 1s in the matrix, and
no smaller number suffices, as guaranteed by Fact 1.

\[
\begin{pmatrix}
0 & 0 & 1^* & 0 & 0 \\
1^* & 1 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 1^* \\
0 & 0 & 1 & 0 & 0
\end{pmatrix}
\]

2. The matrix

\[
\begin{pmatrix}
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 1 & 1 & 0
\end{pmatrix}
\]

has all row and column sums equal to 3. By Fact 3, it can be expressed as the sum of
three permutation matrices. For example

\[
\begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}
+
\begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}
+
\begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.
\]
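The term rank claimed in Example 1 can be computed as a maximum bipartite matching between rows and columns. The sketch below uses the standard augmenting-path method; it is one possible approach, not a procedure given in the text, and the function name `term_rank` is hypothetical.

```python
def term_rank(M):
    """Maximum number of 1s with no two in the same row or column (augmenting paths)."""
    rows, cols = len(M), len(M[0])
    match_col = [-1] * cols          # match_col[j] = row currently matched to column j

    def augment(i, seen):
        # Try to match row i, re-routing earlier rows when their column is needed.
        for j in range(cols):
            if M[i][j] == 1 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or augment(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    return sum(1 for i in range(rows) if augment(i, [False] * cols))

# Matrix of Example 1 (stars omitted).
M1 = [[0, 0, 1, 0, 0],
      [1, 1, 1, 0, 1],
      [0, 0, 1, 0, 0],
      [1, 0, 1, 1, 1],
      [0, 0, 1, 0, 0]]
# Matrix of Example 2.
M2 = [[0, 1, 1, 1],
      [1, 0, 1, 1],
      [1, 1, 0, 1],
      [1, 1, 1, 0]]
print(term_rank(M1), term_rank(M2))
```

A term rank equal to n, as for the second matrix, means that at least one permutation matrix fits inside the pattern of 1s, which is the first step in the decomposition of Fact 3.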


3. Assignment problem: There are n applicants for n vacant jobs. Each applicant is
qualified for exactly $k \ge 1$ jobs and for each job there are exactly k qualified applicants.
Is it possible to assign each applicant to a (distinct) job for which the applicant is
qualified? To answer this question form the 0-1 matrix A where $a_{ij} = 1$ if applicant i
is qualified for job j, otherwise $a_{ij} = 0$. All row and column sums of A equal k, so (by
Fact 3) A can be expressed as the sum of $k \ge 1$ permutation matrices. Select any one
of these permutation matrices and use it to define an assignment of applicants to jobs.
Thus it is possible in this case to fill each job with a different qualified applicant.

4. In the matrix

\[
\begin{pmatrix}
0 & 1 & 1 & 0 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 1 \\
1 & 0 & 1 & 0 & 1 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 1 & 1 & 1 & 0
\end{pmatrix}
\]

the conjugate sequence of the row sum sequence 6, 5, 4, 5, 5 is 5, 5, 5, 5, 4, 1, 0, 0, ... and
it majorizes the sequence 2, 2, 4, 3, 3, 5, 3, 3, 0, 0, ... obtained by appending zeros to the
sequence of column sums (Fact 6).
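The Gale-Ryser test of Fact 6 translates directly into code. This is a minimal sketch with hypothetical function names; sequences are finite lists, padded with zeros as needed, and majorization is checked on the components sorted in decreasing order.

```python
def conjugate(seq, length):
    """First `length` terms z_1, ..., z_length of the conjugate sequence of seq."""
    return [sum(1 for x in seq if x >= n) for n in range(1, length + 1)]

def majorizes(x, y):
    """True if x majorizes y (both padded with zeros to a common length)."""
    m = max(len(x), len(y))
    xs = sorted(x, reverse=True) + [0] * (m - len(x))
    ys = sorted(y, reverse=True) + [0] * (m - len(y))
    if sum(xs) != sum(ys):
        return False
    px = py = 0
    for a, b in zip(xs, ys):
        px, py = px + a, py + b
        if px < py:                  # a partial sum of the k largest terms fails
            return False
    return True

def gale_ryser(row_sums, col_sums):
    """True if a 0-1 matrix with these row and column sums exists (Fact 6)."""
    length = max(len(col_sums), max(row_sums, default=0))
    return majorizes(conjugate(row_sums, length), col_sums)

rows = [6, 5, 4, 5, 5]
cols = [2, 2, 4, 3, 3, 5, 3, 3]
print(conjugate(rows, 8))
print(gale_ryser(rows, cols))
```

Taking the conjugate out to the largest row sum guarantees that every nonzero term of $\{z_n\}$ is included in the comparison.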


6.6.2 NONNEGATIVE MATRICES 

This subsection discusses nonnegative matrices and special classes of nonnegative ma- 
trices such as primitive and doubly stochastic matrices. Certain results highlight the 
relationship between the Perron root (§6.5.5, Fact 12) and the directed graph of a ma- 
trix. 

Definitions: 

The directed graph G(A) of an n x n matrix A consists of n vertices 1, 2, ..., n with
an edge from i to j if and only if $a_{ij} \ne 0$.

Vertices i and j of G(A) are equivalent if i = j or if there is a path in G(A) from i to j
and a path from j to i. The corresponding equivalence classes (§1.4.2) of this relation
are the classes of A.

Class $C_i$ has access to class $C_j$ if i = j or if there is a path in G(A) from a vertex in $C_i$
to a vertex in $C_j$. A class is final if it has access to no other class. Class $C_i$ is basic
if $\rho(A[C_i]) = \rho(A)$, where $\rho(\cdot)$ is the spectral radius (§6.5.3) and $A[C_i]$ is the principal
submatrix of A defined by indices in class $C_i$.

Let A be an n x n nonnegative irreducible matrix. The number h of eigenvalues of A
of modulus $\rho(A)$ is called the index of cyclicity of A. The matrix A is primitive if
h = 1.

The exponent of A, written exp(A), is the least positive integer m with $A^m > 0$.

A square matrix is doubly stochastic if it is nonnegative and all row and column sums
are 1.

If A is an n x n matrix and $\sigma \in S_n$, the symmetric group on n elements (§5.3.1), then
the set $\{a_{1\sigma(1)}, a_{2\sigma(2)}, \ldots, a_{n\sigma(n)}\}$ is the diagonal of A corresponding to $\sigma$.

A diagonal of A is positive if each entry in it is positive.

Matrices A and B of the same size have the same pattern if the following condition
holds: $a_{ij} = 0$ if and only if $b_{ij} = 0$.

A matrix A has doubly stochastic pattern if there exists a doubly stochastic matrix B 
such that A and B have the same pattern. 


Facts: 


1. The matrix A is irreducible if and only if G(A) is strongly connected (§8.3.2). 


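2. Frobenius normal form: If the n x n matrix A has k classes, then there exists a
permutation matrix P such that

\[
PAP^T = \begin{pmatrix}
A_{11} & 0 & \cdots & 0 \\
A_{21} & A_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
A_{k1} & A_{k2} & \cdots & A_{kk}
\end{pmatrix}
\]

where each $A_{ii}$, $1 \le i \le k$, is either irreducible or a 1 x 1 zero matrix.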


3. The classes of a nonnegative n x n matrix A are in one-to-one correspondence with 
the strong components (§8.3.2) of G(A) and hence can be found in linear time. 

4. Let A be an n x n nonnegative matrix. There is a positive eigenvector corresponding
to $\rho(A)$ if and only if the basic classes of A are the same as its final classes.

5. Let A be an n x n nonnegative matrix with eigenvalue $\lambda$. There exists a nonnegative
eigenvector for $\lambda$ if and only if there exists a class $C_i$ satisfying both of the following:

• $\rho(A[C_i]) = \lambda$;

• if $C_j$ ($j \ne i$) is any class that has access to $C_i$, then $\rho(A[C_j]) < \rho(A[C_i])$.

6. The n x n nonnegative matrix A is primitive if and only if $\exp(A) < \infty$.

7. A nonnegative irreducible matrix with positive trace is primitive. 

8. Suppose A is an n x n nonnegative irreducible matrix. Let $S_i$ be the set of all the
lengths of cycles in G(A) passing through vertex i, and let $h_i$ be the greatest common
divisor of all the elements of $S_i$. Then $h_1 = h_2 = \cdots = h_n$ and this common value
equals the index of cyclicity of A.

9. Let A be a nonnegative irreducible n x n matrix with $p \ge 1$ nonzero elements on
the main diagonal. Then A is primitive and $\exp(A) \le 2n - p - 1$.

10. Let A be a primitive n x n matrix, and let s be the smallest length of a directed
cycle in G(A). Then $\exp(A) \le n + s(n - 2)$.

11. Let A be an n x n primitive 0-1 matrix, $n \ge 2$. Then $\exp(A) \le (n - 1)^2 + 1$.
Equality holds if and only if there exists a permutation matrix P such that

\[
PAP^T = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
1 & 1 & 0 & \cdots & 0
\end{pmatrix}.
\]

12. The set $\Omega_n$ of n x n doubly stochastic matrices is a compact convex set.

13. Birkhoff-von Neumann theorem: Every $A \in \Omega_n$ can be expressed as a convex
combination of n x n permutation matrices: namely, $A = c_1P_1 + c_2P_2 + \cdots + c_tP_t$ for
some permutation matrices $P_1, P_2, \ldots, P_t$ and some positive real numbers $c_1, c_2, \ldots, c_t$
with $c_1 + c_2 + \cdots + c_t = 1$.


14 . The following conditions are equivalent for an n x n matrix A: 

• A has doubly stochastic pattern; 

• there exist permutation matrices P, Q such that PAQ is a direct sum of fully 

indecomposable matrices; 

• every nonzero entry of A is contained in a positive diagonal. 


15. Let A be an n x n nonnegative idempotent matrix of rank k. Then there exists a
permutation matrix P such that

\[
PAP^T = \begin{pmatrix}
J & JU & 0 & 0 \\
0 & 0 & 0 & 0 \\
VJ & VJU & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]

where J is a direct sum of k positive idempotent matrices of rank 1.

16. A nonnegative symmetric matrix A of rank k is idempotent if and only if there
exists a permutation matrix P such that

\[ PAP^T = \begin{pmatrix} J & 0 \\ 0 & 0 \end{pmatrix}, \]

where J is a direct sum of k positive symmetric rank one idempotent matrices.


Examples: 

1. The following nonnegative matrix is in Frobenius normal form

\[
A = \begin{pmatrix}
5 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 2 & 0 & 0 & 0 & 0 & 0 \\
2 & 4 & 1 & 0 & 3 & 0 & 0 \\
1 & 2 & 0 & 2 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 3 & 2 \\
0 & 2 & 1 & 0 & 0 & 2 & 3
\end{pmatrix}
\]

with four classes $C_1 = \{1\}$, $C_2 = \{2, 3\}$, $C_3 = \{4, 5\}$ and $C_4 = \{6, 7\}$. Class $C_3$ has
access to $C_1$ and $C_2$ while class $C_4$ has access to $C_2$. Classes $C_1$ and $C_2$ are final since
they have access to no other classes. The eigenvalues of A are -2, -1, 1, 2, 3, 5, 5 so
$\rho(A) = 5$. Classes $C_1$ and $C_4$ are basic since $\rho(A[C_1]) = \rho(A[C_4]) = 5$. Since no class
has access to $C_3$, Fact 5 shows there is a nonnegative eigenvector of A for the eigenvalue
$\rho(A[C_3]) = 3$, namely $(0, 0, 0, 1, 1, 0, 0)^T$. However there is no nonnegative eigenvector
of A for $\rho(A[C_2]) = 2$ since class $C_3$ has access to class $C_2$ and $\rho(A[C_3]) > \rho(A[C_2])$.
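The classes and the access relation of Example 1 can be recovered from the zero pattern of A alone. The sketch below uses a transitive closure (Warshall's method), which takes cubic time; it is chosen for brevity and is not the linear-time strong-components computation mentioned in Fact 3. The function names are hypothetical.

```python
def classes_of(A):
    """Classes of A (strong components of G(A)) and the reachability relation."""
    n = len(A)
    reach = [[i == j or A[i][j] != 0 for j in range(n)] for i in range(n)]
    for k in range(n):                       # Warshall transitive closure
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    classes, assigned = [], set()
    for i in range(n):
        if i not in assigned:
            cls = [j for j in range(n) if reach[i][j] and reach[j][i]]
            assigned.update(cls)
            classes.append([j + 1 for j in cls])   # 1-based vertex labels
    return classes, reach

def final_classes(classes, reach):
    """Classes that have access to no other class."""
    return [c for c in classes
            if all(not reach[c[0] - 1][d[0] - 1] for d in classes if d is not c)]

A = [[5, 0, 0, 0, 0, 0, 0],
     [0, 1, 1, 0, 0, 0, 0],
     [0, 2, 0, 0, 0, 0, 0],
     [2, 4, 1, 0, 3, 0, 0],
     [1, 2, 0, 2, 1, 0, 0],
     [0, 1, 1, 0, 0, 3, 2],
     [0, 2, 1, 0, 0, 2, 3]]
classes, reach = classes_of(A)
print(classes)
print(final_classes(classes, reach))
```

Access between classes is read off from `reach` using one representative vertex per class, since all vertices of a class reach the same set of vertices.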
2. The directed graph of the matrix

\[
A = \begin{pmatrix}
0 & 1 & 7 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

is the union of the cycles 1, 2, 5, 1 and 1, 3, 4, 1. The greatest common divisor of the
lengths of all cycles passing through vertex 1 is 3, which by Fact 8 must be the index of
cyclicity. In fact, A has eigenvalues 2, $-1 \pm i\sqrt{3}$, 0, 0 and thus there are 3 eigenvalues
with modulus $\rho(A) = 2$.
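The greatest common divisor of the cycle lengths in Example 2 can be found without enumerating cycles. The sketch below relies on a standard graph fact that is not stated in the text: for a strongly connected graph with breadth-first distances d from any fixed vertex, the gcd of all cycle lengths equals the gcd of d(u) + 1 - d(v) over all edges (u, v). The function name is hypothetical, and the input is assumed to be irreducible.

```python
from collections import deque
from math import gcd

def index_of_cyclicity(A):
    """gcd of all cycle lengths in G(A); assumes A is nonnegative and irreducible."""
    n = len(A)
    dist = [None] * n
    dist[0] = 0
    queue = deque([0])
    while queue:                                  # BFS distances from vertex 1
        u = queue.popleft()
        for v in range(n):
            if A[u][v] != 0 and dist[v] is None:
                dist[v] = dist[u] + 1
                queue.append(v)
    h = 0
    for u in range(n):
        for v in range(n):
            if A[u][v] != 0:
                h = gcd(h, dist[u] + 1 - dist[v])
    return h

A = [[0, 1, 7, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 0, 1, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0]]
print(index_of_cyclicity(A))
```

A result of 1 indicates a primitive matrix, in line with the definition of primitivity as h = 1.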


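3. By Fact 13 every doubly stochastic matrix is a convex combination of permutation
matrices. For example, the doubly stochastic matrix

\[
\begin{pmatrix}
.4 & .3 & .3 & 0 \\
.5 & 0 & .4 & .1 \\
.1 & .6 & 0 & .3 \\
0 & .1 & .3 & .6
\end{pmatrix}
\]

can be expressed as

\[
.2 \begin{pmatrix} 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
+ .3 \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}
+ .1 \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}
+ .4 \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.
\]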


6.6.3 PERMANENTS 

The permanent of a matrix is defined as a sum of terms, each corresponding to the 
product of elements along a diagonal of the matrix. Permanents arise in the study of 
systems of distinct representatives and in other combinatorial problems. 

Definition: 

The permanent of the $n \times n$ matrix $A$ is $\mathrm{per}(A) = \sum_{\sigma \in S_n} a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n\sigma(n)}$,
where $S_n$ is the symmetric group on $n$ elements. (See §5.3.1.)
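As a minimal sketch of the definition (not an efficient algorithm), the permanent can be evaluated by summing one product per permutation. The function name `permanent` and the list-of-rows matrix representation are assumptions made for this illustration.

```python
from itertools import permutations
from math import prod

def permanent(a):
    """Permanent of a square matrix (list of rows), straight from the definition.

    Sums a[0][s(0)] * a[1][s(1)] * ... * a[n-1][s(n-1)] over all permutations s,
    so the running time grows like n * n!.
    """
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))

print(permanent([[1, 2], [3, 4]]))  # 1*4 + 2*3
```

Because every term carries a plus sign, no sign bookkeeping is needed, but the number of terms is still n!, which makes this approach practical only for small matrices.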

Facts: 

1. The permanent of A is an unsigned version of the determinant of A. (See §6.3.4 
Fact 2.) 

2. Computing the permanent of a square 0-1 matrix is #P-complete.

3. Laplace expansion: Suppose $A$ is an $n \times n$ matrix and $A_{ij}$ is the submatrix of $A$
obtained by deleting the $i$th row and the $j$th column. Then for $i = 1, 2, \ldots, n$

$$\mathrm{per}(A) = \sum_{j=1}^{n} a_{ij}\, \mathrm{per}(A_{ij}).$$

A similar expansion holds with respect to any column.

4. $\mathrm{per}(A^T) = \mathrm{per}(A)$.

5. Interchanging two rows (or two columns) of $A$ does not change $\mathrm{per}(A)$.

6. Multiplying any row (or column) of $A$ by the scalar $\alpha$ multiplies $\mathrm{per}(A)$ by $\alpha$.

7. Unlike the determinant, the permanent is not multiplicative with respect to matrix
multiplication. (See Example 2.)

8. The permanent of a triangular matrix (§6.3.1) is equal to the product of its diagonal 
entries. 

9. The permanent of a block diagonal matrix is equal to the product of the permanents 
of its diagonal blocks. 

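10. For each positive integer $n$ the permanent of the $n \times n$ matrix

$$\begin{pmatrix} 0 & 1 & \cdots & 1 \\ 1 & 0 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 1 & 1 & \cdots & 0 \end{pmatrix}$$

is $n! \sum_{r=0}^{n} \frac{(-1)^r}{r!}$, and it represents the number of derangements (§2.4.2) of order $n$.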

11. The permanent of an $n \times n$ 0-1 matrix $A$ counts the number of assignments ($n \times n$
permutation submatrices) consistent with the 1 entries of $A$.

12. Minc-Bregman inequality: Let $A$ be an $n \times n$ 0-1 matrix with row sums $r_1, r_2, \ldots, r_n$. Then

$$\mathrm{per}(A) \le \prod_{i=1}^{n} (r_i!)^{1/r_i}.$$

13. If $A$ is a nonnegative $n \times n$ matrix with row sums $r_1, r_2, \ldots, r_n$ then
$\mathrm{per}(A) \le r_1 r_2 \cdots r_n$.

14. Let $A$ be a fully indecomposable nonnegative integral $n \times n$ matrix and let $s(A)$
denote the sum of the entries in $A$. Then

$$s(A) - 2n + 2 \le \mathrm{per}(A) \le 2^{s(A) - 2n} + 1.$$

15. Alexandroff inequality: Let $A$ be a nonnegative $n \times n$ matrix and let $A_i$ be the
$i$th column of $A$, $i = 1, 2, \ldots, n$. Then

$$(\mathrm{per}(A))^2 \ge \mathrm{per}(A_1, \ldots, A_{n-2}, A_{n-1}, A_{n-1})\, \mathrm{per}(A_1, \ldots, A_{n-2}, A_n, A_n).$$

16. The definition of the permanent can be extended to $m \times n$ matrices with $m \le n$
by summing over all permutations in $S_m$.

17. If $A$ is an $m \times n$ 0-1 matrix, then $\mathrm{per}(A) > 0$ if and only if $A$ has term rank $m$.

18. van der Waerden-Egorychev-Falikman inequality: If $A$ is a doubly stochastic $n \times n$
matrix then $\mathrm{per}(A) \ge n!/n^n$, and equality holds if and only if $A = \frac{1}{n} J_n$, where $J_n$ is
the $n \times n$ matrix with each entry 1.

Note: This result was first conjectured by B. L. van der Waerden in 1926. Despite
repeated attempts to prove it, the conjecture remained unresolved until finally
established in 1980 by G. P. Egorychev. The conjecture was also proved independently by
D. I. Falikman in 1981, apart from establishing the uniqueness of the minimizing
matrix $A$. A self-contained exposition of Egorychev’s proof is given in [Kn81].

19. Let $A$ be the $m \times n$ incidence matrix of $m$ subsets $X_1, X_2, \ldots, X_m$ of a given $n$-set $X$:
namely, $a_{ij} = 1$ if $j \in X_i$ and $a_{ij} = 0$ otherwise. Then $\mathrm{per}(A)$ counts the number of SDRs
(systems of distinct representatives, §1.2.2) selected from the sets $X_1, X_2, \ldots, X_m$.
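Two of the facts above lend themselves to a quick numerical check for small $n$. The following sketch assumes a brute-force `permanent` helper (a hypothetical name, computed directly from the definition) and verifies Fact 10 (derangements) for $n = 1, \ldots, 6$ and the equality case of Fact 18 for $n = 4$. It is a sanity check under these assumptions, not a proof.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial, prod

def permanent(a):
    """Brute-force permanent from the definition (fine for small n)."""
    n = len(a)
    return sum(prod(a[i][s[i]] for i in range(n)) for s in permutations(range(n)))

def derangement_matrix(n):
    """n x n matrix with 0 on the diagonal and 1 elsewhere (Fact 10)."""
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]

def derangements_by_formula(n):
    """n! * sum_{r=0}^{n} (-1)^r / r!, kept exact with Fraction."""
    total = sum(Fraction((-1) ** r, factorial(r)) for r in range(n + 1))
    return int(factorial(n) * total)

fact10_ok = all(permanent(derangement_matrix(n)) == derangements_by_formula(n)
                for n in range(1, 7))

# Fact 18, equality case: the matrix with every entry 1/n attains n!/n^n.
n = 4
flat = [[Fraction(1, n)] * n for _ in range(n)]
fact18_equality = permanent(flat) == Fraction(factorial(n), n ** n)

print(fact10_ok, fact18_equality)
```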
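Examples:

1. For the matrix

$$A = \begin{pmatrix} 1 & 0 & 2 \\ 3 & 1 & 1 \\ 1 & 5 & 2 \end{pmatrix}$$

evaluation of $\mathrm{per}(A)$ by the definition gives $\mathrm{per}(A) = 1 \cdot 1 \cdot 2 + 2 \cdot 3 \cdot 5 + 2 \cdot 1 \cdot 1 + 1 \cdot 1 \cdot 5 = 39$.

Using the Laplace expansion on row 1 gives

$$\mathrm{per}(A) = 1 \cdot \mathrm{per}\begin{pmatrix} 1 & 1 \\ 5 & 2 \end{pmatrix} + 2 \cdot \mathrm{per}\begin{pmatrix} 3 & 1 \\ 1 & 5 \end{pmatrix} = 1 \cdot 7 + 2 \cdot 16 = 39.$$

2. If $A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}$, then $C = AB = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix}$. Notice that

$$\mathrm{per}(AB) = 3 \ne 1 \cdot 1 = \mathrm{per}(A)\, \mathrm{per}(B).$$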


3. Assignments: Suppose there are 4 applicants for 4 jobs, where the qualifications of
each applicant $i$ for each job $j$ are specified by the 0-1 matrix

$$A = \begin{pmatrix} 0 & 1 & 0 & 1 \\ 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \end{pmatrix}.$$

Then the number of different assignments of jobs to qualified applicants (see §6.6.1,
Example 3) equals $\mathrm{per}(A) = 4$. In fact, these are given by those permutations $\sigma$ where
$(\sigma(1), \sigma(2), \sigma(3), \sigma(4)) \in \{(2, 1, 4, 3), (2, 4, 3, 1), (4, 1, 3, 2), (4, 2, 3, 1)\}$.


4. Ménage problem: Suppose that 5 wives are seated around a circular table, leaving
one vacant space between consecutive women. Find the number of ways to seat in these
vacant spots their 5 husbands so that no man is seated next to his wife. Suppose that
the wives occupy positions $W_1, W_2, \ldots, W_5$ listed in a clockwise fashion around the table
and that $X_i$ is the vacant position to the right of $W_i$. Let $A$ be the $5 \times 5$ 0-1 matrix
where $a_{ij} = 1$ if and only if husband $H_i$ can be assigned to position $X_j$ without violating
the requirements of the problem. Then

$$A = \begin{pmatrix} 0 & 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 0 \end{pmatrix}.$$

By Fact 11, the number of possible assignments for each fixed placement of wives is
$\mathrm{per}(A) = 13$. (Also see §2.4.2, Example 7.)


5. Count the number of placements of nontaking rooks on a chessboard with restricted
positions (§2.4.2). Specifically, suppose that positions (1, 1), (2, 3), (3, 1), (4, 2), (4, 3) of a $4 \times 4$
chessboard cannot be occupied by rooks. In the remaining positions, 4 rooks are to be
placed so they are nontaking: no two are in the same row or in the same column. This
can be solved (see Fact 11) by finding all permutations consistent with the 1s in the
matrix

$$A = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 1 \\ 1 & 0 & 0 & 1 \end{pmatrix}.$$

Here $\mathrm{per}(A) = 6$ is easily found using the Laplace expansion on the first column of $A$,
so there are 6 placements of nontaking rooks.
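The counting examples above (assignments, the ménage problem, nontaking rooks) all reduce, by Fact 11, to listing the permutations consistent with the 1 entries of a 0-1 matrix. The sketch below does this by brute force; the function name `assignments` is hypothetical. The fourth column of the jobs matrix is an assumption of this sketch, inferred from the four permutations listed in Example 3 together with $\mathrm{per}(A) = 4$.

```python
from itertools import permutations

def assignments(a):
    """All permutations s (as 1-based tuples) with a[i][s(i)] == 1 in every row i."""
    n = len(a)
    return [tuple(j + 1 for j in s)
            for s in permutations(range(n))
            if all(a[i][s[i]] == 1 for i in range(n))]

# Example 3 (jobs); the last column is inferred, see the note above.
jobs = [[0, 1, 0, 1], [1, 1, 0, 1], [0, 0, 1, 1], [1, 1, 1, 0]]
# Example 4 (menage problem with 5 couples).
menage = [[0, 0, 1, 1, 1], [1, 0, 0, 1, 1], [1, 1, 0, 0, 1],
          [1, 1, 1, 0, 0], [0, 1, 1, 1, 0]]
# Example 5 (rooks with forbidden squares (1,1), (2,3), (3,1), (4,2), (4,3)).
rooks = [[0, 1, 1, 1], [1, 1, 0, 1], [0, 1, 1, 1], [1, 0, 0, 1]]

print(assignments(jobs))
print(len(assignments(menage)), len(assignments(rooks)))
```

The length of each returned list equals the permanent of the corresponding matrix, and the list itself gives the actual assignments or rook placements.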
INTRODUCTION


This chapter discusses aspects of discrete probability that are relevant to mathematics, 
computer science, engineering, and other disciplines. Topics covered include random 
variables, important discrete probability distributions, random walks, Markov chains, 
and queues. Various applications to genetics, telephone network performance and reli- 
ability, average-case algorithm analysis, and combinatorics are presented. 


GLOSSARY 

absorbing boundary : a boundary that stops the motion of a random walk whose 
trajectory comes into contact with it. 

all-terminal reliability: the probability that a given network is connected. 

antithetic variates: a variance reduction technique, based on negatively correlated 
variates, used in the simulation analysis of a given system. 

aperiodic state: a state of a Markov chain that is not periodic. 

arrival process: the statistical description of the time between successive arrivals to 
a queueing system. 

average-case complexity (of an algorithm): the average number of operations re- 
quired by the algorithm, taken over all problem instances of a given size. 

Bernoulli random variable: the discrete random variable $X \in \{0, 1\}$ with probability
distribution $Pr(X = 0) = 1 - p$ and $Pr(X = 1) = p$, for some $0 < p < 1$.

binomial random variable: the discrete random variable $X \in \{0, 1, \ldots, n\}$ with
probability distribution $Pr(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$, for some $0 < p < 1$.

Bose-Einstein model: a probability model in which k indistinguishable balls are 
randomly placed into n distinguishable urns; several balls are allowed to occupy the 
same urn. 

boundary : a point or set of points restricting the trajectory of a random walk. 

branching process: a special type of Markov chain used to model the growth, and 
possible extinction, of populations. 

closed class: a communicating class of states of a Markov chain in which transitions 
from these states never lead to states outside the class. 

coherent system: a system of components for which increasing the number of oper- 
ating components will not degrade the performance of the system. 

common random numbers: a variance reduction technique in which alternative 
system configurations are analyzed using the same set of random numbers. 

communicating class: a maximal set of states in a Markov chain that are reachable 
from one another by a finite number of transitions. 

conditional probability: the probability that one event, A, occurs, given that an- 
other event, B, has occurred, written Pr(A\B). 

cutset: a minimal set of edges in a graph the removal of which disconnects the graph. 

density function: a nonnegative real-valued function $f(x)$ that determines the
distribution of a continuous random variable $X$ via $Pr(a < X < b) = \int_a^b f(x)\, dx$.

dependent (events): events that are not independent. 
discrete-event simulation: a simulation of a time-evolving stochastic process in
which changes to the state of the system can only occur at discrete instants. 

discrete-time Markov chain: a probabilistic model of a randomly evolving system 
whose future is independent of the past if the present state is known. 

distribution (of a random variable): a probability measure associated with the values 
attained by the random variable. 

elastic boundary : a boundary that could be absorbing or reflecting, usually depend- 
ing on some given probability. 

event: a subset of the sample space. 

expected value (of a random variable): the average value taken on by the random 
variable. 

experiment: any physically or mentally conceivable action having a measurable result. 

extinction probability : the probability in a branching process that the population 
eventually dies out. 

Fermi-Dirac model: a probability model in which k indistinguishable balls are ran- 
domly placed into n distinguishable urns; at most one ball can occupy each urn. 

first passage time: the time to first visit a given set of states in a Markov chain. 

floating-point arithmetic: the “real number” arithmetic of computers. 

flop: a unit for floating-point computations that is useful in assessing the complexity 
of an algorithm. 

gambler’s ruin: a one-dimensional random walk in which a gambler wins or loses 
one unit at each play of a game, with the game terminating whenever the gambler 
amasses a known amount or loses his entire initial stake. 

geometric random variable: the discrete random variable $X \in \{1, 2, \ldots\}$ with
probability distribution $Pr(X = k) = (1 - p)^{k-1} p$, for some $0 < p < 1$.

hypergeometric random variable: the discrete random variable that counts the 
number of red balls obtained when randomly selecting a fixed number of balls from 
an urn containing a specified number of red and black balls. 

independent events: events in which knowledge of whether one of the events did or 
did not occur does not alter the probability of occurrence of any of the other events. 

independent random variables : random variables whose joint distribution is the 
product of their individual distributions. 

irreducible chain: a Markov chain that can visit any state from any other state in a 
finite number of steps. 

irrelevant edge: an edge of a two-terminal network not appearing on any simple path 
joining the two terminals of the network. 

K-cutset: a minimal set of edges in a graph, the removal of which disconnects some 
pair of vertices in K. 

K-tree: a minimal set of edges in a graph that connects all vertices in K. 

machine unit: a measure of the precision of floating-point arithmetic. 

Maxwell-Boltzmann model: a probability model in which k distinguishable balls 
are randomly placed into n distinguishable urns; several balls can occupy the same 
urn. 
mincut: a minimal set of components in a coherent system such that the system fails
whenever these specified components fail. 

minpath: a minimal set of components in a coherent system such that the system 
operates whenever these specified components operate. 

Monte Carlo simulation: a simulation used to study both deterministic and stochastic
phenomena in which the passage of time is not material.

overflow : the result of a floating-point arithmetic operation that exceeds the available 
range of numbers. 

parallel system: a system of components that fails only when all components fail. 

periodic state: a state of a Markov chain that can only be revisited at multiples of a 
certain number d > 1 (the period of the state). 

Poisson random variable: the discrete random variable $X \in \{0, 1, \ldots\}$ with
probability distribution $Pr(X = k) = e^{-\lambda} \lambda^k / k!$, for some $\lambda > 0$.

probability : a numerical value between 0 and 1 measuring the likelihood of occurrence 
of an event; the larger the number, the more likely the event. 

pseudo-random numbers: numbers generated in a predictable fashion, but that 
appear to behave like independent and identically distributed random numbers. 

purely multiplicative linear congruential generator: a widely used method of 
producing a stream of pseudo-random numbers. 

queue capacity : the maximum number of customers allowed at any time in a queueing 
system, either waiting or being served. 

queue discipline : the protocol according to which customers are selected for service 
from among those waiting for service. 

queueing system: a stochastic process in which customers arrive, await service, and 
are served. 

queue-length process: a stochastic process describing the number of customers in 
the queueing system. 

random numbers: real numbers generated uniformly over the interval (0, 1). 

random variable: a function that assigns a real number to each outcome in the sample 
space. 

random walk: a stochastic process based on the problem of determining the probable 
location of a point subject to random motions. 

recurrent state: a state of a Markov chain from which the probability of return to 
itself is 1. 

recurrent walk: a random walk that returns to its starting location with probability 1. 

reflecting boundary : a boundary that redirects the motion of a random walk whose 
trajectory comes into contact with it. 

relative error: the (percent) error in a computation relative to the true value. 

reliability : the probability that a given system functions at a random instant of time. 

roundoff error : the error resulting from abbreviating a number to the precision of 
the machine. 

sample size: the number of possible outcomes of an experiment. 

sample space: the set of all possible outcomes of an experiment. 
series system: a system of components that operates only when all components
operate.

service-time distribution: the statistical distribution of time required to serve a 
customer in a queueing system. 

simple path: a path containing no repeated vertices. 

simulation: a technique for studying numerically the behavior of complex stochastic 
systems and estimating their performance. 

single-station queueing system: a system in which customers arrive, wait for ser- 
vice, and depart after service completion. 

s-t cutset: a minimal set of edges in a graph the removal of which leaves no s-t path. 

stability condition: the set of parameter values for which the queue-length process 
(or the waiting-time process) has a steady-state distribution. 

steady-state distribution: in a queueing system, the limiting probability distribution 
of the number of customers in the system. 

stochastic process: a collection of random variables, typically indexed by time (dis- 
crete or continuous). 

structure function: a binary-valued function defined on all subsets of components; 
its value indicates whether or not the system operates when the specified components 
all operate. 

traffic intensity : in a queueing system, the ratio of the maximum arrival rate to the 
maximum service rate. 

trajectory: the successive positions traced out by a particle undergoing a random 
walk. 

transient state: a state in a Markov chain from which the probability of return to 
itself is less than 1. 

transient walk: a random walk that is not recurrent. 

transition probability: the probability of reaching a specified state in a Markov 
chain by a single transition (step) from a given state. 

transition probability matrix : the matrix of one-step transition probabilities for a 
Markov chain. 

two-terminal network: a network in which two vertices (or terminals) are specified. 

two-terminal reliability : the probability that the specified vertices of a two-terminal 
network are connected by a path of operating edges. 

underflow: the result of a floating-point operation that is smaller than the smallest 
representable number. 

uniform random variable: the continuous random variable $X \in (\alpha, \beta)$ with density
function $f(x) = \frac{1}{\beta - \alpha}$.

variance (of a random variable) : a measure of dispersion of the random variable, equal 
to the average square of the deviation of the random variable from its expected value. 

variance reduction techniques: methods for obtaining greater precision for a fixed 
amount of sampling. 

waiting-time process: a stochastic process describing the time spent in the system 
by the customers. 
7.1 FUNDAMENTAL CONCEPTS


Definitions: 

An experiment is any physically or mentally conceivable undertaking that results in 
a measurable outcome. 

The sample space is the set $\Omega$ of all possible outcomes of an experiment.

The sample size of an experiment is the number of possible outcomes of the experiment.

An event in the sample space $\Omega$ is a subset of $\Omega$.

For a family of events $\{A_j \mid j \in J\}$, the union $\bigcup_{j \in J} A_j$ is the set of outcomes belonging
to at least one $A_j$; the intersection $\bigcap_{j \in J} A_j$ is the set of all outcomes belonging to
every $A_j$.

The complement $\overline{A}$ of an event $A$ is the set of outcomes in the sample space not
belonging to $A$.

The events $A$ and $B$ are disjoint if $A \cap B = \emptyset$. The events $A_1, A_2, A_3, \ldots$ are pairwise
disjoint if every pair $A_i$, $A_j$ of distinct events are disjoint.

A probability measure on the sample space $\Omega$ is a function $Pr$ from the set of subsets
of $\Omega$ into the interval $[0, 1]$ satisfying:

• $Pr(\Omega) = 1$;

• $Pr\left(\bigcup_{k=1}^{\infty} A_k\right) = \sum_{k=1}^{\infty} Pr(A_k)$, if the events $\{A_k\}$ are pairwise disjoint.

A fair (unbiased) coin is a coin that is just as likely to land Heads (H) as it is to land
Tails (T).

A red/blue spinner is a disk consisting of two sectors, one red with area $r$ and one
blue with area $b$.

Facts: 

1. $Pr(\emptyset) = 0$.

2. $Pr(A)$ has the interpretation of the long-run proportion of time that the event $A$
occurs in repeated trials of the experiment.

3. $Pr\left(\bigcup_{k=1}^{n} A_k\right) = \sum_{k=1}^{n} Pr(A_k)$, if the $n$ events $\{A_k\}$ are pairwise disjoint.

4. If all outcomes are equally likely and the sample space has $k$ elements, where $k$ is a
positive integer, then the probability of event $A$ is the number of elements of $A$ divided
by the size of the sample space; that is, $Pr(A) = \frac{|A|}{k}$.

5. Principle of inclusion-exclusion (simple form): For events $A$ and $B$,

$$Pr(A \cup B) = Pr(A) + Pr(B) - Pr(A \cap B).$$

6. Principle of inclusion-exclusion (general form): For any events $A_1, A_2, \ldots, A_n$,

$$Pr\left(\bigcup_{r=1}^{n} A_r\right) = \sum_{i} Pr(A_i) - \sum_{i<j} Pr(A_i \cap A_j) + \sum_{i<j<k} Pr(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} Pr(A_1 \cap A_2 \cap \cdots \cap A_n).$$
7. Sieve principle: If $A_1, A_2, \ldots, A_n$ are events, then

$$Pr(\text{exactly } k \text{ of the } A_i \text{ occur}) = \sum_{r=k}^{n} (-1)^{r+k} \binom{r}{k} \sum_{j_1 < j_2 < \cdots < j_r} Pr(A_{j_1} \cap \cdots \cap A_{j_r}).$$

8. Boole’s inequality: If $A_1, A_2, \ldots, A_n$ are events, then

$$Pr\left(\bigcup_{k=1}^{n} A_k\right) \le \sum_{k=1}^{n} Pr(A_k).$$

If $A_1, A_2, \ldots$ is an infinite sequence of events, then

$$Pr\left(\bigcup_{k=1}^{\infty} A_k\right) \le \sum_{k=1}^{\infty} Pr(A_k).$$

(George Boole, 1815-1864.)

9. Bonferroni’s inequality: If $A_1, A_2, \ldots, A_n$ are events,

$$Pr\left(\bigcup_{k=1}^{n} A_k\right) \ge \sum_{k=1}^{n} Pr(A_k) - \sum_{k<j} Pr(A_k \cap A_j).$$

10. $Pr(\overline{A}) = 1 - Pr(A)$.

11. Monotonicity: If $A \subseteq B$, then $Pr(A) \le Pr(B)$.

12. If $A_1 \subseteq A_2 \subseteq A_3 \subseteq \cdots$ is an increasing sequence of events, then

$$\lim_{n \to \infty} Pr(A_n) = Pr\left(\bigcup_{n=1}^{\infty} A_n\right).$$

13. If $A_1 \supseteq A_2 \supseteq A_3 \supseteq \cdots$ is a decreasing sequence of events, then

$$\lim_{n \to \infty} Pr(A_n) = Pr\left(\bigcap_{n=1}^{\infty} A_n\right).$$

14. Web-based notes on basic probability concepts together with interactive
experiments can be found at the site

http://www.math.uah.edu/~stat/
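Several of these facts can be checked directly on a small equiprobable sample space. The sketch below uses the experiment of rolling a pair of dice with three events taken from the examples of this section. The helper `pr` and the variable names are assumptions of this illustration, and the check covers only these particular events.

```python
from fractions import Fraction
from itertools import product

# Sample space for rolling a pair of dice: 36 equally likely ordered pairs.
omega = set(product(range(1, 7), repeat=2))

def pr(event):
    """Fact 4: |event| / |omega| when all outcomes are equally likely."""
    return Fraction(len(event), len(omega))

A = {w for w in omega if sum(w) == 9}                       # sum of dice is 9
B = {w for w in omega if w[0] % 3 == 0 and w[1] % 3 == 0}   # both multiples of 3
C = {w for w in omega if sum(w) <= 4}                       # sum of dice at most 4

union = pr(A | B | C)
singles = pr(A) + pr(B) + pr(C)
pairs = pr(A & B) + pr(A & C) + pr(B & C)
triple = pr(A & B & C)

print(union == singles - pairs + triple)   # Fact 6 (inclusion-exclusion)
print(union <= singles)                    # Fact 8 (Boole)
print(union >= singles - pairs)            # Fact 9 (Bonferroni)
print(pr(omega - A) == 1 - pr(A))          # Fact 10 (complement)
```

Using `Fraction` keeps every probability exact, so the equalities are tested without rounding error.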


Examples: 

1. The following table gives examples of specific experiments, their sample spaces, and
the corresponding sample size:

experiment                          sample space                                          sample size
toss a coin                         {H, T}                                                2
toss a coin n times                 {($\omega_1, \ldots, \omega_n$) | $\omega_i$ is H or T}                  $2^n$
roll a die                          {1, 2, 3, 4, 5, 6}                                    6
roll a pair of dice                 {(1,1), (1,2), ..., (6,5), (6,6)}                     36
draw a card from a standard deck    {2♣, 2♦, ..., A♥, A♠}                                 52
spin a red/blue spinner             {red, blue}                                           2


2. The following are various events defined for the experiment of rolling a pair of dice 
(see the table of Example 1): 

sum of dice is 9: A = {(3, 6), (4, 5), (5,4), (6, 3)} 

both dice are multiples of 3: B = {(3, 3), (3, 6), (6, 3), (6, 6)} 

sum of dice $\le 4$: C = {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1)}.
3. The events $A$ = {sum of dice = 9} and $C$ = {sum of dice $\le 4$} are disjoint. The
events $A_i$ = {sum of the dice is $i$}, $2 \le i \le 12$, are pairwise disjoint.


4. Random selection of an integer: Let $\Omega = \{1, 2, \ldots, n\}$ be the sample space
corresponding to the experiment of randomly selecting an integer between 1 and n, and 
define Pr(j) = Pr({j}) = A. By Fact 4, Pr( 3 < j < n) = Pr({3 < j < n}) — — 


5. For a red/blue spinner, the sample space is $\Omega$ = {red, blue}. If the spinner is equally
likely to land at any location, then $Pr(\text{red}) = \frac{r}{r+b}$ and $Pr(\text{blue}) = \frac{b}{r+b}$.

6. Toss a fair coin $n$ times and interpret Heads as 1 and Tails as 0. The sample space
$\Omega = \{(\omega_1, \omega_2, \ldots, \omega_n) \mid \omega_i \in \{0, 1\}\}$ consists of all possible 0-1 sequences of length $n$.
Since $|\Omega| = 2^n$, each probability $Pr((\omega_1, \omega_2, \ldots, \omega_n))$ is assigned the value $\frac{1}{2^n}$. By
Fact 4, $Pr(A) = \frac{|A|}{2^n}$ holds for all $A \subseteq \Omega$.

For example, the probability of no tails appearing in four coin tosses is the
probability of event $A = \{(1, 1, 1, 1)\}$, so $Pr(A) = \frac{1}{16}$. The probability of exactly one
tail is the probability of event $B = \{(0, 1, 1, 1), (1, 0, 1, 1), (1, 1, 0, 1), (1, 1, 1, 0)\}$; hence
$Pr(B) = \frac{4}{16} = \frac{1}{4}$. The probability of at least two tails is, using Fact 10, $1 - Pr(A) - Pr(B) = 1 - \frac{5}{16} = \frac{11}{16}$.


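7. Derangements: Let $D_n$ be the set of derangements (§2.4.2) on the $n$ elements
$\{1, 2, \ldots, n\}$ and define $A_j$ to be the set of all permutations fixing $j$. For any
permutation $\sigma$, $Pr(\sigma) = \frac{1}{n!}$. Also, for $j_1 < j_2 < \cdots < j_k$, $Pr(A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k}) = \frac{(n-k)!}{n!}$ and

$$\sum_{j_1 < j_2 < \cdots < j_k} Pr(A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k}) = \binom{n}{k} \frac{(n-k)!}{n!} = \frac{1}{k!}.$$

By Fact 6,

$$Pr\left(\bigcup_{r=1}^{n} A_r\right) = \sum_{i} Pr(A_i) - \sum_{i<j} Pr(A_i \cap A_j) + \sum_{i<j<k} Pr(A_i \cap A_j \cap A_k) - \cdots + (-1)^{n+1} Pr(A_1 \cap A_2 \cap \cdots \cap A_n) = 1 - \frac{1}{2!} + \frac{1}{3!} - \cdots + (-1)^{n+1} \frac{1}{n!}.$$

Hence, $Pr(D_n) = 1 - 1 + \frac{1}{2!} - \frac{1}{3!} + \cdots + \frac{(-1)^n}{n!} \approx e^{-1}$.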


8. 5-card stud poker: Five cards are drawn from a well-shuffled deck of 52 playing
cards. The sample space consists of the $\binom{52}{5} = 2{,}598{,}960$ possible five-card hands.
The approximate probabilities of various events are displayed in the following table.
(See §2.3.2 Example 12 for further details.) As seen from the probabilities given in
the table, obtaining a five-card hand containing three of a kind is approximately ten
times more likely than obtaining a five-card hand containing a flush, which in turn is
approximately ten times more likely than obtaining a five-card hand containing four of
a kind.

type of hand      example                    hand enumeration                                 probability
one pair          7♥, 7♦, K♠, J♣, 2♥         $13 \binom{4}{2} \binom{12}{3} 4^3$              0.42
two pairs         7♠, 7♣, A♥, A♣, 3♣         $\binom{13}{2} \binom{4}{2}^2 \cdot 44$          0.048
three of a kind   7♣, 7♥, 7♦, 3♦, 5♣         $13 \binom{4}{3} \binom{12}{2} 4^2$              0.021
straight          7♣, 8♠, 9♦, 10♣, J♥        $10(4^5 - 4)$                                    0.0039
flush             3♦, 6♦, 7♦, J♦, K♦         $4\left(\binom{13}{5} - 10\right)$               0.0020
full house        3♥, 3♦, 3♠, 7♣, 7♦         $13 \cdot 12 \cdot \binom{4}{3} \binom{4}{2}$    0.0014
four of a kind    A♣, A♥, A♦, A♠, 7♠         $13 \cdot 48$                                    0.00024
straight flush    7♦, 8♦, 9♦, 10♦, J♦        36                                               0.000014
royal flush       10♦, J♦, Q♦, K♦, A♦        4                                                0.0000015
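The probabilities in the poker table follow from Fact 4 once the hands of each type are counted. The sketch below recomputes them with `math.comb`. The counting expressions for one pair, two pairs, three of a kind, and the straight-flush count of 36 (royal flushes excluded) are standard counting arguments assumed here rather than formulas quoted from the table; the remaining expressions match the table's enumeration column.

```python
from math import comb

total = comb(52, 5)  # number of five-card hands

counts = {
    "one pair":        13 * comb(4, 2) * comb(12, 3) * 4 ** 3,
    "two pairs":       comb(13, 2) * comb(4, 2) ** 2 * 44,
    "three of a kind": 13 * comb(4, 3) * comb(12, 2) * 4 ** 2,
    "straight":        10 * (4 ** 5 - 4),
    "flush":           4 * (comb(13, 5) - 10),
    "full house":      13 * 12 * comb(4, 3) * comb(4, 2),
    "four of a kind":  13 * 48,
    "straight flush":  36,   # 9 non-royal straight flushes per suit (assumed count)
    "royal flush":     4,
}

for hand, count in counts.items():
    # Fact 4: probability = favorable hands / all hands.
    print(f"{hand:16s} {count:9d} {count / total:.7f}")
```

Rounding the printed values to two significant digits gives figures to compare against the table's probability column.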
7.2 INDEPENDENCE AND DEPENDENCE

Sequences of independent events are often encountered when an experiment is repeated
(without changes). Independent events correspond, intuitively, to events that do not
affect the outcome of one another. The treatment of dependent events requires conditional
probabilities.


7.2.1 BASIC CONCEPTS 


Definitions: 

Two events $A$ and $B$ are independent if $Pr(A \cap B) = Pr(A)Pr(B)$.

The $n$ events $A_1, A_2, \ldots, A_n$ are independent if for all $k$ ($2 \le k \le n$) and $j_1, j_2, \ldots, j_k$
($1 \le j_1 < j_2 < \cdots < j_k \le n$),

$$Pr(A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k}) = Pr(A_{j_1}) Pr(A_{j_2}) \cdots Pr(A_{j_k}).$$

The infinite collection of events $\{A_n \mid n \ge 1\}$ is independent if for all finite $k \ge 2$ the
events $A_1, A_2, \ldots, A_k$ are independent.

Let $B$ be an event with $Pr(B) > 0$. The conditional probability of $A$ given $B$ is

$$Pr(A|B) = \frac{Pr(A \cap B)}{Pr(B)}.$$
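The definition of independence for $n$ events requires the product rule for every subfamily of size at least 2, not just for pairs. A minimal sketch of that check on a finite equiprobable sample space follows; the function names `pr` and `independent` are hypothetical, and the demonstration uses the four-point sample space that appears in Example 3 of this section.

```python
from fractions import Fraction
from itertools import combinations

def pr(event, omega):
    """Probability of an event when all outcomes in omega are equally likely."""
    return Fraction(len(event & omega), len(omega))

def independent(events, omega):
    """True if every subfamily of size k >= 2 satisfies the product rule."""
    for k in range(2, len(events) + 1):
        for family in combinations(events, k):
            inter = set(omega)
            product = Fraction(1)
            for e in family:
                inter &= e                 # intersection of the subfamily
                product *= pr(e, omega)    # product of the individual probabilities
            if pr(inter, omega) != product:
                return False
    return True

omega = {"a", "b", "c", "d"}
A, B, C = {"a", "b"}, {"a", "c"}, {"a", "d"}

pairwise = all(independent([X, Y], omega) for X, Y in combinations([A, B, C], 2))
print(pairwise, independent([A, B, C], omega))
```

Here every pair passes the product rule while the triple does not, which is why the definition quantifies over all subfamilies.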


Facts: 

1. If events $A$ and $B$ are independent, then so are $A$ and $\overline{B}$, $\overline{A}$ and $B$, and $\overline{A}$ and $\overline{B}$.

2. If $A_1, A_2, \ldots$ are independent events, then $Pr\left(\bigcap_{k=1}^{\infty} A_k\right) = \prod_{k=1}^{\infty} Pr(A_k)$.

3. The function $\phi_B \colon A \to Pr(A|B)$ is a probability measure (§7.1).

4. $Pr(A \cap B) = Pr(A|B)Pr(B)$.

5. If $A$ and $B$ are independent, then $Pr(A|B) = Pr(A)$. This equation captures the
notion that for independent events $A$ and $B$ the knowledge that one of the events has
occurred does not affect the probability of the other occurring.

6. Pairwise independence of a collection of events does not necessarily imply that all
events are independent (see Example 3).

7. Law of total probabilities: For any event $A$ and any partition of $\Omega$ into events
$B_1, B_2, \ldots, B_n$,

$$Pr(A) = \sum_{i=1}^{n} Pr(A \cap B_i) = \sum_{i=1}^{n} Pr(A|B_i)Pr(B_i).$$


8. Bayes’ formula: For any event $A$ and any partition of $\Omega$ into events $B_1, B_2, \ldots, B_n$,

$$Pr(B_i|A) = \frac{Pr(B_i \cap A)}{Pr(A)} = \frac{Pr(A|B_i)Pr(B_i)}{\sum_{j=1}^{n} Pr(A|B_j)Pr(B_j)}.$$

(Thomas Bayes, 1702-1761.)

9. Chain rule: For any events $A_1, A_2, \ldots, A_n$ satisfying $Pr\left(\bigcap_{k=1}^{n-1} A_k\right) > 0$,

$$Pr(A_1 \cap A_2 \cap \cdots \cap A_n) = Pr(A_1)\, Pr(A_2|A_1)\, Pr(A_3|A_1 \cap A_2) \cdots Pr\left(A_n \,\Big|\, \bigcap_{k=1}^{n-1} A_k\right).$$
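Facts 7 and 8 translate directly into two short functions over a finite partition. The sketch below is illustrative only: the names `total_probability` and `bayes` and the dictionary encoding of the partition are assumptions, and the numbers are those of the urn with 7 blue and 5 red marbles treated in Example 5 of this section (first draw blue or red, event "second draw is red").

```python
from fractions import Fraction

def total_probability(prior, likelihood):
    """Fact 7: Pr(A) = sum_i Pr(A|B_i) Pr(B_i) over the partition {B_i}."""
    return sum(likelihood[b] * prior[b] for b in prior)

def bayes(prior, likelihood):
    """Fact 8: posterior Pr(B_i|A) for every block B_i of the partition."""
    pa = total_probability(prior, likelihood)
    return {b: likelihood[b] * prior[b] / pa for b in prior}

# Partition by the color of the first marble drawn.
prior = {"B1": Fraction(7, 12), "R1": Fraction(5, 12)}
# Pr(second marble is red | first draw), drawing without replacement.
likelihood = {"B1": Fraction(5, 11), "R1": Fraction(4, 11)}

print(total_probability(prior, likelihood))   # Pr(R2)
print(bayes(prior, likelihood)["B1"])         # Pr(B1 | R2)
```

The posterior values returned by `bayes` always sum to 1, which is a convenient sanity check when the partition is entered by hand.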
Examples:

1. Tossing two fair coins: The sample space for this experiment consists of the four
outcomes HH, HT, TH, TT. For example, the outcome HT means that the first coin
turns up Heads and the second Tails. Because both coins are fair, all four outcomes are
equally likely and in particular $Pr(HT) = \frac{1}{4}$. Since $Pr(H) = Pr(T) = \frac{1}{2}$, $Pr(HT) =
\frac{1}{4} = \frac{1}{2} \cdot \frac{1}{2} = Pr(H)Pr(T)$. Thus, the events “Heads on the first coin” and “Tails on the
second coin” are independent.

2. Tossing a fair coin $n$ times: As in Example 6 of §7.1, let 1 stand for Heads and 0
for Tails. For each $1 \le i \le n$ select $e_i \in \{0, 1\}$ and define $A_i$ = {$e_i$ occurs on the
$i$th toss}. Since all outcomes are equally likely, $Pr(A_1 \cap A_2 \cap \cdots \cap A_n) = (\frac{1}{2})^n =
\frac{1}{2} \times \frac{1}{2} \times \cdots \times \frac{1}{2} = Pr(A_1)Pr(A_2) \cdots Pr(A_n)$. Also, for all $j_1, j_2, \ldots, j_k$ ($2 \le k < n$),
$Pr(A_{j_1} \cap A_{j_2} \cap \cdots \cap A_{j_k}) = (\frac{1}{2})^k = \frac{1}{2} \times \frac{1}{2} \times \cdots \times \frac{1}{2} = Pr(A_{j_1})Pr(A_{j_2}) \cdots Pr(A_{j_k})$.
Therefore the events $A_1, A_2, \ldots, A_n$ are independent.

3. Let $\Omega = \{a, b, c, d\}$ be a sample space with equiprobable outcomes. Let $A = \{a, b\}$,
$B = \{a, c\}$, and $C = \{a, d\}$. Here $Pr(A) = Pr(B) = Pr(C) = \frac{1}{2}$. Also $Pr(A \cap B) =
\frac{1}{4} = \frac{1}{2} \cdot \frac{1}{2} = Pr(A)Pr(B)$, $Pr(A \cap C) = \frac{1}{4} = \frac{1}{2} \cdot \frac{1}{2} = Pr(A)Pr(C)$, and $Pr(B \cap C) =
\frac{1}{4} = \frac{1}{2} \cdot \frac{1}{2} = Pr(B)Pr(C)$. Yet $Pr(A \cap B \cap C) = \frac{1}{4} \ne \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = Pr(A)Pr(B)Pr(C)$. In
this example, any two of the events are independent, but all three are not.

4. Gambler’s fallacy: Suppose that a fair coin is tossed five times, turning up Heads 
on all five tosses. What is the probability that the next (sixth) toss turns up Tails? A 
common fallacy is to believe that a Tail is more likely to turn up next, since in the long 
run 50% of the coins should turn up Tails (and 50% Heads). 

The appropriate sample space consists of $2^6 = 64$ equiprobable outcomes, representing
any sequence of six Heads and/or Tails. The required probability is $Pr(A|B)$, where
$A = \{(H, H, H, H, H, T)\}$ and $B = \{(H, H, H, H, H, H), (H, H, H, H, H, T)\}$. Then
$Pr(A|B) = \frac{Pr(A \cap B)}{Pr(B)} = \frac{Pr(A)}{Pr(B)} = \frac{1/64}{2/64} = \frac{1}{2}$. Consequently, a Tail turning up next is just as likely
as a Head.


5. An urn contains 7 blue marbles and 5 red marbles. An experiment consists of drawing
(without replacement) a marble at random, observing its color, and then drawing
a second marble at random. Let $B_i$ be the event “the $i$th marble drawn is blue” and
let $R_i$ be the event “the $i$th marble drawn is red”, where $i \in \{1, 2\}$. Then $Pr(B_1) = \frac{7}{12}$,
$Pr(R_2|B_1) = \frac{5}{11}$, and $Pr(B_1 \cap R_2) = Pr(R_2|B_1)Pr(B_1) = \frac{5}{11} \cdot \frac{7}{12} = \frac{35}{132}$. By Fact 7:

$$Pr(R_2) = Pr(R_2|R_1)Pr(R_1) + Pr(R_2|B_1)Pr(B_1) = \frac{4}{11} \cdot \frac{5}{12} + \frac{5}{11} \cdot \frac{7}{12} = \frac{5}{12}.$$

By Fact 8:

$$Pr(B_1|R_2) = \frac{Pr(R_2|B_1)Pr(B_1)}{Pr(R_2|R_1)Pr(R_1) + Pr(R_2|B_1)Pr(B_1)} = \frac{\frac{5}{11} \cdot \frac{7}{12}}{\frac{4}{11} \cdot \frac{5}{12} + \frac{5}{11} \cdot \frac{7}{12}} = \frac{7}{11}.$$


6. A particular family is known to have two children (one 9 years old, the other 10
years old). When a census taker comes to the house, a girl answers the doorbell. What
is the probability that the other child is also a girl?

To answer this question, construct the sample space $\Omega = \{(b, b), (b, g), (g, b), (g, g)\}$,
where, for example, the ordered pair $(b, g)$ means that the younger child is a boy and the
older child is a girl. Assume that all four outcomes in the sample space are equiprobable.
The required probability is $Pr(A|B)$, where $A = \{(g, g)\}$ and $B = \{(b, g), (g, b), (g, g)\}$.
Then

$$Pr(A|B) = \frac{Pr(A \cap B)}{Pr(B)} = \frac{Pr(\{(g, g)\})}{Pr(\{(b, g), (g, b), (g, g)\})} = \frac{1}{3}.$$
7. Genetics: Genes are responsible for physical traits of all living things. Each gene
is composed of two alleles. Dominant alleles are represented with capital letters and
recessive alleles with lower case letters. The basic discoveries concerning genetics were
made by Gregor Mendel (1822-1884). One of the genes that is responsible for eye color
exhibits two alleles: a dominant one $B$, for brown eyes, and a recessive one $b$, for blue
eyes. In a certain population the genotype probabilities are
Pr(an individual has genotype BB) = 0.2
Pr(an individual has genotype Bb) = 0.5
Pr(an individual has genotype bb) = 0.3.

Let $E_b$ be the event that an offspring receives a $b$ allele from its mother, and let $F_b$
be the event that it receives a $b$ allele from its father. Conditioning on the genotype
($BB$, $Bb$, $bb$) of the mother produces

$$Pr(E_b) = Pr(E_b|BB)Pr(BB) + Pr(E_b|Bb)Pr(Bb) + Pr(E_b|bb)Pr(bb) = 0 \times 0.2 + 0.5 \times 0.5 + 1 \times 0.3 = 0.55;$$

similarly $Pr(F_b) = 0.55$.

Let $C$ be the event that the offspring has blue eyes (that is, has genotype $bb$). By
independence,

$$Pr(C) = Pr(E_b \cap F_b) = Pr(E_b)Pr(F_b) = (0.55)^2 = 0.3025.$$


8. In Example 7, let $A$ be the event that the father has blue eyes (i.e., has genotype $bb$).
If the father has blue eyes, then the offspring will have blue eyes (event $C$) if and only
if it receives a $b$ allele from its mother, and so $Pr(C|A) = Pr(E_b) = 0.55$. Also,
Pr(C | father is Bb) = Pr(C and mother is Bb | father is Bb) + Pr(C and mother is bb |
father is Bb) = 0.25 × 0.5 + 0.5 × 0.3 = 0.275 and Pr(C | father is BB) = 0.

The conditional probability that the father has blue eyes if the offspring has blue eyes
is obtained from Fact 8 (interchanging phenotype with genotype when convenient):

$$Pr(A|C) = \frac{Pr(C|\text{father has } bb)\, Pr(\text{father has } bb)}{Pr(C|bb)Pr(bb) + Pr(C|Bb)Pr(Bb) + Pr(C|BB)Pr(BB)} = \frac{0.55 \times 0.3}{0.55 \times 0.3 + 0.275 \times 0.5 + 0} = 0.545.$$


9. Let’s Make a Deal: A game show contestant is told there is a fabulous prize hidden
behind one of three doors (A, B, or C). The contestant guesses that the prize is behind
door A. At this point the game show host (who knows what is behind each door, and in
particular knows that the prize is not behind door B) opens door B, revealing that the
prize is not there. The contestant is then offered the opportunity to change her guess.
Should she? Intuition might suggest that nothing is to be gained by changing the guess
(the prize, it is argued, is now equally likely to be behind either door A or door C).
Using conditional probabilities, however, shows that it is definitely worthwhile to now
guess that the prize is behind door C, assuming that the host is known to always open
a door with no prize and to choose randomly if both remaining doors do not hide the
prize.

It is reasonable to assume that the prize is equally likely to be hidden behind
each of the doors. Thus, if $H_X$ denotes the event in which the prize is hidden behind
door X, then $\Pr(H_A) = \Pr(H_B) = \Pr(H_C) = \frac{1}{3}$. If $O_X$ denotes the event that
door X is opened, then $\Pr(O_B \mid H_A) = \Pr(O_C \mid H_A) = \frac{1}{2}$, whereas $\Pr(O_B \mid H_B) = 0$ and
$\Pr(O_B \mid H_C) = 1$. By Fact 8,

$$\Pr(H_A \mid O_B) = \frac{\Pr(H_A \cap O_B)}{\Pr(O_B)} = \frac{\Pr(H_A)\Pr(O_B \mid H_A)}{\Pr(H_A)\Pr(O_B \mid H_A) + \Pr(H_B)\Pr(O_B \mid H_B) + \Pr(H_C)\Pr(O_B \mid H_C)}$$

$$= \frac{\frac{1}{3} \cdot \frac{1}{2}}{\frac{1}{3} \cdot \frac{1}{2} + 0 + \frac{1}{3} \cdot 1} = \frac{1}{3}.$$

Similarly $\Pr(H_C \mid O_B) = \frac{2}{3}$, so it is twice as likely for the prize to be hidden behind
door C as behind door A, given that door B is shown to contain no prize. A web-based
simulation of this situation, in which prizes are randomly hidden behind doors, enables
one to verify experimentally this conclusion; see the following World Wide Web site:

http://www.intergalact.com/threedoor/threedoor.html.
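
The conclusion can also be checked locally. The following is a minimal simulation sketch, not taken from the original text; the function names `play` and `win_rate`, the trial count, and the seed are illustrative assumptions. It simulates the whole game (the host may open B or C), which by symmetry gives the same switching advantage as conditioning on door B being opened.

```python
import random

def play(rng, switch):
    """Play one game with initial guess A; return True if the final guess wins."""
    doors = ["A", "B", "C"]
    prize = rng.choice(doors)
    guess = "A"
    # The host opens a door that is neither the guess nor the prize,
    # choosing at random when two doors qualify.
    opened = rng.choice([d for d in doors if d != guess and d != prize])
    if switch:
        guess = next(d for d in doors if d != guess and d != opened)
    return guess == prize

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the winning probability by repeated independent play."""
    rng = random.Random(seed)
    return sum(play(rng, switch) for _ in range(trials)) / trials

print("stay:  ", win_rate(switch=False))
print("switch:", win_rate(switch=True))
```

The two estimates should be close to 1/3 and 2/3, respectively.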


7.2.2 URN MODELS 

Several applications can be viewed as the result of placing balls into urns. 

Definitions: 

In the following models, k balls are randomly placed in n distinguishable urns labeled 1
through n.

• Model 1 (Maxwell-Boltzmann): The balls are distinguishable and multiple occupancy is permitted.

• Model 2: The balls are distinguishable and multiple occupancy is not permitted.

• Model 3 (Fermi-Dirac): The balls are indistinguishable and multiple occupancy is not permitted.

• Model 4 (Bose-Einstein): The balls are indistinguishable and multiple occupancy is permitted.

• Model 5: The balls are distinguishable, no urn is allowed to remain empty, and multiple occupancy is permitted.


Facts: 

1. The following table shows, for different urn models, the probability of the event
$(k_1, k_2, \ldots, k_n)$, in which $k_1$ balls are in urn 1, $k_2$ balls are in urn 2, ..., $k_n$ balls are
in urn n, with the restrictions $\sum_{j=1}^{n} k_j = k$, $k_j \ge 0$. In model 2, $n^{\underline{k}}$ is a falling power
(see §3.4.2). In models 2 and 3, every $k_j \in \{0, 1\}$ and the models are meaningful only if
$k \le n$. In model 5, every $k_j \ge 1$, $k \ge n$, and ${k \brace n}$ is a Stirling subset number (§2.5.2).


model  sample size            enumeration of $(k_1, \ldots, k_n)$     probability of $(k_1, \ldots, k_n)$
1      $n^k$                  $\binom{k}{k_1\ k_2\ \cdots\ k_n}$      $\binom{k}{k_1\ k_2\ \cdots\ k_n}\, n^{-k}$
2      $n^{\underline{k}}$    $k!$                                    $\binom{n}{k}^{-1}$
3      $\binom{n}{k}$         $1$                                     $\binom{n}{k}^{-1}$
4      $\binom{n+k-1}{k}$     $1$                                     $\binom{n+k-1}{k}^{-1}$
5      ${k \brace n}\, n!$    $\binom{k}{k_1\ \cdots\ k_n}$           $\binom{k}{k_1\ \cdots\ k_n} \big/ \left({k \brace n}\, n!\right)$

2. The Maxwell-Boltzmann model was originally proposed to explain the distribution
of k subatomic particles into n different energy states. It has been replaced by the
Bose-Einstein model (appropriate for particles with integer “spin”, such as photons and
pi mesons) and by the Fermi-Dirac model (appropriate for particles with half-integer
“spin”, such as protons and neutrons).

3. Polya’s urn scheme (George Polya, 1887-1985): In this model, an urn contains b 
black and r red balls. At each step one ball is drawn and replaced, and c additional 
balls of the same color are placed in that urn. This scheme models the spread of a 
contagious disease where an infected person infects c other persons. 

4. The case c = 0 in Polya’s urn scheme corresponds to sampling balls with replacement.

5. The case c = -1 in Polya’s urn scheme corresponds to sampling balls without
replacement.

6. The following table shows how to calculate several types of probabilities using
Polya’s urn scheme, where b is the number of black balls, r is the number of red balls,
and c is the number of additional balls added each time:


event                                     probability
drawing a black                           $\frac{b}{b+r}$
drawing a black then red                  $\frac{br}{(b+r)(b+r+c)}$
drawing in order black, red, black        $\frac{br(b+c)}{(b+r)(b+r+c)(b+r+2c)}$
drawing k black and n - k red balls       $\frac{b(b+c)\cdots(b+(k-1)c)\; r(r+c)\cdots(r+(n-k-1)c)}{(b+r)(b+r+c)(b+r+2c)\cdots(b+r+(n-1)c)}$
  in a prescribed order
drawing k black balls in n drawings;      $\binom{-b/c}{k}\binom{-r/c}{n-k} \Big/ \binom{-(b+r)/c}{n}$
  the order of drawing does not matter


Examples: 

1. Partial derivatives: For analytic functions f, the order in which derivatives are taken
does not matter. As an example, the mixed second partial derivatives $f_{xy}$ and $f_{yx}$
are equal, as are $f_{xxy}$ and $f_{xyx}$. Consequently, the number of different third-order
partial derivatives of a function of n variables is the number of ways to distribute k = 3
indistinguishable balls into n urns (variables). Each such distribution corresponds to
selecting the number of times each variable occurs in forming the partial derivative.
Using the entry for Model 4 in the table for Fact 1, there are $\binom{n+2}{3}$ third-order partial
derivatives of f. When n = 3 this gives $\binom{5}{3} = 10$ different third-order partial derivatives
of f(x, y, z): namely, $f_{xxx}, f_{yyy}, f_{zzz}, f_{xxy}, f_{xxz}, f_{xyy}, f_{yyz}, f_{xzz}, f_{yzz}, f_{xyz}$. In general,
there are $\binom{n+k-1}{k}$ different kth-order partial derivatives of f.
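
The count in this example can be confirmed by enumeration. The following is a small sketch, not part of the original text, in which each derivative is written as a nondecreasing string of variable names; the function name `partial_derivatives` and the `f_` labeling are illustrative.

```python
from itertools import combinations_with_replacement
from math import comb

def partial_derivatives(variables, order):
    """List the distinct mixed partial derivatives of the given order.

    Each derivative corresponds to a multiset of variables, that is, to one
    way of placing `order` indistinguishable balls into len(variables) urns.
    """
    return ["f_" + "".join(choice)
            for choice in combinations_with_replacement(variables, order)]

derivatives = partial_derivatives("xyz", 3)
print(len(derivatives), derivatives)
print(comb(3 + 3 - 1, 3))  # Model 4 count for n = 3 urns and k = 3 balls
```

Both counts agree with the value 10 obtained from the table of Fact 1.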

2. Model 3 provides a model for the occurrence of misprints on the pages of a book. 
Here the n urns correspond to the n symbols printed sequentially in the book and k 
is the number of misprints. Each symbol is either correct or a misprint, so multiple 
occupancy does not occur. 

Also, assuming that the misprints are not generated in a systematic fashion, the k 
balls can be considered indistinguishable, with misprints equally likely to occur at any 
location on the page. 

3. Lottery odds: A lottery is conducted by selecting five different numbers from
1, 2, ..., 9. This can be viewed using urn model 3, in which the five selected numbers
correspond to k = 5 identical balls placed into n = 9 distinguished urns. The
number of such selections, by the table of Fact 1, is $\binom{9}{5} = 126$. Only one of these 126
selections matches all the five winning numbers, so Pr(match 5) = $\frac{1}{126}$.

To match exactly four of the five winning numbers, select the matching numbers
in $\binom{5}{4} = 5$ ways and select the (single) nonmatching number in $\binom{4}{1} = 4$ ways, giving
Pr(match 4) = $\frac{20}{126} = \frac{10}{63}$.

To match exactly three of the winning numbers, select the matching numbers
in $\binom{5}{3} = 10$ ways and select the two nonmatching numbers in $\binom{4}{2} = 6$ ways, giving
Pr(match 3) = $\frac{60}{126} = \frac{10}{21}$.

4. In a number of state lotteries, k = 6 numbers are drawn from 1, 2, ..., n. The
following table gives the probability of matching exactly six, exactly five, and exactly
four of the six winning numbers, for various values of n.


n    match 6          match 5            match 4
35   1/1,623,160      87/811,580         87/23,188
36   1/1,947,792      15/162,316         2,175/649,264
37   1/2,324,784      31/387,464         2,325/774,928
38   1/2,760,681      64/920,227         2,480/920,227
39   1/3,262,623      66/1,087,541       2,640/1,087,541
40   1/3,838,380      17/319,865         561/255,892
41   1/4,496,388      35/749,398         2,975/1,498,796
42   1/5,245,786      108/2,622,893      675/374,699
43   1/6,096,454      111/3,048,227      4,995/3,048,227
44   1/7,059,052      57/1,764,763       10,545/7,059,052
45   1/8,145,060      39/1,357,510       741/543,004
46   1/9,366,819      80/3,122,273       3,900/3,122,273
47   1/10,737,573     82/3,579,191       4,100/3,579,191
48   1/12,271,512     21/1,022,626       4,305/4,090,504
49   1/13,983,816     43/2,330,636       645/665,896
50   1/15,890,700     22/1,324,225       473/529,690
51   1/18,009,460     27/1,800,946       1,485/1,800,946
52   1/20,358,520     69/5,089,630       3,105/4,071,704
53   1/22,957,480     141/11,478,740     3,243/4,591,496
54   1/25,827,165     32/2,869,685       376/573,937
55   1/28,989,675     98/9,663,225       392/644,215
56   1/32,468,436     25/2,705,703       875/1,546,116
57   1/36,288,252     17/2,016,014       2,125/4,032,028
58   1/40,475,358     52/6,745,893       1,105/2,248,631
59   1/45,057,474     53/7,509,579       3,445/7,509,579
60   1/50,063,860     81/12,515,965      4,293/10,012,772
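
Entries of this table, and the probabilities of Example 3, follow one counting pattern: choose which winning numbers are matched, then choose the nonmatching numbers. The following sketch (not part of the original text; the function name `match_probability` is illustrative) reproduces them in exact arithmetic.

```python
from fractions import Fraction
from math import comb

def match_probability(n, k, j):
    """Probability of matching exactly j of the k winning numbers
    when a ticket selects k different numbers from 1, 2, ..., n."""
    return Fraction(comb(k, j) * comb(n - k, k - j), comb(n, k))

# Example 3: five numbers from 1..9.
print([match_probability(9, 5, j) for j in (5, 4, 3)])
# One row of the table in Example 4: six numbers from 1..49.
print([match_probability(49, 6, j) for j in (6, 5, 4)])
```

`Fraction` reduces each result to lowest terms, which is the form used in the table.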


5. Let an urn contain r = 1 red ball and b = 9 black balls. In Polya’s urn scheme with
c = 1, the probability of obtaining the sequence RRB (two red balls and then a black
ball) is found using conditional probabilities as

$\Pr(RRB) = \Pr(B \mid RR)\Pr(R \mid R)\Pr(R) = \frac{9}{12} \cdot \frac{2}{11} \cdot \frac{1}{10}.$

Likewise,

$\Pr(BRR) = \Pr(R \mid BR)\Pr(R \mid B)\Pr(B) = \frac{2}{12} \cdot \frac{1}{11} \cdot \frac{9}{10}$

and

$\Pr(RBR) = \Pr(R \mid RB)\Pr(B \mid R)\Pr(R) = \frac{2}{12} \cdot \frac{9}{11} \cdot \frac{1}{10}.$

Thus,

$\Pr(RRB) = \Pr(BRR) = \Pr(RBR) = \frac{3}{220},$

agreeing with the value obtained using the table of Fact 6, with k = 1 and n = 3.

The probability of obtaining two red balls and one black ball in some order is then
Pr(RRB) + Pr(BRR) + Pr(RBR) = $\frac{9}{220}$. Using the extended binomial coefficients
(§2.3.2), the corresponding entry in the table of Fact 6 can be verified for this case,
where k = 1 and n = 3:

$$\Pr(\text{exactly one black ball}) = \frac{\binom{-9}{1}\binom{-1}{2}}{\binom{-10}{3}} = \frac{(-1)^1\binom{9}{1}\,(-1)^2\binom{2}{2}}{(-1)^3\binom{12}{3}} = \frac{9}{220},$$

agreeing with the value already found.
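
The conditional-probability computation of this example can be sketched in a few lines. The code below is illustrative and not part of the original text; the function name `sequence_probability` and its argument order are assumptions, and it supposes that the urn never runs empty during the drawing.

```python
from fractions import Fraction
from itertools import permutations

def sequence_probability(seq, b, r, c):
    """Probability of drawing the colors in `seq` ('B' or 'R'), in order,
    from an urn with b black and r red balls under Polya's scheme:
    each drawn ball is replaced together with c more of its color."""
    prob = Fraction(1)
    for color in seq:
        total = b + r
        if color == "B":
            prob *= Fraction(b, total)
            b += c
        else:
            prob *= Fraction(r, total)
            r += c
    return prob

orders = sorted(set(permutations("RRB")))
for order in orders:
    print("".join(order), sequence_probability(order, b=9, r=1, c=1))
print(sum(sequence_probability(order, 9, 1, 1) for order in orders))
```

Each of the three orderings has probability 3/220, so the total is 9/220. Setting `c=0` or `c=-1` gives sampling with or without replacement (Facts 4 and 5).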


7.3 RANDOM VARIABLES 


7.3.1 DISTRIBUTIONS 

Definitions: 

A random variable X is a real-valued function on a probability space Ω.

The random variable $X\colon \Omega \to \mathcal{R}$ is discrete if the range of X is finite or countable.

The real-valued function $f\colon \mathcal{R} \to \mathcal{R}$ is a density function if

• $f(x) \ge 0$ for all $x \in \mathcal{R}$;

• $\int_{-\infty}^{\infty} f(x)\,dx = 1$.

The random variable X is (absolutely) continuous if there exists a density function f
such that $\Pr(a < X \le b) = \int_a^b f(x)\,dx$ for all a < b.

The distribution $\mu_X$ of the random variable X is given by $\mu_X(B) = \Pr(X \in B)$ for
every interval B.

The cumulative distribution function of a random variable X is given by $F(x) = \Pr(X \le x)$.

A random vector is a function $X = (X_1, \ldots, X_k)\colon \Omega \to \mathcal{R}^k$.

The joint distribution $\mu_{X_1, \ldots, X_k}$ of the random vector $(X_1, \ldots, X_k)$ is defined by
$\mu_{X_1, \ldots, X_k}(B_1, \ldots, B_k) = \Pr(X_1 \in B_1, \ldots, X_k \in B_k)$ for any k intervals $B_1, \ldots, B_k$.

The random variables $X_1, \ldots, X_n$ are independent if for any intervals $B_1, \ldots, B_n$
$\Pr(X_1 \in B_1, \ldots, X_n \in B_n) = \Pr(X_1 \in B_1) \cdots \Pr(X_n \in B_n)$.

Facts: 

1. The cumulative distribution function F(x) is a nondecreasing function of x.

2. $\lim_{x \to \infty} F(x) = 1$, $\lim_{x \to -\infty} F(x) = 0$.

3. $\Pr(a < X \le b) = F(b) - F(a)$ for a < b.

4. If X is a discrete random variable, then $\sum_k \Pr(X = k) = 1$.

5. If X is a continuous random variable, then $\frac{d}{dx}F(x) = f(x)$.

6. If $X_1$ and $X_2$ are independent binomial random variables (see Table 1) with parameters
$n_1, p$ and $n_2, p$ respectively, then $X_1 + X_2$ is also a binomial random variable with
parameters $n_1 + n_2, p$.

7. If $X_1$ and $X_2$ are independent Poisson random variables (see Table 1) with parameters
$\lambda_1$ and $\lambda_2$ respectively, then $X_1 + X_2$ is also a Poisson random variable with
parameter $\lambda_1 + \lambda_2$.

8. If $X_1$ and $X_2$ are independent normal random variables (see Table 2) with parameters
$\mu_1, \sigma_1^2$ and $\mu_2, \sigma_2^2$ respectively, then $X_1 + X_2$ is also a normal random variable with
parameters $\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2$.

Examples: 

1. A spinner has three sectors (red, white, and blue) with sector areas 0.2, 0.7,
and 0.1, respectively. Define a random variable X according to the rule X = 1 if the
spinner points on red, X = 2 if it points on white, and X = 3 if it points on blue. The
distribution of the discrete random variable X is displayed in the following table.


event    i    Pr(X = i)
red      1    0.2
white    2    0.7
blue     3    0.1


2. Bernoulli random variable: Let $A \subseteq \Omega$ be a fixed subset, with Pr(A) = p for some
$0 \le p \le 1$. Define the random variable X by $X(\omega) = 1$ for $\omega \in A$ and $X(\omega) = 0$
for $\omega \notin A$. Often it is said that a success occurs whenever $\omega \in A$ and a failure
occurs otherwise. Then X is a Bernoulli random variable with Pr(X = 1) = p and
Pr(X = 0) = 1 - p. (Jakob Bernoulli, 1654-1705)

3. Binomial random variable: Suppose that a die is thrown and that the occurrence
of either a one or a six results in a “success”. A single roll of the die constitutes a
Bernoulli trial (Example 2) with probability of success $p = \frac{1}{3}$. The number of successes
in 10 successive independent trials is a binomial random variable X with parameters
n = 10 and $p = \frac{1}{3}$. In general, the number of successes X is a discrete random variable
with possible values 0, 1, 2, ..., n and its distribution is given in Table 1.

4. A dart is thrown at a circular target of radius 1. Assume that the target is never
missed and that any point on the target is as likely to be hit as any other
point. Let X be the dart’s distance from the center of the target. Since X can assume
any value between 0 and 1, X is a continuous random variable. For $0 \le a < b \le 1$,
$\Pr(a < X \le b)$ = Pr(the dart lands in the annulus with radii a and b) = $\frac{1}{\pi}$ × (the area
of the annulus with radii a and b) = $\frac{1}{\pi}(\pi b^2 - \pi a^2) = b^2 - a^2$.

5. Hypergeometric random variable: A total of n balls are selected from an urn
containing N balls, of which m are red and N - m are black. Let X be the number
of red balls selected. Then X is a discrete random variable having the distribution

$\Pr(X = k) = \frac{\binom{m}{k}\binom{N-m}{n-k}}{\binom{N}{n}}$, for $0 \le k \le m$.

6. Multinomial random variable: Cast n identical balls into N labeled boxes in such a
way that the probability that a ball ends up in box j is $p_j$, where $\sum_{j=1}^{N} p_j = 1$. Let $X_j$
denote the number of balls in box j ($1 \le j \le N$). For a vector $(x_1, x_2, \ldots, x_N)$ with

© 2000 by CRC Press LLC 


Table 1 Discrete random variables.


distribution          description of event (X = k)              range of X      Pr(X = k)
Bernoulli B(1, p)     k = 0 indicates a failure,                 0, 1            q if k = 0,
                      k = 1 indicates a success                                  p if k = 1
binomial B(n, p)      k successes in n trials, each with         0, 1, ..., n    $\binom{n}{k} p^k q^{n-k}$
                      probability p of success
Poisson P(λ)          k arrivals to a counter over a unit        0, 1, 2, ...    $\frac{e^{-\lambda}\lambda^k}{k!}$
                      period of time, at average rate λ
geometric G(p)        k trials before first success occurs       1, 2, ...       $q^{k-1} p$
Pascal NB(r, p)       k trials before rth success occurs         r, r + 1, ...   $\binom{k-1}{r-1} q^{k-r} p^r$
hypergeometric        sample n items from N items, where         0, 1, ..., n    $\binom{m}{k}\binom{N-m}{n-k} \big/ \binom{N}{n}$
(N, m, n)             m are defective and N - m are not;
                      k = number of defectives selected



$\sum_{j=1}^{N} x_j = n$, the probability that box 1 contains $x_1$ balls, box 2 contains $x_2$ balls, ...,
box N contains $x_N$ balls is given by

$$\Pr(X_1 = x_1, X_2 = x_2, \ldots, X_N = x_N) = \frac{n!}{x_1!\, x_2! \cdots x_N!}\, p_1^{x_1} p_2^{x_2} \cdots p_N^{x_N} = \binom{n}{x_1\ x_2\ \ldots\ x_N} p_1^{x_1} p_2^{x_2} \cdots p_N^{x_N},$$

expressed using the multinomial coefficients (§2.3.2).

7. Joint distribution: Two fair coins are tossed once, resulting in four equally likely
outcomes {(T, T), (T, H), (H, T), (H, H)}. Let the random variable X be the total number
of heads observed, and let the random variable Y be the number of heads on the
first coin minus the number of heads on the second coin. The joint probability
distribution $\mu_{X,Y}$ is given by Pr(X = 0, Y = 0) = Pr(X = 1, Y = -1) = Pr(X = 1, Y = 1) =
Pr(X = 2, Y = 0) = $\frac{1}{4}$. Thus Pr(X = 0) = Pr(X = 2) = $\frac{1}{4}$,
Pr(X = 1) = $\frac{1}{2}$ and Pr(Y = 1) = Pr(Y = -1) = $\frac{1}{4}$, Pr(Y = 0) = $\frac{1}{2}$. Since
Pr(X = 0, Y = 0) = $\frac{1}{4} \ne \frac{1}{4} \cdot \frac{1}{2}$ = Pr(X = 0)Pr(Y = 0), the variables X and Y are not
independent.
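
The joint and marginal distributions of this example are small enough to tabulate by brute force. The following sketch is not part of the original text; the names `joint`, `marginal`, `expectation`, `covariance`, and `independent` are illustrative. It also computes the covariance of X and Y, which turns out to be zero even though the variables are dependent.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint distribution of X (total heads) and Y (heads on coin 1 minus heads on coin 2).
joint = defaultdict(Fraction)
for first, second in product((0, 1), repeat=2):  # 1 = head, 0 = tail
    joint[(first + second, first - second)] += Fraction(1, 4)

def marginal(index):
    """Marginal distribution of X (index 0) or of Y (index 1)."""
    dist = defaultdict(Fraction)
    for outcome, prob in joint.items():
        dist[outcome[index]] += prob
    return dist

def expectation(g):
    """E(g(X, Y)) computed from the joint distribution."""
    return sum(g(x, y) * prob for (x, y), prob in joint.items())

p_x, p_y = marginal(0), marginal(1)
covariance = (expectation(lambda x, y: x * y)
              - expectation(lambda x, y: x) * expectation(lambda x, y: y))
# Independence requires the product rule for every pair of values.
independent = all(joint.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
                  for x in p_x for y in p_y)
print(dict(p_x), dict(p_y), covariance, independent)
```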

8. Some important discrete random variables are described in Table 1. Here q = 1 — p. 

9. Some important continuous random variables are described in Table 2. Here it is 
understood that the density function f(x) = 0 outside the specified range. 


7.3.2 MEAN, VARIANCE, AND HIGHER MOMENTS 

Definitions: 

The mean (expected value) EX of a discrete random variable X is given by
$EX = \sum_k k \Pr(X = k)$.

The mean of a continuous random variable X with density function f is given by
$EX = \int_{-\infty}^{\infty} x f(x)\,dx$.

The variance Var(X) of a random variable X is $\mathrm{Var}(X) = E((X - EX)^2)$.

The standard deviation of X is $\sqrt{\mathrm{Var}(X)}$.


Table 2 Continuous random variables.


distribution               range of X    density function f(x)
uniform (α, β)             (α, β)        $\frac{1}{\beta - \alpha}$, $\alpha < \beta$
exponential (λ)            [0, ∞)        $\lambda e^{-\lambda x}$, $\lambda > 0$
standard normal (0, 1)     (-∞, ∞)       $\frac{1}{\sqrt{2\pi}} e^{-x^2/2}$
normal (μ, σ²)             (-∞, ∞)       $\frac{1}{\sigma\sqrt{2\pi}} e^{-(x-\mu)^2/(2\sigma^2)}$, $\sigma > 0$
gamma Γ(n, λ)              [0, ∞)        $\frac{\lambda^n x^{n-1} e^{-\lambda x}}{\Gamma(n)}$, where $\Gamma(n) = \int_0^{\infty} t^{n-1} e^{-t}\,dt$
Cauchy (α)                 (-∞, ∞)       $\frac{\alpha}{\pi(\alpha^2 + x^2)}$, $\alpha > 0$
beta (p, q)                [0, 1]        $\frac{\Gamma(p+q)}{\Gamma(p)\Gamma(q)} x^{p-1}(1-x)^{q-1}$, $p, q > 0$
chi square χ²(r)           [0, ∞)        $\frac{1}{2^{r/2}\Gamma(r/2)} x^{(r-2)/2} e^{-x/2}$, $r > 0$
F-distribution F_{m,n}     [0, ∞)        $\frac{\Gamma((m+n)/2)}{\Gamma(m/2)\Gamma(n/2)} \left(\frac{m}{n}\right)^{m/2} \frac{x^{(m-2)/2}}{(1 + (m/n)x)^{(m+n)/2}}$
t-distribution t_k         (-∞, ∞)       $\frac{\Gamma((k+1)/2)}{\Gamma(k/2)} \frac{1}{\sqrt{k\pi}} \frac{1}{(1 + x^2/k)^{(k+1)/2}}$
Rayleigh R(σ)              [0, ∞)        $\frac{x}{\sigma^2} e^{-x^2/(2\sigma^2)}$, $\sigma > 0$


The covariance Cov(X, Y) of two random variables X and Y is given by
$\mathrm{Cov}(X, Y) = E((X - EX)(Y - EY))$.

The correlation $\rho_{X,Y}$ of two random variables X and Y is
$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\mathrm{Var}(Y)}}$.

The kth moment of a random variable X is $E(X^k)$.


Facts: 

1. The expected value EX of a random variable X measures the “weighted average”
of X or the “center of gravity” of its distribution.

2. E(X + Y) = EX + EY.

3. E(cX) = cEX for all constants c.

4. E(c) = c for all constants c.

5. If X is a nonnegative integer random variable, then $EX = \sum_{n=0}^{\infty} \Pr(X > n)$.

6. If X is a nonnegative continuous random variable, then $EX = \int_0^{\infty} \Pr(X > x)\,dx$.

7. If X and Y are independent, then E(XY) = (EX)(EY).

8. If g is a real-valued function and X is a discrete random variable, then
$E(g(X)) = \sum_k g(k)\Pr(X = k)$.

9. If g is an integrable real-valued function and X is continuous with density f(x),
then $E(g(X)) = \int_{-\infty}^{\infty} g(t) f(t)\,dt$.

10. The variance Var(X) of a random variable X measures the “dispersion” of X about
its expected value EX.

11. $\mathrm{Var}(X) \ge 0$; Var(X) = 0 if and only if for some constant c, Pr(X = c) = 1.

12. $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$ for all constants c.

13. Var(X + Y) = Var(X) + Var(Y) + 2Cov(X, Y).

14. Var(X + Y) = Var(X) + Var(Y) if X and Y are independent.

15. $\mathrm{Var}(X) = E(X^2) - (EX)^2$.

16. Cov(X, Y) = 0 if X and Y are independent. The converse is false. (Example 4.)

17. Cov(X, Y) = E(XY) - (EX)(EY).

18. The correlation $\rho_{X,Y}$ is a scale-invariant measure of the degree of linear relationship
between two random variables X and Y. Specifically, $\rho_{X,Y} = 1$ only when Y = aX + b
for some constants a > 0 and b. Similarly, $\rho_{X,Y} = -1$ only when Y = aX + b for some
constants a < 0 and b.

19. $|\rho_{X,Y}| \le 1$.

20. Bienaymé-Chebyshev’s inequality: $\Pr(|X - EX| \ge t) \le \frac{\mathrm{Var}(X)}{t^2}$, for any value
t > 0. (Irénée-Jules Bienaymé, 1796-1878 and Pafnuty Lvovich Chebyshev, 1821-1894.)

21. Kolmogorov’s inequality: Suppose $X_1, X_2, \ldots, X_n$ are independent random variables,
and let $S_k = X_1 + X_2 + \cdots + X_k$ for $1 \le k \le n$. Then for any value t > 0
the probability that $|S_k - ES_k| < t$ holds for all k = 1, 2, ..., n is at least $1 - \frac{\mathrm{Var}(S_n)}{t^2}$.
(Andrey Nikolayevich Kolmogorov, 1903-1987.)

Examples: 

1. The random variable X is the number of heads obtained in three tosses of a fair
coin. It follows a binomial distribution (Table 1, §7.3.1), with n = 3 and $p = \frac{1}{2}$. Thus
Pr(X = 0) = $\frac{1}{8}$, Pr(X = 1) = $\frac{3}{8}$, Pr(X = 2) = $\frac{3}{8}$, and Pr(X = 3) = $\frac{1}{8}$. Using the
definition of expected value, $EX = \sum_k k\Pr(X = k) = 0 \cdot \frac{1}{8} + 1 \cdot \frac{3}{8} + 2 \cdot \frac{3}{8} + 3 \cdot \frac{1}{8} = \frac{3}{2}$.
In general, the mean of a binomial distribution with parameters n and p is np; see the
corresponding entry in Table 3 of §7.3.3.

2. The variance of the discrete random variable X in Example 1 can be found using
$\mathrm{Var}(X) = E((X - EX)^2) = E((X - \frac{3}{2})^2) = (0 - \frac{3}{2})^2 \cdot \frac{1}{8} + (1 - \frac{3}{2})^2 \cdot \frac{3}{8} + (2 - \frac{3}{2})^2 \cdot \frac{3}{8} + (3 - \frac{3}{2})^2 \cdot \frac{1}{8} = \frac{3}{4}$.
In general, the variance of a binomial distribution with parameters n
and p is np(1 - p); see the corresponding entry in Table 3 of §7.3.3.

3. Suppose X is a Bernoulli random variable with parameter p, so Pr(X = 0) = 1 - p
and Pr(X = 1) = p. Then $EX = 0 \cdot (1 - p) + 1 \cdot p = p$ and $\mathrm{Var}(X) = E((X - p)^2) =
(0 - p)^2 \cdot (1 - p) + (1 - p)^2 \cdot p = p^2(1 - p) + p(1 - p)^2 = p(1 - p)$. Also, $E(X^2) =
0^2 \cdot (1 - p) + 1^2 \cdot p = p$ and using Fact 15 $\mathrm{Var}(X) = E(X^2) - (EX)^2 = p - p^2 = p(1 - p)$,
as before.

4. Covariance and independence: In Example 7 of §7.3.1, $EX = 0 \cdot \frac{1}{4} + 1 \cdot \frac{1}{2} + 2 \cdot \frac{1}{4} = 1$
and $EY = -1 \cdot \frac{1}{4} + 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{4} = 0$. Also, $E(XY) = -1 \cdot \frac{1}{4} + 0 \cdot \frac{1}{2} + 1 \cdot \frac{1}{4} = 0$. By Fact 17,
$\mathrm{Cov}(X, Y) = E(XY) - (EX)(EY) = 0 - 1 \cdot 0 = 0$. In this example, variables X and Y
have zero covariance (and zero correlation); however (see Example 7, §7.3.1) they are
not independent random variables.

5. The moments of the normal random variable X with parameters μ = 0 and σ = 1
are $E(X^{2k}) = 1 \cdot 3 \cdots (2k - 1)$ and $E(X^{2k-1}) = 0$ for $k \ge 1$.

6. A manufacturing plant produces ball bearings with an average diameter of 50 mm
and a variance of 11 mm². Without any further information about the shape of the
distribution of the diameters X, Fact 20 shows that the probability $\Pr(|X - 50| \ge 8)$
of deviating from the nominal diameter by at least 8 mm is no more than
$\frac{\mathrm{Var}(X)}{8^2} = \frac{11}{64} \approx 0.172$. Thus, no more than 17.2% of the ball bearings produced can fall
outside the stated tolerance.

7. Average-case algorithm analysis: A simple algorithm for locating an item in an
(unordered) list $A = [a_1, a_2, \ldots, a_n]$ is called a linear search; it sequentially examines
each entry of list A and compares the given item key with each $a_i$, until a match is
found, or until the entire list is searched, in which case key is known not to be in
the list. To obtain the average case complexity of this algorithm, suppose that key is
known to occur in A and that it is equally likely (with probability $\frac{1}{n}$) to be at each of
the n positions of A. If key is in fact located at position k of A, then k comparisons
are required by the algorithm. The expected number of comparisons needed is thus
$EX = \sum_{k=1}^{n} k\Pr(X = k) = \sum_{k=1}^{n} k \cdot \frac{1}{n} = \frac{1}{n}\sum_{k=1}^{n} k = \frac{1}{n} \cdot \frac{n(n+1)}{2} = \frac{n+1}{2}$. Consequently,
the average-case complexity of linear search is O(n); see §1.3.3.
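
The average in this example can be reproduced by counting comparisons directly. The sketch below is illustrative and not part of the original text; the function names `comparisons` and `expected_comparisons` are assumptions, and the list is taken to hold n distinct items with the key equally likely to be any of them.

```python
from fractions import Fraction

def comparisons(items, key):
    """Number of comparisons a linear search makes before it stops."""
    for count, item in enumerate(items, start=1):
        if item == key:
            return count
    return len(items)  # key absent: every entry was examined

def expected_comparisons(n):
    """Average comparisons when the key is equally likely at each of n positions."""
    items = list(range(n))
    return sum(Fraction(comparisons(items, key), n) for key in items)

for n in (1, 10, 100):
    print(n, expected_comparisons(n), Fraction(n + 1, 2))
```

The two printed values agree for every n, as the closed form (n + 1)/2 predicts.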


7.3.3 GENERATING FUNCTIONS 
Definitions: 

The probability generating function of a discrete random variable X is the function
$\phi(t) = E(t^X) = \sum_k \Pr(X = k)\,t^k$, defined for $|t| \le 1$.

The moment generating function of a discrete random variable X is the function
$\psi(t) = E(e^{tX}) = \sum_k e^{tk}\Pr(X = k)$, defined for all t such that $\psi(t)$ converges.

The moment generating function of a continuous random variable X with density f
is the function $\psi(t) = E(e^{tX}) = \int e^{tx} f(x)\,dx$, defined for all t such that $\psi(t)$ converges.

The characteristic function (Fourier transform) of a discrete random variable X is
$\chi(t) = E(e^{itX}) = \sum_k e^{itk}\Pr(X = k)$, defined for all $t \in \mathcal{R}$.

The characteristic function of a continuous random variable X with density f is
$\chi(t) = E(e^{itX}) = \int e^{itx} f(x)\,dx$, defined for all $t \in \mathcal{R}$.


Facts: 

1. The expected value of a random variable X can be expressed in terms of the first
derivative of its generating function: $EX = \phi'(1) = \psi'(0) = -i\chi'(0)$.

2. The variance of a random variable X can be expressed in terms of the first and
second derivatives of its generating function: $\mathrm{Var}(X) = \phi''(1) + \phi'(1) - [\phi'(1)]^2 =
\psi''(0) - [\psi'(0)]^2 = [\chi'(0)]^2 - \chi''(0)$.

3. $\frac{d^k}{dt^k}\phi(1) = E(X(X - 1)(X - 2) \cdots (X - k + 1))$.

4. $\frac{d^k}{dt^k}\psi(0) = E(X^k)$.

5. $\frac{d^k}{dt^k}\chi(0) = i^k E(X^k)$.

6. For any of the three types of generating functions defined, the generating function of 
the sum of independent random variables is the product of their respective generating 
functions. 

Examples: 

1. The binomial random variable with parameters n and p is the sum of n independent
Bernoulli random variables with parameter p. The probability generating function for
a Bernoulli random variable is $\phi(t) = E(t^X) = t^0(1 - p) + t^1 p = q + pt$, where q =
1 - p. By Fact 6 the probability generating function for a binomial random variable is
$[\phi(t)]^n = (q + pt)^n$.
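
Fact 6 can be checked numerically for this example by treating each generating function as a list of polynomial coefficients, so that a product of generating functions becomes a convolution. The sketch below is illustrative and not part of the original text; the helper names `poly_mul` and `binomial_pgf` are assumptions.

```python
from fractions import Fraction
from math import comb

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (index = power of t)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def binomial_pgf(n, p):
    """Coefficients of (q + p t)^n, built as a product of n Bernoulli pgfs."""
    q = 1 - p
    coeffs = [Fraction(1)]
    for _ in range(n):
        coeffs = poly_mul(coeffs, [q, p])
    return coeffs

p = Fraction(1, 3)
coeffs = binomial_pgf(10, p)
# The coefficient of t^k should equal the binomial probability Pr(X = k).
print(all(coeffs[k] == comb(10, k) * p**k * (1 - p)**(10 - k) for k in range(11)))
```

The coefficient of $t^k$ in the product is exactly the binomial probability from Table 1 of §7.3.1.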


Table 3 Moments and generating functions for discrete distributions.


distribution         mean              variance                             $\phi(t)$                                $\psi(t)$                                      $\chi(t)$
Bernoulli B(1, p)    $p$               $pq$                                 $q + pt$                                 $q + pe^{t}$                                   $q + pe^{it}$
binomial B(n, p)     $np$              $npq$                                $(q + pt)^n$                             $(q + pe^{t})^n$                               $(q + pe^{it})^n$
Poisson P(λ)         $\lambda$         $\lambda$                            $e^{\lambda(t - 1)}$                     $e^{\lambda(e^{t} - 1)}$                       $e^{\lambda(e^{it} - 1)}$
geometric G(p)       $\frac{1}{p}$     $\frac{q}{p^2}$                      $\frac{pt}{1 - qt}$                      $\frac{pe^{t}}{1 - qe^{t}}$                    $\frac{pe^{it}}{1 - qe^{it}}$
Pascal NB(r, p)      $\frac{r}{p}$     $\frac{rq}{p^2}$                     $\left(\frac{pt}{1 - qt}\right)^r$       $\left(\frac{pe^{t}}{1 - qe^{t}}\right)^r$     $\left(\frac{pe^{it}}{1 - qe^{it}}\right)^r$
hypergeometric       $\frac{mn}{N}$    $\frac{m(N-m)n(N-n)}{N^2(N-1)}$      *                                        *                                              *
(N, m, n)


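Table 4 Moments and generating functions for continuous distributions.


distribution              mean                          variance                                  $\psi(t)$                                             $\chi(t)$
uniform (α, β)            $\frac{\alpha + \beta}{2}$    $\frac{(\beta - \alpha)^2}{12}$           $\frac{e^{\beta t} - e^{\alpha t}}{t(\beta - \alpha)}$    $\frac{e^{i\beta t} - e^{i\alpha t}}{it(\beta - \alpha)}$
exponential (λ)           $\frac{1}{\lambda}$           $\frac{1}{\lambda^2}$                     $\frac{\lambda}{\lambda - t}$                         $\frac{\lambda}{\lambda - it}$
standard normal (0, 1)    $0$                           $1$                                       $e^{t^2/2}$                                           $e^{-t^2/2}$
normal (μ, σ²)            $\mu$                         $\sigma^2$                                $e^{\mu t + \sigma^2 t^2/2}$                          $e^{i\mu t - \sigma^2 t^2/2}$
gamma Γ(n, λ)             $\frac{n}{\lambda}$           $\frac{n}{\lambda^2}$                     $\left(\frac{\lambda}{\lambda - t}\right)^n$          $\left(\frac{\lambda}{\lambda - it}\right)^n$
Cauchy (α)                $\infty$                      $\infty$                                  $\infty$                                              $e^{-\alpha|t|}$
beta (p, q)               $\frac{p}{p + q}$             $\frac{pq}{(p + q)^2(p + q + 1)}$         *                                                     *
chi square χ²(r)          $r$                           $2r$                                      $(1 - 2t)^{-r/2}$                                     $(1 - 2it)^{-r/2}$
F-distribution F_{m,n}    $\frac{n}{n - 2}$             $\frac{2n^2(m + n - 2)}{m(n - 2)^2(n - 4)}$   *                                                 *
t-distribution t_k        $0$                           $\frac{k}{k - 2}$                         *                                                     *
Rayleigh R(σ)             $\sigma\sqrt{\pi/2}$          $2\left(1 - \frac{\pi}{4}\right)\sigma^2$     *                                                 *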

2. Table 3 shows the mean, variance, probability generating function, moment generating
function, and characteristic function of several important discrete distributions.
Here q = 1 - p. An asterisk (*) signifies that the entry is not available in simple form.

3. Table 4 shows the mean, variance, moment generating function, and characteristic
function of several important continuous distributions. An asterisk (*) indicates that
the entry is not available.

4. The moments of a binomial random variable X can be found from its moment
generating function $\psi(t) = (q + pe^t)^n$. For example, $\psi'(t) = n(q + pe^t)^{n-1}(pe^t)$ and by
Fact 1 $EX = \psi'(0) = n(q + p)^{n-1}p = np$.

5. From Table 4 the moment generating function for the exponential distribution with
parameter λ is $\psi(t) = \frac{\lambda}{\lambda - t}$. Then $\psi'(t) = \frac{\lambda}{(\lambda - t)^2}$ and $\psi''(t) = \frac{2\lambda}{(\lambda - t)^3}$, giving
$\psi'(0) = \frac{\lambda}{\lambda^2} = \frac{1}{\lambda}$ and $\psi''(0) = \frac{2\lambda}{\lambda^3} = \frac{2}{\lambda^2}$. By Facts 1 and 2, $EX = \psi'(0) = \frac{1}{\lambda}$ and
$\mathrm{Var}(X) = \psi''(0) - [\psi'(0)]^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}$.


7.4 DISCRETE PROBABILITY COMPUTATIONS

Many discrete probability computations are much less straightforward than may at first
be imagined because of difficulties arising from the finiteness of computer arithmetic
systems. Good algorithm design can usually avoid such problems.


7.4.1 INTEGER COMPUTATIONS 

Enumeration of combinatorial objects, such as permutations and combinations, requires 
the computation of integer factorials. In practice, these factorials can only be computed 
as integers for small values. Since in most cases the factorials are not themselves the 
primary objective of the computation, potential numerical difficulties can be overcome 
by carefully designed recursive algorithms. 

Definitions: 

The N-bit binary representation of the positive integer n is $(b_{N-1} b_{N-2} \ldots b_1 b_0)_2$
where $b_i \in \{0, 1\}$ and $n = \sum_{i=0}^{N-1} b_i 2^i$. (See §4.1.3.) Each $b_i$ is a binary digit (bit).

The two’s complement representation of the signed integer n is $(b_{N-1} b_{N-2} \ldots b_1 b_0)_{2'}$
where $n = -b_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} b_i 2^i$.

Integer wraparound is the phenomenon of adding 1 to the largest representable integer
and obtaining the smallest representable integer.

Facts: 

1. Signed integers are usually represented in a computer as two’s complement binary
words of a fixed wordlength; commonly 8, 16, or 32 bits are used.

2. A two’s complement integer using N-bit words is interpreted by treating the most
significant bit as a coefficient of $-2^{N-1}$.

3. The range of representable integers in N-bit two’s complement is from $-2^{N-1} =
(10 \ldots 00)_{2'}$ to $2^{N-1} - 1 = (01 \ldots 11)_{2'}$. Arithmetic operations can generate no carries
beyond this range.

4. Integer wraparound is a consequence of Fact 3 since (in regular binary arithmetic)
$(01 \ldots 11)_2 + (00 \ldots 01)_2 = (10 \ldots 00)_2$. Some systems have integer range checking
available to avoid the effect of wraparound.

5. Permutations and combinations (§2.3) are usually expressed in terms of integer
factorials.

6. Binomial coefficients (§2.3.2) can be computed using integer arithmetic provided
the result is within the range being used. Algorithm 1 breaks the computation of a
binomial coefficient into a recursive loop using
$\binom{n}{k} = \frac{n!}{k!(n-k)!} = \frac{n(n-1)\cdots(n-k+1)}{k(k-1)\cdots 1} = \left(\frac{n}{1}\right)\left(\frac{n-1}{2}\right)\cdots\left(\frac{n+1-k}{k}\right)$
and the result $\binom{n}{k} = \binom{n}{n-k}$. (See §2.3.2, Fact 7.)

7. By doing the multiplication before the integer division in Algorithm 1, the numerator
necessarily has the appropriate factors to ensure an exact integer result.

Algorithm 1: Integer computation of binomial coefficients.

input: positive integers n, k
output: b = $\binom{n}{k}$

if 2k > n then k := n - k
b := 1
for i := 1 to k do
    b := [b · (n + 1 - i)] div i
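
Algorithm 1 translates almost line for line into Python. The following is a sketch under stated assumptions, not part of the original text: it assumes 0 ≤ k ≤ n, and the function name `binomial` and the extra return value `peak` (the largest intermediate product, added here to show how far the computation strays from the final result) are illustrative.

```python
def binomial(n, k):
    """Compute the binomial coefficient C(n, k) as in Algorithm 1.

    Returns (b, peak), where peak is the largest intermediate product.
    Assumes 0 <= k <= n.
    """
    if 2 * k > n:
        k = n - k                   # use the symmetry C(n, k) = C(n, n - k)
    b = 1
    peak = 1
    for i in range(1, k + 1):
        product = b * (n + 1 - i)   # multiply first ...
        peak = max(peak, product)
        b = product // i            # ... then divide; the division is exact
    return b, peak

print(binomial(12, 8))
```

For n = 12 and k = 8 the largest intermediate value is 1980, well inside the 16-bit range, although 12! itself is not. Python integers do not wrap around, so the sketch only illustrates the order of operations.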


Examples: 

1. For N = 8, 16, and 32 bits, the two’s complement binary integer ranges are given
in the following table:


N     minimum value    maximum value
8     -128             127
16    -32768           32767
32    -2147483648      2147483647


2. For N = 8, the integer 86 has the two’s complement representation $(01010110)_{2'}$
and the integer -86 has the two’s complement representation $-128 + 42 = (10101010)_{2'}$.

3. For 8-bit integers, the effect of integer wraparound is shown by 127 + 1 = -128.
Similarly, we would have 64 × 2 = -128.

4. For 8-bit two’s complement integers, only 1!, 2!, 3!, 4!, 5! can be computed correctly.
Subsequent factorials would generate integer answers, but wrong ones. In particular, 6!
would evaluate to -48. Namely, the 8-bit two’s complement representation of 5! = 120
is 01111000 and 6 = 00000110 so that, with no carries to the left of the 8th bit, 6! = 6 × 5!
is represented by the sum of 11100000 and 11110000 (which are respectively 01111000
shifted 2 and 1 places left). This sum (again without carries to the left of the leading
bit) is 11010000, which represents (-128) + 64 + 16 = -48.
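
The wraparound behavior in Examples 2 through 4 can be imitated in a language with unbounded integers by reducing every result to the N-bit two’s complement range. The sketch below is illustrative and not part of the original text; the helper names `wrap` and `wrapped_factorials` are assumptions.

```python
def wrap(value, bits=8):
    """Reduce an integer to the `bits`-bit two's complement range."""
    half = 1 << (bits - 1)
    return (value + half) % (1 << bits) - half

def wrapped_factorials(limit, bits=8):
    """Factorials 1!, ..., limit! as a wraparound machine would compute them."""
    results, f = [], 1
    for i in range(1, limit + 1):
        f = wrap(f * i, bits)
        results.append(f)
    return results

print(wrap(127 + 1), wrap(64 * 2))          # both wrap to -128
print(wrapped_factorials(6))                # the last entry is wrong: -48
print(format(86, "08b"), format(-86 & 0xFF, "08b"))  # bit patterns of 86 and -86
```

With `bits=16` the same helper shows that 8! = 40320 already falls outside the 16-bit range.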

5. Using 16-bit integers with wraparound, the binomial coefficient $\binom{12}{8}$ cannot be
computed directly from its definition since neither 12! nor 8! can be computed correctly.
Thus Algorithm 1 finds instead $\binom{12}{4} = \binom{12}{8}$ since 2 × 8 > 12. This is computed as
$\left(\frac{12}{1}\right)\left(\frac{11}{2}\right)\left(\frac{10}{3}\right)\left(\frac{9}{4}\right)$, with each multiplication being performed before its associated
division: (12/1) is multiplied by 11, divided by 2, multiplied by 10, divided by 3, multiplied
by 9, and divided by 4. This produces the intermediate results 12, 132, 66, 660, 220,
1980, 495, so that the correct final result is obtained without any intermediate
computation exceeding the integer range.


7.4.2 FLOATING-POINT COMPUTATIONS 

To compute discrete probabilities (e.g., binomial probabilities), careful attention must 
be given to the underlying floating-point computation model and its properties. 

Definitions: 

Let F be the set of numbers representable in a particular floating-point system, with 12 
the largest positive number in F and u> the smallest positive number in F. 

The floating-point arithmetic operations in F are denoted by ®, 0 , 0,0 when it is 
necessary to distinguish them from their real counterparts +, — , x, /. 
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The number x is represented in the computer in binary floating-point form by the 
approximation x ~ ±/ x 2 E where the fraction or mantissa f is a binary fraction 
of fixed length and the exponent E is an integer within a fixed range. Usually the 
floating point representation is normalized so that / £ [1,2). 

Floating-point arithmetic is subject to roundoff error, the error introduced by abbre- 
viating the representation of a number or an arithmetic result to a finite wordlength. 

The usual measure of error for floating-point computation is relative error, which is 
given for an approximation $x^*$ to a quantity $x$ by $\frac{|x^* - x|}{|x|} \approx \frac{|x^* - x|}{|x^*|}$. 

A floating-point operation (or flop) is any arithmetic operation performed using 
floating-point arithmetic. 

Overflow results from a floating-point operation where the magnitude of the result is 
too large for the available range of the floating-point system being used. 

Underflow results from a floating-point operation where the magnitude of the result 
is too small for the available range of the floating-point system being used. 

The machine unit $\mu$ of a floating-point system is the smallest positive number that 
can be added to 1 and produce a result recognized in the machine as greater than 1: 
namely, $\mu = \min\{ x \in F \mid 1 \oplus x > 1 \}$. 
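
The machine unit and the relative error can be probed experimentally. The sketch below is a rough illustration, assuming Python floats are IEEE double precision with round-to-nearest; the function names are hypothetical. The search is restricted to powers of two, so it brackets the machine unit to within a factor of two rather than locating the exact minimum in the definition.

```python
import sys


def machine_unit_power_of_two():
    """Return the smallest power of two eps with 1.0 + eps > 1.0 in Python floats."""
    eps = 1.0
    while 1.0 + eps / 2.0 > 1.0:
        eps /= 2.0
    return eps


def relative_error(approx, exact):
    """Relative error |x* - x| / |x| of an approximation x* to a nonzero quantity x."""
    return abs(approx - exact) / abs(exact)


eps = machine_unit_power_of_two()
print(eps == sys.float_info.epsilon)      # compare with the value Python itself reports
print(1.0 + eps / 2.0 == 1.0)             # a positive number that vanishes when added to 1
print(sys.float_info.min < eps)           # the smallest normalized number is far smaller
print(relative_error(0.333, 1.0 / 3.0))   # about one part in a thousand
```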


Facts: 

1. Roundoff errors are propagated in subsequent computations. 

2. The two expressions given for relative error are often used interchangeably. 

3. Overflow and underflow result from the finite range of available exponents. The 
limits of these ranges and the details of the implementation vary with both the hardware 
and software being used. See [IE85] for the most common implementations. 

4. Usually an overflow condition terminates a program, while underflow results are 
normally replaced by 0. 

5. Because of the finite mantissa length (and independent of the rounding rule), most 
axioms of the real number system fail for floating-point arithmetic [St74]. Table 1 
summarizes similarities and differences between the real numbers $\mathcal{R}$ and the 
floating-point system F. The second column of the table describes the property, assuming 
$a, b, c \in \mathcal{R}$. If the property fails in F, a brief reason for the failure is also given. 

6. In Table 1, most of the properties that fail in F hold approximately — at least 
for arguments of the same sign. These failures are not critical to most computations, 
but they can be important for computations such as summing sets of numbers and 
evaluating binomial probabilities. 

7. The existence of the machine unit $\mu$ ensures that some of the order properties of $\mathcal{R}$ 
will not carry over to F. 

8. The machine unit $\mu$ is not the same as the smallest representable positive number $\omega$ 
in F. 

9. The relative error in subtraction is essentially unbounded due to cancellation. 

10. IEEE arithmetic is required to deliver the same result as if rounding were performed 
on the infinite precision computation assuming that the data are exact. 

11. A sum of terms of the same sign should generally be summed from smallest to 
largest. 

Table 1 Properties of $\mathcal{R}$ and F. 

property            description in $\mathcal{R}$                        valid in F? 
closure +           $a + b \in \mathcal{R}$                             NO: overflow 
closure ×           $a \times b \in \mathcal{R}$                        NO: overflow 
commutativity       $a + b = b + a$, $a \times b = b \times a$          YES 
associativity +     $(a + b) + c = a + (b + c)$                         NO: $a = 1$, $b = c = \mu/2$ 
special case        $(a + b) - a = b$                                   NO: $a = 1$, $b = \mu/2$ 
associativity ×     $(a \times b) \times c = a \times (b \times c)$     NO: roundoff, overflow, or underflow 
distributive law    $a \times (b + c) = (a \times b) + (a \times c)$    NO: roundoff, overflow, or underflow 
existence of zero   $(\exists\, 0)\ a + 0 = a$                          YES 
unique negative     $(\exists!\, {-a})\ a + (-a) = 0$                   NO: $[-(1 \oplus \mu) \otimes a] \oplus a = 0$ if $\mu \times a < \omega$ 
existence of one    $(\exists\, 1)\ a \times 1 = a$                     YES 
zero divisors       $a \times b = 0 \Rightarrow a = 0$ or $b = 0$       NO: $a \otimes b = 0 \Rightarrow a < \sqrt{\omega}$ or $b < \sqrt{\omega}$ 
total ordering      $a < b$ or $a = b$ or $a > b$                       YES 
order-preservation  $a > b \Rightarrow a + c > b + c$                   NO: roundoff 
special case        $x > 0 \Rightarrow 1 + x > 1$                       NO: $x < \mu$ 


Algorithm 2: Recursive computation of binomial probabilities. 

input: positive integers N, k; real number p 
output: s = B(N, p; k) 

q := 1 - p 
t := q^N 
s := t 
for i := 1 to k do 
    t := t * (N - i + 1) * p / (i * q) 
    s := s + t 
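
A direct transcription of Algorithm 2 into Python might look as follows. This is a sketch under stated assumptions: the function name is hypothetical, 0 < p < 1 is assumed, and Python floats are IEEE double precision, so the underflow of q^N sets in at much larger N than it would in single precision.

```python
def binomial_cdf_recursive(N, k, p):
    """Accumulate B(N, p; k) = sum_{i=0..k} C(N, i) p^i (1 - p)^(N - i) term by term."""
    q = 1.0 - p
    t = q ** N                             # term for i = 0; underflows to 0.0 for large N
    s = t
    for i in range(1, k + 1):
        t *= (N - i + 1) * p / (i * q)     # ratio of consecutive binomial terms
        s += t
    return s


print(binomial_cdf_recursive(4, 1, 0.5))           # 0.3125
print(binomial_cdf_recursive(2000, 200, 0.1))      # roughly one half in double precision
print(binomial_cdf_recursive(20000, 2000, 0.1))    # 0.0, because 0.9**20000 underflows
```

The last call shows the failure mode of the recursion: once the starting term is 0, every later term is 0 as well.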


12. Improved accuracy in computing a summation is possible by regarding the partial 
sums as members of a (reduced) list of summands and always adding the two smallest 
terms in the current list. However, the overhead would be prohibitive in most cases. 

13. Special care must be taken in computing a summation if its terms are computed 
recursively, since the smallest term can underflow. 

14. There is no completely reliable method for summing terms of mixed sign. 

15. For alternating series, special transformations such as Euler’s method can be used 
([BuTu92], Chapter 1). 

16. Algorithm 2 computes the cumulative sum of binomial probabilities (§7.3.1) using 
the definition $B(N, p; k) = \sum_{i=0}^{k} \binom{N}{i} p^i (1 - p)^{N-i}$. 

17. Algorithm 2 will only work for small values of N. 

18. If k is not too large, Algorithm 2 does in fact sum terms from smallest to largest. 

Algorithm 3: Logarithmic computation of binomial probability terms. 

input: positive integers N, r; real number p 
output: $b = \binom{N}{r} p^r (1 - p)^{N-r}$ 

q := 1 - p 
t := r * ln p + (N - r) * ln q 
for i := 1 to r do 
    t := t + ln(N + 1 - i) - ln i 
b := e^t 
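
Algorithm 3 translates almost line for line into Python. The sketch below uses a hypothetical function name and assumes 0 < p < 1 so that both logarithms exist. In Python, `math.exp` returns 0.0 for very negative arguments, which plays the role of the underflow safeguard.

```python
import math


def binomial_term_log(N, r, p):
    """Compute C(N, r) p^r (1 - p)^(N - r) by summing logarithms of the factors."""
    q = 1.0 - p
    t = r * math.log(p) + (N - r) * math.log(q)
    for i in range(1, r + 1):
        t += math.log(N + 1 - i) - math.log(i)   # log of the factor (N + 1 - i) / i
    return math.exp(t)                           # exponentiate once, at the end


print(binomial_term_log(2000, 200, 0.1))     # close to 0.03
print(binomial_term_log(20000, 2000, 0.1))   # still computable although 0.9**20000 underflows
```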


19. To compute B(N, p; k) for large values of k, use the fact that 
$B(N, p; k) = 1 - B(N, 1 - p; N - k - 1)$ and then apply Algorithm 2. 

20. If $q^N$ underflows to 0, then Algorithm 2 returns 0 for all values of k. 

21. Algorithm 3 gives an alternative way to calculate the individual binomial 
probability term $\binom{N}{r} p^r (1 - p)^{N-r}$. It computes the logarithm of each factor recursively and 
then exponentiates this at the end. 

22. Algorithm 3 must be safeguarded to ensure that $e^t$ underflows to 0 for large negative 
arguments t. 

23. Using logarithms is a frequently applied technique for computing products of many 
factors with widely varying magnitudes. It is one step along the way toward using the 
symmetric level-index scheme for number representation and arithmetic [ClOlTu89]. 
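
The effect of summation order and the failure of associativity can be observed directly. The following Python sketch is illustrative, assuming IEEE double precision floats with round-to-nearest; the data are hypothetical, chosen so that every tiny term is exactly half of 2^-52.

```python
import math


def sum_in_order(terms):
    """Add the terms one at a time, left to right, in floating-point arithmetic."""
    total = 0.0
    for x in terms:
        total += x
    return total


tiny = 2.0 ** -53                       # adding this to 1.0 rounds back to 1.0
terms = [1.0] + [tiny] * 1024

largest_first = sum_in_order(terms)             # every tiny term is lost
smallest_first = sum_in_order(sorted(terms))    # tiny terms accumulate before meeting 1.0
reference = math.fsum(terms)                    # correctly rounded sum, for comparison

left_grouping = (1.0 + tiny) + tiny             # (a + b) + c
right_grouping = 1.0 + (tiny + tiny)            # a + (b + c)

print(largest_first == 1.0)                     # True
print(smallest_first == reference)              # True
print(left_grouping == right_grouping)          # False
```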


Examples: 

1. Summations: If the first $2^{24}$ terms of the harmonic series are summed using IEEE 
single precision floating-point arithmetic, both forward and backward, then the sums 
differ by approximately 11%. Specifically, $(\cdots((1 \oplus \frac{1}{2}) \oplus \frac{1}{3}) \oplus \cdots \oplus 2^{-24}) \approx 15.40$, 
while summing the same terms from right to left yields 17.23. 

2. Binomial probabilities: The computation of binomial probabilities is thoroughly 
discussed in Section 2.6 of [St74] with reference to the specific case where N = 2000, 
k = 200, and p = 0.1. Using Algorithm 2 in this case gives the initial value t = 
0 and therefore the final result is s = 0. The true value of the final probability is 
approximately 0.5. 

3. If the final term in Example 2 is computed in IEEE single precision, the binomial 
coefficient itself would overflow. It is certainly greater than $10^{200}$. Also, both $(0.1)^{200}$ 
and $(0.9)^{1800}$ would underflow. However, the true value of this term is around 0.03. 


7.5 RANDOM WALKS 

Random walks are special stochastic processes whose applications include models for 
particle motion, crystallography, gambling, stock markets, biology, genetics, and as- 
tronomy. This section examines random walks whose trajectories are generated by the 
summation of independent and identically distributed discrete random variables. 


7.5.1 GENERAL CONCEPTS 


Definitions: 

A stochastic process is a collection of random variables, typically indexed by time 
(discrete or continuous). 

A d-dimensional random walk is a stochastic process on the integer lattice $Z^d$ 
whose trajectories are defined by an initial position $S_0 = a$ and the sequence of sums 
$S_n = a + X_1 + X_2 + \cdots + X_n$, $n \ge 1$, where the displacements $X_1, X_2, \ldots$ are independent 
and identically distributed random variables on $Z^d$. 

A random walk is simple if the values $X_i$ are restricted to the 2d points of $Z^d$ of 
unit Euclidean distance from the origin. (That is, the random walk proceeds at each 
time step to a point one unit from the current point along some coordinate axis.) A 
symmetric random walk is a simple walk in which the 2d values of $X_i$ have the same 
probability. 

Random walks that return to the initial position with probability 1 are recurrent ; 
otherwise they are transient. 

An absorbing boundary is a point or a set of points on the lattice that stops the 
motion of a random walk whose trajectory comes into contact with it. A reflecting 
boundary is a point or a set of points that redirects the motion of a random walk. Both 
are special cases of an elastic boundary, which stops or redirects the motion depending 
on some given probability. 

The gambler’s ruin problem is a simple one-dimensional random walk with absorbing 
boundaries at values 0 and b. It colorfully illustrates the fortunes of a gambler, who 
starts with a dollars and who at each play of a game has a fixed probability of winning 
one dollar. The game ends once the gambler has either amassed the amount $S_n = b$ or 
gone broke ($S_n = 0$). 

For $k \in Z^d$, the first passage time into point k is the first time at which the 
random walk reaches the point k: namely, $T_k = \min\{ i \ge 1 \mid S_i = k \}$. More generally, 
the hitting time $T_A$ for entering set $A \subseteq Z^d$ is the first time at which the random 
walk reaches some point in set A: namely, $T_A = \min\{ i \ge 1 \mid S_i \in A \}$. 

The following is the basic initial problem of random walks: 

• For $k \in Z^d$, find $\Pr(S_n = k)$, the probability that a “particle”, executing the 
random walk and starting at point a at time 0, will be at point k at time n. 

The following are first passage time problems: 

• Find the probability $\Pr(T_k = n)$ that, starting at point a at time 0, the first visit 
to point k occurs at time n. 

• Find the probability $\Pr(T_A = n)$ that, starting at point a at time 0, the first visit 
to A occurs at time n; characterize $S_{T_A}$, the point at which A is first visited. 

Other classical problems in random walks include: 

• Range problem: Find or approximate the probability distribution and/or the 
mean of the number of distinct points visited by a random walk up to time n. 

• Occupancy problem: Find or approximate the probability distribution and/or the 
mean of the number of times a given point or a set of points has been visited 
up to time n. 

• Boundary problem: Address all previous problems under absorbing, reflecting, 
and/or elastic boundary conditions. 
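
For small n, the basic initial problem can be answered by propagating probability mass one step at a time. The Python sketch below is restricted to the one-dimensional simple walk and uses exact rational arithmetic; the function name and the parameter `start` (standing for the initial position a) are hypothetical.

```python
from fractions import Fraction


def walk_distribution(n, p, start=0):
    """Return {k: Pr(S_n = k)} for a one-dimensional simple random walk started at `start`."""
    dist = {start: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for pos, mass in dist.items():
            nxt[pos + 1] = nxt.get(pos + 1, 0) + mass * p          # step up with probability p
            nxt[pos - 1] = nxt.get(pos - 1, 0) + mass * (1 - p)    # step down with probability 1 - p
        dist = nxt
    return dist


dist = walk_distribution(4, Fraction(1, 2))
print(dist[0])      # 3/8
```

The cost grows only linearly in n per step, since after n steps at most n + 1 positions carry positive probability.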





Examples: 

1. Coin tossing: Tossing a coin n times can be viewed as a one-dimensional random 
walk (d = 1) on the integers Z. This walk begins at the origin (a = 0) with $X_i = 1$ if 
the result of the ith toss is a Head and $X_i = -1$ if the result of the ith toss is a Tail. 
Since each step $X_i$ is of unit length, this is a simple one-dimensional random walk. If 
the tosses are independent events, then $\Pr(X_i = 1) = p$ and $\Pr(X_i = -1) = 1 - p$ holds 
for all i, where 0 < p < 1. The random variable $S_n$ is the cumulative number of Heads 
minus the cumulative number of Tails in n tosses. The walk is symmetric if $p = \frac{1}{2}$. A 
return to the origin means that $S_n = 0$: that is, the numbers of Heads and Tails have 
equalized after n tosses. 

2. Gambler’s ruin: A gambler repeatedly plays a game of chance, in which a dollar 
is won at each turn with probability p and a dollar is lost with probability 1 - p. For 
example, suppose the gambler starts with 90 dollars, and stops whenever his current 
fortune is 0 (a ruin) or 100 (a positive net gain of 10 dollars). What is the gambler’s 
ultimate probability of being ruined? Of success? On average, how many plays 
does it take for the game to be over? What is the expected net gain for the gambler? 
If p = 0.5 the answers are 0.1, 0.9, 900, and 0 respectively. If p = 0.45 they are 0.866, 
0.134, 765.6, and -76.6 respectively. (See §7.5.2, Example 4.) 
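
The gambler's ruin game is easy to simulate. The following Python sketch is a Monte Carlo illustration only: the function name, the number of trials, and the seed are hypothetical choices, and the estimates it returns fluctuate around the exact answers quoted above.

```python
import random


def simulate_gamblers_ruin(a, b, p, trials, seed=0):
    """Estimate the ruin probability and the mean number of plays by simulation."""
    rng = random.Random(seed)              # seeded so that repeated runs agree
    ruins = 0
    total_plays = 0
    for _ in range(trials):
        fortune = a
        while 0 < fortune < b:             # play until one absorbing boundary is hit
            fortune += 1 if rng.random() < p else -1
            total_plays += 1
        if fortune == 0:
            ruins += 1
    return ruins / trials, total_plays / trials


ruin_rate, mean_plays = simulate_gamblers_ruin(9, 10, 0.5, trials=4000, seed=1)
print(ruin_rate, mean_plays)
```

Calling the same function with a = 90 and b = 100 reproduces the setting of Example 2, at the cost of a longer run since each game lasts several hundred plays on average.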


7.5.2 ONE-DIMENSIONAL SIMPLE RANDOM WALKS 

A number of results are known for random walks in one dimension that take a succession 
of unit steps (in either the positive or negative direction). 

Definitions: 

The one-dimensional simple random walk (see §7.5.1) corresponds to a particle 
moving randomly on the set Z of integers. It begins at the origin at time 0 and at each 
time 1, 2, ... thereafter, moves either one step up (right) with probability p, or one step 
down (left) with probability 1 - p. This random walk is symmetric when $p = \frac{1}{2}$. 

The trajectory of a one-dimensional simple random walk is described by $S_0 = 0$ and 
$S_n = X_1 + X_2 + \cdots + X_n$, $n \ge 1$, where the $X_i$ are independent and have a Bernoulli 
distribution (§7.3.1), with $\Pr(X_i = 1) = p$ and $\Pr(X_i = -1) = q = 1 - p$ for $p \in (0, 1)$. 

Suppose a trajectory is graphically represented by plotting $S_n$ as a function of n, so 
that the point (n, k) corresponds to $S_n = k$. Linking successive points with straight 
lines produces a path between points. Define N(n, k) to be the number of paths from 
(0, 0) to (n, k). 

In a random walk starting at $S_0 = a > 0$ with absorbing boundaries at 0 and b > a: 

• $q_a$ is the probability that the random walk will be absorbed at 0; 

• $p_a$ is the probability that the random walk will be absorbed at b; 

• $D_a$ is the time until absorption. 
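
The path count N(n, k) defined above can be tabulated directly. The Python sketch below (hypothetical function names) counts paths by dynamic programming and, for small n, counts by brute force the paths that stay strictly above the axis after time 0; these are the quantities that Facts 2 and 4 below express in closed form.

```python
import math
from itertools import product


def count_paths(n, k):
    """N(n, k): number of paths of n unit steps from (0, 0) to (n, k)."""
    counts = {0: 1}                                   # heights reachable after 0 steps
    for _ in range(n):
        nxt = {}
        for height, ways in counts.items():
            for step in (1, -1):
                nxt[height + step] = nxt.get(height + step, 0) + ways
        counts = nxt
    return counts.get(k, 0)


def count_positive_paths(n, k):
    """Brute-force count of paths to (n, k) that never return to or cross the axis."""
    total = 0
    for steps in product((1, -1), repeat=n):
        height, ok = 0, True
        for s in steps:
            height += s
            if height <= 0:
                ok = False
                break
        if ok and height == k:
            total += 1
    return total


print(count_paths(4, 0), math.comb(4, 2))     # 6 6
print(count_positive_paths(5, 1))             # 2
```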

Facts: 

1. Reflection principle: Let $n_2 > n_1 \ge 0$, $k_1 > 0$, $k_2 > 0$. The number of paths 
from $(n_1, k_1)$ to $(n_2, k_2)$ that touch or cross the x-axis equals the number of paths from 
$(n_1, -k_1)$ to $(n_2, k_2)$. 

2. $N(n, k) = \binom{n}{(n+k)/2}$ if $\frac{n+k}{2}$ is an integer in {0, 1, ..., n}; N(n, k) = 0 otherwise. 

3. If $n \ge 1$ is fixed and $-n \le k \le n$, then $\Pr(S_n = k) = N(n, k)\, p^{(n+k)/2} q^{(n-k)/2}$. 

4. Ballot theorem: For k > 0, the number of paths from (0, 0) to (n, k) that do not 
return to or cross the x-axis is $\frac{k}{n} N(n, k)$. 

5. For $n \ge 1$, the first return time $T_0$ to the origin satisfies: 

• $\Pr(T_0 > n) = E\left(\frac{|S_n|}{n}\right)$; 

• $\Pr(T_0 = 2n) = \frac{1}{2n-1} \binom{2n}{n} (pq)^n$; 

• $\Pr(T_0 > 2n) = \Pr(S_{2n} = 0) = \binom{2n}{n} 2^{-2n}$, if the walk is symmetric. 

6. Recurrent walks: $\Pr(T_0 < \infty) = 1$ (the walk is recurrent) if and only if $p = q = \frac{1}{2}$. 
In this case $E(T_0) = \infty$. 

7. For $k \ne 0$ and n > 0, $\Pr(T_k = n) = \frac{|k|}{n} \Pr(S_n = k)$. 

8. For k > 0 and n > 0, the maximum value $M_n = \max\{S_0, S_1, \ldots, S_n\}$ satisfies: 

• $M_n \ge k$ if and only if $T_k \le n$; 

• $\Pr(M_n \ge k) = \Pr(S_n = k) + \sum_{i \ge k+1} \left[1 + (q/p)^{i-k}\right] \Pr(S_n = i)$; 

• $\Pr(M_n = k) = \Pr(S_n = k) + \Pr(S_n = k + 1)$, if the walk is symmetric. 


9. Arc sine laws: Let $W_n$ be the number of times among {0, 1, ..., n} at which 
a random walk is positive and let $L_n$ be the time of the last visit to 0 up to time n. For 
a symmetric random walk: 

• $\Pr(W_{2n} = 2k) = \Pr(L_{2n} = 2k) = \Pr(S_{2k} = 0) \Pr(S_{2n-2k} = 0)$; 

• as $n \to \infty$, $\Pr(W_n / n \le x) \to \frac{2}{\pi} \arcsin \sqrt{x}$, for $x \in [0, 1]$. 


10. Gambler’s ruin problem: In this random walk with absorbing boundaries (§7.5.1), 
$q_a$ is the probability of the gambler (having an initial capital of a) being ruined and $p_a$ 
is the probability of eventually winning (achieving a total of b). Facts 11-17 refer to 
the gambler’s ruin problem. 

11. If $p \ne q$, $q_a = \frac{(q/p)^b - (q/p)^a}{(q/p)^b - 1}$ and $p_a = 1 - q_a$. 

12. If $p = q = \frac{1}{2}$, $q_a = 1 - \frac{a}{b}$ and $p_a = \frac{a}{b}$. 

13. The expected gain in the gambler’s ruin problem is $b(1 - q_a) - a$, which is 0 if and 
only if $p = q = \frac{1}{2}$. 

14. $\Pr(D_a = n) = b^{-1} 2^n p^{(n-a)/2} q^{(n+a)/2} \sum_{k=1}^{b-1} \cos^{n-1}\frac{\pi k}{b} \sin\frac{\pi k}{b} \sin\frac{\pi a k}{b}$. 

15. If $p \ne q$, $E(D_a) = \frac{a}{q-p} - \frac{b}{q-p} \cdot \frac{1 - (q/p)^a}{1 - (q/p)^b}$. 

16. If $p = q = \frac{1}{2}$, $E(D_a) = a(b - a)$. 

17. Limiting case of the gambler’s ruin problem: When $b = \infty$: 

• $q_a = (q/p)^a$ if p > q, and $q_a = 1$ otherwise; 

• $\Pr(D_a = n) = 2^n p^{(n-a)/2} q^{(n+a)/2} \int_0^1 \cos^{n-1}(\pi x) \sin(\pi x) \sin(\pi a x)\, dx = \frac{a}{n} \binom{n}{(n+a)/2} p^{(n-a)/2} q^{(n+a)/2}$; 

• if p < q, $E(D_a) = \frac{a}{q-p}$; 

• if $p = q = \frac{1}{2}$, $E(D_a) = \infty$. 

18. Random walks with one reflecting boundary: Consider a random walk starting at 
$S_0 = a \ge 0$ with a reflecting boundary at 0. 

• The position at time $n \ge 1$ is given by $S_n = \max\{0, S_{n-1} + X_n\}$. 

• When p < q and as $n \to \infty$, there is a stationary distribution for the random 
walk, coinciding with the distribution of $M = \sup_{i \ge 0} S_i$, and given by 
$\Pr(M = k) = \left(1 - \frac{p}{q}\right)\left(\frac{p}{q}\right)^k$ for all $k \ge 0$. 

Examples: 

1. A graphical representation of the trajectory for a one-dimensional simple random 
walk is shown in the following figure. Here $T_0 = 2$ and $T_2 = 4$; $M_3 = 1$ and $M_4 = 2$; 
$W_7 = 4$; and $L_7 = 6$. 

[Figure: trajectory of a one-dimensional simple random walk, plotting $S_n$ against n.] 



2. The ballot theorem takes its name from the following problem. Suppose that, in 
a ballot, candidate A scores x votes and candidate B scores y votes, x > y. What is 
the probability that, during the ballot, A is always ahead of B? By Fact 4, the answer 
is $\frac{x-y}{x+y}$. As an illustration, if $\frac{x}{x+y} = 0.52$, this probability is 0.04. 

3. How much time does a symmetric random walk spend to the left of the origin? 
Contrary to intuition, with large probability, the fraction of time spent to the left (or 
to the right) of the origin is near 0 or 1, but not near $\frac{1}{2}$. 

For example, when n is large, Fact 9 shows that the probability a symmetric random 
walk spends at least 97.6% of the time to the left of the origin is approximately 
$\frac{2}{\pi} \arcsin \sqrt{0.024} \approx 0.1$. Symmetrically, there is a 0.1 probability that it spends at least 97.6% 
of the time to the right of the origin. Altogether, with probability 0.2 a symmetric 
random walk spends at least 97.6% of the time on one side of the origin. 


4. Gambler’s ruin: Suppose that the probability of winning one dollar is p = 0.45, so 
q = 0.55. A gambler begins with an initial stake of a = 90 and will quit whenever the 
current fortune reaches b = 100. Using Fact 11, the probability of ruin (i.e., losing the 
entire original stake) is 

$q_a = \frac{(11/9)^{100} - (11/9)^{90}}{(11/9)^{100} - 1} \approx 0.866$. 

The expected net gain is, by Fact 13, $b(1 - q_a) - a = 100(0.134) - 90 = -76.6$. The 
average duration (number of plays of the game) is found from Fact 15 to be 765.6 plays. 
Surprisingly, even though the probability p of winning is only slightly less than 0.5 and 
the gambler starts within close reach of the desired goal, the gambler can expect to be 
ruined with high probability and the average number of plays of the game is large. 


5. The average duration of a fair game in the gambler’s ruin problem is considerably 
longer than would be naively expected. When one player has one dollar and the 
adversary 1000 dollars (so a = 1 and b = 1001), Fact 16 shows that the average duration is 
a(b - a) = 1000 trials. 
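
The closed forms of Facts 11-13, 15, and 16 are short enough to evaluate directly. The Python sketch below uses hypothetical function names and plain floating-point arithmetic; it is meant for checking the numbers in Examples 4 and 5, not as a numerically hardened routine (for very large b the powers of q/p can overflow).

```python
def ruin_probability(a, b, p):
    """q_a: probability of absorption at 0, starting from a with boundaries 0 and b."""
    q = 1.0 - p
    if p == q:
        return 1.0 - a / b
    r = q / p
    return (r ** b - r ** a) / (r ** b - 1.0)


def expected_duration(a, b, p):
    """E(D_a): expected number of plays until absorption."""
    q = 1.0 - p
    if p == q:
        return float(a * (b - a))
    r = q / p
    return a / (q - p) - (b / (q - p)) * (1.0 - r ** a) / (1.0 - r ** b)


def expected_gain(a, b, p):
    """Expected net gain b(1 - q_a) - a."""
    return b * (1.0 - ruin_probability(a, b, p)) - a


print(round(ruin_probability(90, 100, 0.45), 3))    # 0.866
print(round(expected_duration(90, 100, 0.45), 1))   # 765.6
print(round(expected_gain(90, 100, 0.45), 1))       # -76.6
print(expected_duration(1, 1001, 0.5))              # 1000.0
```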


7.5.3 GENERALIZED RANDOM WALKS 

Two generalizations of one-dimensional random walks are covered here. In the first 
case, a one-dimensional walk is now allowed to be based on an arbitrary (as opposed to 
Bernoulli) distribution. In the second case, a symmetric random walk is considered in 
higher dimensions. 

Definitions: 

In a one-dimensional random walk on Z with $S_0 = a$, let $u_a$ be the probability that the 
particle arrives at a position $\le 0$ before reaching any position $\ge b$, where b > 0. 

Let $R_n$ be the number of distinct points visited by a random walk up to time n. 


Facts: 

1. If $X_1, X_2, \ldots$ are arbitrarily distributed independent random variables, many basic 
qualitative laws are preserved for one-dimensional walks. 

• In the case of two absorbing boundaries, the particle will reach one of them with 
probability 1. 

• In the case of a single absorbing boundary at 0, if $E(X_j) < 0$ then the particle 
will reach 0 with probability 1. 

• An unrestricted walk with $E(X_j) = 0$ and $\mathrm{Var}(X_j) < \infty$ will return to its initial 
position with probability 1, and the expected return time is infinite. 

2. General ruin problem: Assume that at each step the particle has probability $p_k$ to 
move from any point i to i + k, where $k \in Z$. The particle starts from position a. 

• $u_a = 1$ if $a \le 0$, and $u_a = 0$ if $a \ge b$. 

• For 0 < a < b, $u_a = \sum_{i \in Z} u_i\, p_{i-a}$. This corresponds to a system of b - 1 linear 
equations in b - 1 unknowns that has a unique solution. 

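3. Local central limit theorem: For a d-dimensional symmetric (simple) random walk: 

• $\left| \Pr(S_n = k) - 2\left(\frac{d}{2\pi n}\right)^{d/2} e^{-d|k|^2/(2n)} \right| \le O(n^{-(d+2)/2})$; 

• $\left| \Pr(S_n = k) - 2\left(\frac{d}{2\pi n}\right)^{d/2} e^{-d|k|^2/(2n)} \right| \le |k|^{-2}\, O(n^{-d/2})$. 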

4. Pólya’s theorem: For the symmetric random walks in 1 or 2 dimensions, there 
is probability 1 that the walk will eventually return to its initial position (recurrent 
random walk). In dimension $d \ge 3$ this probability is strictly less than 1. (George 
Pólya, 1887-1985.) 

5. For symmetric random walks in d = 3 dimensions, the probability of an eventual 
return to the initial position is approximately 0.34 and the expected number of returns 
is approximately 0.53. The following table gives the approximate return probabilities 
$\Pr(T_0 < \infty)$ for dimensions $3 \le d \le 10$. 

d     $\Pr(T_0 < \infty)$ 
3     0.341 
4     0.193 
5     0.135 
6     0.105 
7     0.0858 
8     0.0729 
9     0.0634 
10    0.0562 

6. Range problems: As $n \to \infty$: 

• if d = 1, $E(R_n) \approx \left(\frac{8n}{\pi}\right)^{1/2}$; 

• if d = 2, $E(R_n) \approx \frac{\pi n}{\log n}$; 

• if $d \ge 3$, $E(R_n) \approx c_d\, n$ for some constant $c_d$. 

Example: 

1. Absorbing boundaries: A particle starts at position $a \in \{0, 1, 2, 3, 4\}$ on a line and 
with equal probabilities moves one or two positions, either to the right or left. Upon 
reaching a position $x \le 0$ or $x \ge 4$, the particle is stopped. This is a form of the general 
ruin problem with $p_{-2} = p_{-1} = p_1 = p_2 = \frac{1}{4}$. We are interested in the probability $u_a$ of 
absorption at position $x \le 0$, given that the particle starts at position a. Using Fact 2, 
$u_0 = 1$, $u_4 = 0$ and $u_1, u_2, u_3$ satisfy the following equations: 

$u_1 = \frac{1}{2} + \frac{1}{4} u_2 + \frac{1}{4} u_3$ 

$u_2 = \frac{1}{4} + \frac{1}{4} u_1 + \frac{1}{4} u_3$ 

$u_3 = \frac{1}{4} u_1 + \frac{1}{4} u_2$. 

Solving this linear system produces the unique solution $u_1 = \frac{7}{10}$, $u_2 = \frac{1}{2}$, $u_3 = \frac{3}{10}$. 
Intuitively, these are reasonable values since starting at the middle position a = 2 it 
should be equally likely for the particle to be absorbed at either boundary; starting 
at position a = 1 there should be a greater chance of absorption at the left boundary 
(probability $\frac{7}{10}$) than at the right boundary (probability $1 - \frac{7}{10} = \frac{3}{10}$). 
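
The linear system of the general ruin problem can be assembled and solved mechanically. The Python sketch below is one possible arrangement, not a prescribed method: the function name is hypothetical, step probabilities are passed as a dictionary of `Fraction` values so that the arithmetic is exact, and the system is solved by Gauss-Jordan elimination.

```python
from fractions import Fraction


def general_ruin(b, step_probs):
    """Solve u_a = sum_k p_k u_{a+k} for 0 < a < b, with u = 1 at positions <= 0
    and u = 0 at positions >= b. Returns [u_1, ..., u_{b-1}]."""
    n = b - 1
    rows = []
    for a in range(1, b):
        row = [Fraction(0)] * (n + 1)          # coefficients of u_1..u_{b-1}, then right side
        row[a - 1] += 1
        for k, pk in step_probs.items():
            target = a + k
            if target <= 0:
                row[n] += pk                   # absorbed on the left: contributes pk * 1
            elif target < b:
                row[target - 1] -= pk          # interior point: move the term to the left side
            # target >= b contributes pk * 0 and is omitted
        rows.append(row)
    for col in range(n):                       # Gauss-Jordan elimination
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [x / pivot_value for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [rows[a][n] for a in range(n)]


quarter = Fraction(1, 4)
u = general_ruin(4, {-2: quarter, -1: quarter, 1: quarter, 2: quarter})
print(u)    # [Fraction(7, 10), Fraction(1, 2), Fraction(3, 10)]
```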


7.5.4 APPLICATIONS OF RANDOM WALKS 

Random walk methodology is central to a number of diverse problem settings. This 
section describes several important applications. Additional examples are found in 
[BaNi70], [Be93], and [We94]. 

Examples: 

1. Biological migration: The name “random walk” first appears in a query sent by 
Karl Pearson (1857-1936) to the journal Nature in 1905. Pearson’s problem refers to 
a walk in the plane, with successive steps of length $l_1, l_2, \ldots$ at angles $\Theta_1, \Theta_2, \ldots$ with 
respect to the x-axis, the $\Theta_i$ chosen randomly. The problem is to find, after some fixed 
time, the probability distribution of the distance from the initial position. The question 
was motivated by a theory of biological migration which Pearson developed at that 
time, but soon discarded. Nevertheless, Pearson’s random walk was born and it has 
since been applied to many biological models. 

2. Biology: Other, more recent, examples of random walk applications include DNA 
sequencing in genetics, bacterial migration in porous media, and molecular diffusion. 
In the latter example, diffusion of molecules occurs as a result of the thermal energy of 
the molecules. The motion of the molecules, perturbed through interactions with other 
molecules, is then modeled as a random walk. See [Be93] for further details. 

3. Physical sciences: There are many applications in the physical sciences, including 
the classical Scher-Montroll model of electrical transport in amorphous semiconductors 
(a continuous-time random walk model), models of diffusion on tenuously connected 
structures such as percolation clusters, inference of molecular structure from data col- 
lected in x-ray scattering experiments, configurational statistics of polymers, and reac- 
tion kinetics in confined geometries. Details can be found in [BaNi70] and [We94]. 

4. Sequential sampling: A major application of random walks in statistics is in 
connection with Wald’s theory of sequential sampling. In this context, the $X_i$ represent 
certain characteristics of samples or observations. Measurements are taken as long as 
the random walk remains within a given region. Termination with acceptance or 
rejection of the appropriate hypothesis occurs depending on which part of the boundary is 
reached. 

5. Stock prices: One of the early applications of computers in economics was to analyze 
economic time series and, in particular, the behavior of stock market prices over time. 
It first came as a surprise when Kendall found in 1953 that he could not identify any 
predictable patterns in stock prices. However, it soon became apparent that random 
price movements indicated a well-functioning or efficient market and that a random 
walk could be used as a model for the underlying market. See [Ma90]. In fact, at the 
beginning of this century, Bachelier had already developed a diffusion model for the 
stock market. Other macroeconomic time series have also been modeled using random 
walks. 

6. Astronomy: The problem of the mean motion of a planet in the presence of pertur- 
bations due to other planets has a very long history (Lagrange). Statistical properties 
of perturbed orbits can be analyzed with the help of Pearson’s random walk model. 
The escape of comets from the solar system has also been modeled as a random walk 
among energy states. Details are provided in [BaNi70]. 


7.6 SYSTEM RELIABILITY 

System reliability involves the study of the overall performance of systems of intercon- 
nected components. Examples of such systems are communication, transportation, and 
electrical power distribution systems, as well as computer networks. 


7.6.1 GENERAL CONCEPTS 
Definitions: 

Suppose a given system is composed of a set N = {1, 2, ... , n} of failure-prone compo- 
nents. At any instant of time, each component is found in one of two states: either 
operating or failed. 

The reliability of component i is the probability $p_i$ that component i is operating at 
a given instant of time. The unreliability (or failure probability) of component i is 
$q_i = 1 - p_i$. 

At any instant of time, the system is found in one of two states: operating or failed. 

The structure function $\phi$ is a binary-valued function defined on all subsets $S \subseteq N$. 
Specifically, $\phi(S) = 1$ if the system operates when all components in S operate and all 
components of N - S fail; otherwise $\phi(S) = 0$. 

The structure function $\phi$ is monotone if $S \subseteq T \Rightarrow \phi(S) \le \phi(T)$. In words, 
monotonicity means that the addition of more operating components to an already functioning 
system cannot result in system failure. 

The structure function $\phi$ is nontrivial if $\phi(\emptyset) = 0$ and $\phi(N) = 1$. 

A coherent system $(N, \phi)$ has a structure function $\phi$ that is monotone and nontrivial. 
The dual of the system $(N, \phi)$ is the system $(N, \phi^D)$, defined by $\phi^D(S) = 1 - \phi(N - S)$. 

The reliability $R_{N,\phi}$ of the system $(N, \phi)$ is the probability that the system functions 
at a random instant of time: $R_{N,\phi} = \Pr(\phi(S) = 1)$. 

The unreliability $U_{N,\phi}$ of the system is given by $U_{N,\phi} = \Pr(\phi(S) = 0) = 1 - R_{N,\phi}$. 

Facts: 

1. If the state of any component is statistically independent of the state of any other 
component, then the probability that $S \subseteq N$ is precisely the set of operating components 
is given by $\mathrm{prob}(S) = \prod_{i \in S} p_i \prod_{j \notin S} q_j$. 

2. The dual of the dual of a system $(N, \phi)$ is the original system $(N, \phi)$. 
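
For a small system, the reliability follows from Fact 1 by summing prob(S) over all operating sets S with structure function value 1. The Python sketch below is a brute-force illustration whose cost doubles with every added component; the function names and the 2-out-of-3 majority data are hypothetical, and component independence is assumed throughout.

```python
from itertools import combinations


def subsets(components):
    """Yield every subset of the component set as a frozenset."""
    items = sorted(components)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def reliability(components, phi, p):
    """Sum prob(S) over the operating sets S with phi(S) = 1 (independent components)."""
    total = 0.0
    for S in subsets(components):
        if phi(S):
            prob = 1.0
            for i in components:
                prob *= p[i] if i in S else 1.0 - p[i]
            total += prob
    return total


def dual(components, phi):
    """Structure function of the dual system: phi_D(S) = 1 - phi(N - S)."""
    full = frozenset(components)
    return lambda S: 1 - phi(full - frozenset(S))


# Hypothetical data: three identical components, system works if at least two work.
N = {1, 2, 3}
majority = lambda S: int(len(S) >= 2)
print(reliability(N, majority, {i: 0.95 for i in N}))   # approximately 0.99275
```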

Examples: 

1. Consider a system built from the set of components N = {1, 2, 3, 4}. Associate with 
each component a known weight: $w_1 = 5$, $w_2 = 7$, $w_3 = 4$, $w_4 = 8$. If $S \subseteq N$ is the 
set of operating components, then the system is considered to operate if $\sum_{i \in S} w_i > 12$. 
Thus $\phi(S) = 1$ for precisely the following sets S: 

{1,4}, {2,4}, {1,2,3}, {1,2,4}, {1,3,4}, {2,3,4}, {1,2,3,4}. 

In all other cases $\phi(S) = 0$. This structure function is nontrivial and monotone, so that 
$(N, \phi)$ is a coherent system. The reliability of the system is then 

$R_{N,\phi} = p_1 q_2 q_3 p_4 + q_1 p_2 q_3 p_4 + p_1 p_2 p_3 q_4 + p_1 p_2 q_3 p_4 + p_1 q_2 p_3 p_4 + q_1 p_2 p_3 p_4 + p_1 p_2 p_3 p_4$. 

The dual system $(N, \phi^D)$ has the structure function $\phi^D$, where $\phi^D(T) = 0$ for precisely 
the following component sets T: 

$\emptyset$, {1}, {2}, {3}, {4}, {1,3}, {2,3}. 

In all other cases $\phi^D(T) = 1$. For example, $\phi^D(\{1, 3\}) = 1 - \phi(\{2, 4\}) = 1 - 1 = 0$. 

2. For critical financial transactions, calculations are carried out simultaneously by 
three separate microprocessors. The three results are compared and the result is 
accepted if any two of the processors agree (or all three agree). Here the system has 
components {1,2,3} corresponding to the three microprocessors. A component fails 
if it gives the wrong answer, and the system fails if this “majority rule” produces an 
incorrect (or inconclusive) answer. Thus, $\phi(S) = 1$ if and only if S is {1,2}, {1,3}, 
{2,3}, or {1,2,3}. 

This structure function is nontrivial and monotone, so the system is coherent. If 
the microprocessors are identical and each operates independently with probability p, 
then Pr(exactly two components work) = $3p^2(1 - p)$, Pr(all components work) = $p^3$, 
and 

$R_{N,\phi} = 3p^2(1 - p) + p^3 = 3p^2 - 2p^3$. 

For example, if p = 0.95 then $R_{N,\phi} = 0.99275$. In this case, even though any single 
microprocessor has a 5% failure rate, the system as a whole has only a 0.7% failure rate. 

3. Telephone network : The components of this system are individual communication 
links (or trunk lines) joining nearby locations. Any telephone call that is placed between 
two distant locations in this system needs to be routed along available communication 
links. 

However, as a result of hardware or software malfunctions, or as a result of over- 
loaded circuits, certain links may be unavailable at a given instant of time. Thus, a 
telephone network can be modeled as a system whose components are subject to failure 
at random times. 

The reliability of the entire system is the probability that the system functions at a 
random instant of time; that is, that at any random point in time users can successfully 
complete their calls. 


7.6.2 COHERENT SYSTEMS 

It is assumed throughout this subsection that the system $(N, \phi)$ is coherent. 

Definitions: 

A minpath P is a minimal set of components such that $\phi(P) = 1$: i.e., $\phi(P) = 1$ and 
$\phi(S) = 0$ for all proper subsets $S \subset P$. The collection of all minpaths for $(N, \phi)$ is 
denoted $\mathcal{P}$. 

A mincut C is a minimal set of components such that $\phi(N - C) = 0$: i.e., $\phi(N - C) = 0$ 
and $\phi(N - S) = 1$ for all proper subsets $S \subset C$. The collection of all mincuts for $(N, \phi)$ 
is denoted $\mathcal{C}$. 

Let G = (V, E) be an undirected graph with vertex set V and edge set E (§8.1.1). 

• A simple path in G is a path that contains no repeated vertices. 

• An s-t cutset of G is a minimal set of edges, the removal of which leaves no s-t 
path in G. 

• A cutset of G is a minimal set of edges, the removal of which disconnects G. 

If $K \subseteq V$, a K-tree is a minimal set F of edges in G such that every two vertices of K 
are joined by a path in F. 

If $K \subseteq V$, a K-cutset is a minimal set of edges in G, the removal of which disconnects 
some pair of vertices in K. 


Facts: 

1. The dual of a coherent system is itself coherent. 

2. The structure function of a coherent system $(N, \phi)$ can be completely described 
using its minpaths $\mathcal{P}$ or using its mincuts $\mathcal{C}$. Specifically: 

• $\phi(S) = 1$ if and only if S contains some minpath P; 

• $\phi(S) = 0$ if and only if N - S contains some mincut C. 

3. The minpaths of $(N, \phi)$ are the mincuts of the dual $(N, \phi^D)$, and conversely. 

4. The mincuts of $(N, \phi)$ are the minpaths of the dual $(N, \phi^D)$, and conversely. 

5. Every minpath of $(N, \phi)$ and every mincut of $(N, \phi)$ have nonempty intersection. 

6. If P is a minimal set of components that has nonempty intersection with every 
mincut of $(N, \phi)$, then P is a minpath of $(N, \phi)$. 

7. If C is a minimal set of components that has nonempty intersection with every 
minpath of $(N, \phi)$, then C is a mincut of $(N, \phi)$. 

8. A K-tree has the topology of a tree (§9.1.1) whose leaf vertices are in K. 

Examples: 

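1. Series system: This system $(N, \phi)$ operates only when all components of $N$ operate. 
See the following figure. General characteristics are listed in Table 1. 

Table 1 Characteristics of series and parallel systems. 

|                    | series | parallel |
|--------------------|--------|----------|
| structure function | $\phi(S) = 0$ if $S \subset N$; $\phi(N) = 1$ | $\phi(S) = 1$ if $S \neq \emptyset$; $\phi(\emptyset) = 0$ |
| minpaths           | $P_1 = N$ | $P_1 = \{1\}, P_2 = \{2\}, \ldots, P_n = \{n\}$ |
| mincuts            | $C_1 = \{1\}, C_2 = \{2\}, \ldots, C_n = \{n\}$ | $C_1 = N$ |
| reliability        | $p_1 p_2 \cdots p_n$ | $1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n)$ |
| unreliability      | $1 - p_1 p_2 \cdots p_n$ | $(1 - p_1)(1 - p_2) \cdots (1 - p_n)$ |
| dual               | parallel, n components | series, n components |


Table 2 Characteristics of k-out-of-n systems. 

|                    | k-out-of-n success | k-out-of-n failure |
|--------------------|--------------------|--------------------|
| structure function | $\phi(S) = 1$ if $\lvert S \rvert \ge k$; $\phi(S) = 0$ if $\lvert S \rvert < k$ | $\phi(S) = 0$ if $\lvert S \rvert \le n - k$; $\phi(S) = 1$ if $\lvert S \rvert > n - k$ |
| minpaths           | $S \subseteq N$ with $\lvert S \rvert = k$ | $S \subseteq N$ with $\lvert S \rvert = n - k + 1$ |
| mincuts            | $S \subseteq N$ with $\lvert S \rvert = n - k + 1$ | $S \subseteq N$ with $\lvert S \rvert = k$ |
| reliability        | $\sum \{ \mathrm{prob}(S) \mid \lvert S \rvert \ge k \}$ | $\sum \{ \mathrm{prob}(N - S) \mid \lvert S \rvert < k \}$ |
| unreliability      | $\sum \{ \mathrm{prob}(S) \mid \lvert S \rvert < k \}$ | $\sum \{ \mathrm{prob}(N - S) \mid \lvert S \rvert \ge k \}$ |
| dual               | k-out-of-n failure | k-out-of-n success |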


2. Parallel system: This system $(N, \phi)$ fails only when all components of $N$ fail. See 
the following figure. General characteristics are listed in Table 1. 



3. k-out-of-n success system: This system $(N, \phi)$ operates only when at least k out of 
the n components operate. The following figure illustrates a 2-out-of-3 success system. 
General characteristics are listed in Table 2. The special case $k = 1$ gives a parallel 
system; $k = n$ gives a series system. 



4. k-out-of-n failure system: This system $(N, \phi)$ fails only when at least k out of the n 
components fail. This is the same as an $(n - k + 1)$-out-of-n success system. General 
characteristics are listed in Table 2. The special case $k = 1$ gives a series system; $k = n$ 
gives a parallel system. 

5. Two-terminal network: Two vertices $s, t$ of an undirected graph $G = (V, E)$ are 
specified and a message is to be sent from vertex $s$ to vertex $t$. Assume that only the 
edges are failure-prone, so $N = E$. The system operates when there exists some path 
of operating edges joining $s$ to $t$ in the graph. 

• structure function: $\phi(S) = 1$ if there exists a path from $s$ to $t$ in the subgraph 
  defined by edges $S$; $\phi(S) = 0$ otherwise; 

• minpaths: all simple s-t paths of $G$; 

• mincuts: all s-t cutsets of $G$; 

• reliability: $R_{N,\phi}$ = the probability that a message sent from $s$ will arrive at $t$ = 
  $\sum \{ \mathrm{prob}(S) \mid S \text{ contains some simple s-t path} \}$; 

• unreliability: $U_{N,\phi} = \sum \{ \mathrm{prob}(N - S) \mid S \text{ contains some s-t cutset} \}$. 

6. All-terminal network: A message is to be disseminated among all vertices $V$ in the 
undirected graph $G = (V, E)$. The system operates when the operating edges in the 
graph allow all vertices to mutually communicate. 

• structure function: $\phi(S) = 1$ if the subgraph defined by vertices $V$ and edges $S$ 
  is connected; $\phi(S) = 0$ otherwise; 

• minpaths: all spanning trees (§9.2) of $G$; 

• mincuts: all cutsets of $G$; 

• reliability: $R_{N,\phi}$ = probability that $G$ is connected = 
  $\sum \{ \mathrm{prob}(S) \mid S \text{ contains some spanning tree of } G \}$; 

• unreliability: $U_{N,\phi} = \sum \{ \mathrm{prob}(N - S) \mid S \text{ contains some cutset of } G \}$. 

7. K-terminal network: A message is to be disseminated among a fixed subset $K$ of 
vertices in the undirected graph $G = (V, E)$. The system operates when the operating 
edges of the graph allow all vertices in $K$ to mutually communicate. 

• structure function: $\phi(S) = 1$ if every two vertices of $K$ are joined by a path in the 
  subgraph defined by edges $S$; $\phi(S) = 0$ otherwise; 

• minpaths: all K-trees of $G$; 

• mincuts: all K-cutsets of $G$; 

• reliability: $R_{N,\phi}$ = probability that the vertices of $K$ are connected = 
  $\sum \{ \mathrm{prob}(S) \mid S \text{ contains some K-tree of } G \}$; 

• unreliability: $U_{N,\phi} = \sum \{ \mathrm{prob}(N - S) \mid S \text{ contains some K-cutset of } G \}$; 

• special cases: $K = \{s, t\}$ gives the two-terminal network problem; $K = V$ gives 
  the all-terminal network problem. 

8. Examples 5-7 are defined in terms of undirected networks. The two-terminal, 
all-terminal, and K-terminal reliability problems described in these examples can also be 
defined for directed networks. 

9. Consider the coherent system $(N, \phi)$ on components $N = \{1, 2, 3, 4, 5\}$ with 
minpaths $P_1 = \{1,2\}$, $P_2 = \{1,5\}$, $P_3 = \{3,5\}$, $P_4 = \{2,3,4\}$. To illustrate Fact 2, notice 
that $\phi(\{3,5\}) = 1$ and $\phi(\{2,3,5\}) = 1$ since $P_3$ is a minpath. Also, $\phi(\{2,5\}) = 0$ since 
$\{2,5\}$ contains no minpath. By Fact 7, $C_1 = \{1,3\}$ is a mincut for this system since 
$C_1$ has nonempty intersection with each of $P_1, P_2, \ldots, P_4$ and since neither $\{1\}$ nor $\{3\}$ 
has this property. Likewise, $C_2 = \{2,5\}$ and $C_3 = \{1,4,5\}$ are mincuts for this system. 
Fact 4 shows that the dual $(N, \phi^D)$ has as its minpaths the mincuts of $(N, \phi)$: namely, 
$\{1,3\}$, $\{2,5\}$, and $\{1,4,5\}$. This means $\phi^D(\{1,3,4\}) = 1$ since $\{1,3,4\}$ contains the 
minpath $\{1,3\}$. Alternatively, from the definition $\phi^D(\{1,3,4\}) = 1 - \phi(\{2,5\}) = 1 - 0 = 1$. 
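
The relationships used in Example 9 can be checked mechanically. The following Python sketch is an illustration only: the function names are hypothetical, and the mincuts are found by brute-force search over subsets (workable only for small systems), using the characterization in Fact 7.

```python
from itertools import combinations

def structure_function(minpaths):
    """Return phi, where phi(S) = 1 iff S contains some minpath (Fact 2)."""
    paths = [frozenset(p) for p in minpaths]
    return lambda S: int(any(p <= frozenset(S) for p in paths))

def mincuts_from_minpaths(components, minpaths):
    """Minimal sets meeting every minpath (Fact 7), by brute force.

    Candidates are visited in order of increasing size, so a candidate that
    contains an already-found cut is not minimal and is skipped.
    """
    paths = [frozenset(p) for p in minpaths]
    cuts = []
    for size in range(1, len(components) + 1):
        for cand in combinations(sorted(components), size):
            c = frozenset(cand)
            if all(c & p for p in paths) and not any(k <= c for k in cuts):
                cuts.append(c)
    return cuts

def dual(phi, components):
    """Dual structure function: phi_D(S) = 1 - phi(N - S)."""
    N = frozenset(components)
    return lambda S: 1 - phi(N - frozenset(S))

# Data of Example 9.
N = {1, 2, 3, 4, 5}
minpaths = [{1, 2}, {1, 5}, {3, 5}, {2, 3, 4}]
phi = structure_function(minpaths)
mincuts = mincuts_from_minpaths(N, minpaths)
phi_dual = dual(phi, N)
```

For these data the search returns the three mincuts named in the example, and `phi_dual({1, 3, 4})` evaluates to 1.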

7.6.3 CALCULATING SYSTEM RELIABILITY 


Four general approaches can be used to calculate the reliability $R_{N,\phi}$ of a coherent 
system $(N, \phi)$. These are state-space enumeration, inclusion-exclusion, disjoint products, 
and factoring. 

Notation: 

Let $\mathcal{P} = \{P_1, P_2, \ldots, P_m\}$ be the minpaths and $\mathcal{C} = \{C_1, C_2, \ldots, C_r\}$ be the mincuts of 
the coherent system $(N, \phi)$. 

• $E_i$ is the event that all components of minpath $P_i$ operate (with no stipulation 
  as to the states of the other components); 

• $F_i$ is the event that all components of mincut $C_i$ fail (with no stipulation as to the 
  states of the other components); 

• $E_i E_j$ denotes the event that both $E_i$ and $E_j$ occur; 

• $(N, \phi_{+i})$ is the system derived from $(N, \phi)$ in which component $i$ always works; 

• $(N, \phi_{-i})$ is the system derived from $(N, \phi)$ in which component $i$ always fails. 

Facts: 

1. Calculation of the reliability $R_{N,\phi}$ is in general quite difficult; it is a #P-complete 
problem. 

2. $R_{N,\phi} = \sum \{ \mathrm{prob}(S) \mid \phi(S) = 1 \}$. 

3. $R_{N,\phi} = \sum \{ \mathrm{prob}(S) \mid S \text{ contains some } P \in \mathcal{P} \}$. 

4. $R_{N,\phi} = 1 - U_{N,\phi} = 1 - \sum \{ \mathrm{prob}(N - S) \mid S \text{ contains some } C \in \mathcal{C} \}$. 

5. State-space enumeration: System reliability can be found by enumerating all 
operating (or all failed) states of the system, using Facts 2-4. 

6. $R_{N,\phi} = \Pr(E_1 \cup E_2 \cup \cdots \cup E_m)$. 

7. $U_{N,\phi} = \Pr(F_1 \cup F_2 \cup \cdots \cup F_r)$. 

8. Applying the inclusion-exclusion principle (§2.4) to Fact 6 produces 

$$R_{N,\phi} = \sum_i \Pr(E_i) - \sum_{i<j} \Pr(E_i E_j) + \cdots + (-1)^{m+1} \Pr(E_1 E_2 \cdots E_m).$$ 

9. Applying the inclusion-exclusion principle (§2.4) to Fact 7 produces 

$$U_{N,\phi} = \sum_i \Pr(F_i) - \sum_{i<j} \Pr(F_i F_j) + \cdots + (-1)^{r+1} \Pr(F_1 F_2 \cdots F_r).$$ 

10. Inclusion-exclusion: This approach calculates system reliability using Facts 8-9. 

11. $R_{N,\phi} = \Pr(E_1) + \Pr(\bar{E}_1 E_2) + \cdots + \Pr(\bar{E}_1 \bar{E}_2 \cdots \bar{E}_{m-1} E_m)$, where $\bar{E}_i$ denotes 
the complement of the event $E_i$. 

12. $U_{N,\phi} = \Pr(F_1) + \Pr(\bar{F}_1 F_2) + \cdots + \Pr(\bar{F}_1 \bar{F}_2 \cdots \bar{F}_{r-1} F_r)$, where $\bar{F}_i$ denotes 
the complement of the event $F_i$. 

13. Disjoint products: This approach calculates system reliability using the law of 
total probabilities (§7.2.1). (See Facts 11-12.) 

14. $R_{N,\phi} = p_i R_{N,\phi_{+i}} + (1 - p_i) R_{N,\phi_{-i}}$. 

15. Factoring: Rather than requiring an enumeration of the minpaths or mincuts of 
the system, this method (based on Fact 14) concentrates on the state of an individual 
component $i$: it is either operating (with probability $p_i$) or failed (with probability 
$q_i = 1 - p_i$). 

16. The factoring method is applied most productively when the system $(N, \phi)$ has 
additional structure. For example, this approach can be used to determine the reliability 
of k-out-of-n systems and two-terminal networks. (See §7.6.4.) 

Examples: 

1. State-space enumeration: Consider the coherent system $(N, \phi)$ with $N = \{1, 2, 3, 4\}$ 
and minpaths $P_1 = \{2,3\}$, $P_2 = \{1,2,4\}$, $P_3 = \{1,3,4\}$. By Fact 2 of §7.6.2, the 
operating states of the system are $\{2,3\}$, $\{1,2,3\}$, $\{1,2,4\}$, $\{1,3,4\}$, $\{2,3,4\}$, and $\{1,2,3,4\}$. 
By Fact 3, 

$$R_{N,\phi} = q_1 p_2 p_3 q_4 + p_1 p_2 p_3 q_4 + p_1 p_2 q_3 p_4 + p_1 q_2 p_3 p_4 + q_1 p_2 p_3 p_4 + p_1 p_2 p_3 p_4.$$ 

2. Inclusion-exclusion: Consider the coherent system on $N = \{1, 2, 3, 4\}$ with minpaths 
$P_1 = \{1,2\}$, $P_2 = \{2,4\}$, $P_3 = \{1,3,4\}$. Event $E_1$ has probability $p_1 p_2$, event $E_1 E_2$ has 
probability $p_1 p_2 p_4$, etc. Fact 8 gives 

$$R_{N,\phi} = p_1 p_2 + p_2 p_4 + p_1 p_3 p_4 - p_1 p_2 p_4 - p_1 p_2 p_3 p_4 - p_1 p_2 p_3 p_4 + p_1 p_2 p_3 p_4$$ 
$$= p_1 p_2 + p_2 p_4 + p_1 p_3 p_4 - p_1 p_2 p_4 - p_1 p_2 p_3 p_4.$$ 

3. Disjoint products: Fact 11 is applied to the coherent system on $N = \{1, 2, 3, 4, 5, 6\}$ 
with minpaths $P_1 = \{1,5\}$, $P_2 = \{1,3,6\}$, $P_3 = \{2,4,5\}$, $P_4 = \{2,6\}$. For simplicity 
of notation, let the event {e operates} be denoted by $e$, and let the event {e fails} be 
denoted by $\bar{e}$. Identities of set theory (Table 1, §1.2.2) can then be used to obtain 

$$\Pr(E_1) = p_1 p_5;$$ 

$$\Pr(\bar{E}_1 E_2) = \Pr((\bar{1} \cup \bar{5})\,1\,3\,6) = \Pr(\bar{5}\,1\,3\,6) = p_1 p_3 q_5 p_6;$$ 

$$\Pr(\bar{E}_1 \bar{E}_2 E_3) = \Pr((\bar{1} \cup \bar{5})(\bar{1} \cup \bar{3} \cup \bar{6})\,2\,4\,5) = \Pr(\bar{1}(\bar{1} \cup \bar{3} \cup \bar{6})\,2\,4\,5)$$ 
$$= \Pr(\bar{1}\,2\,4\,5) = q_1 p_2 p_4 p_5;$$ 

$$\Pr(\bar{E}_1 \bar{E}_2 \bar{E}_3 E_4) = \Pr((\bar{1} \cup \bar{5})(\bar{1} \cup \bar{3} \cup \bar{6})(\bar{2} \cup \bar{4} \cup \bar{5})\,2\,6)$$ 
$$= \Pr((\bar{1} \cup \bar{5})(\bar{1} \cup \bar{3})(\bar{4} \cup \bar{5})\,2\,6)$$ 
$$= \Pr((\bar{1} \cup \bar{3}\,\bar{5})(\bar{4} \cup \bar{5})\,2\,6)$$ 
$$= \Pr((\bar{1}\,\bar{4} \cup \bar{1}\,\bar{5} \cup \bar{3}\,\bar{5})\,2\,6).$$ 

Since the events $\bar{1}\,\bar{4}$, $\bar{1}\,\bar{5}$, and $\bar{3}\,\bar{5}$ above are not disjoint, Fact 11 can be reapplied to this 
new union of events, yielding 

$$\Pr(\bar{E}_1 \bar{E}_2 \bar{E}_3 E_4) = \Pr((\bar{1}\,\bar{4} \cup 4\,\bar{1}\,\bar{5} \cup 1\,\bar{3}\,\bar{5})\,2\,6)$$ 
$$= \Pr(\bar{1}\,\bar{4}\,2\,6 \cup 4\,\bar{1}\,\bar{5}\,2\,6 \cup 1\,\bar{3}\,\bar{5}\,2\,6)$$ 
$$= q_1 p_2 q_4 p_6 + q_1 p_2 p_4 q_5 p_6 + p_1 p_2 q_3 q_5 p_6.$$ 

The final expression for system reliability is then 

$$R_{N,\phi} = p_1 p_5 + p_1 p_3 q_5 p_6 + q_1 p_2 p_4 p_5 + q_1 p_2 q_4 p_6 + q_1 p_2 p_4 q_5 p_6 + p_1 p_2 q_3 q_5 p_6.$$ 
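
The general approaches of this subsection can be compared numerically on a small system. The Python sketch below is illustrative only: the function names are hypothetical, components are assumed to fail independently with operating probabilities supplied in a dictionary `p`, and exact rational arithmetic is used so that the three methods can be compared for equality. It applies state-space enumeration (Facts 2-3), inclusion-exclusion (Fact 8), and factoring (Fact 14) to the system of Example 2.

```python
from fractions import Fraction
from itertools import combinations

def prob_state(S, p):
    """Probability that exactly the components in S operate."""
    result = Fraction(1)
    for i, pi in p.items():
        result *= pi if i in S else 1 - pi
    return result

def reliability_enumeration(minpaths, p):
    """Facts 2-3: sum prob(S) over all operating states S."""
    comps = sorted(p)
    total = Fraction(0)
    for size in range(len(comps) + 1):
        for S in combinations(comps, size):
            if any(path <= set(S) for path in minpaths):
                total += prob_state(set(S), p)
    return total

def reliability_inclusion_exclusion(minpaths, p):
    """Fact 8: alternating sum over nonempty collections of minpaths."""
    total = Fraction(0)
    for size in range(1, len(minpaths) + 1):
        for group in combinations(minpaths, size):
            term = Fraction(1)
            for i in set().union(*group):   # components that must all operate
                term *= p[i]
            total += (-1) ** (size + 1) * term
    return total

def reliability_factoring(minpaths, p, comps=None):
    """Fact 14: condition on one component either working or failing."""
    comps = sorted(p) if comps is None else comps
    if any(len(path) == 0 for path in minpaths):
        return Fraction(1)      # some minpath is already fully operating
    if not minpaths:
        return Fraction(0)      # every minpath has lost a component
    i, rest = comps[0], comps[1:]
    works = [path - {i} for path in minpaths]             # component i always works
    fails = [path for path in minpaths if i not in path]  # component i always fails
    return (p[i] * reliability_factoring(works, p, rest)
            + (1 - p[i]) * reliability_factoring(fails, p, rest))

# Example 2, with arbitrarily chosen component reliabilities.
p = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 4), 4: Fraction(1, 5)}
example2 = [{1, 2}, {2, 4}, {1, 3, 4}]
r_enum = reliability_enumeration(example2, p)
r_incl = reliability_inclusion_exclusion(example2, p)
r_fact = reliability_factoring(example2, p)
```

All three values agree with the closed-form expression derived in Example 2. Enumeration and inclusion-exclusion take time exponential in the number of components and minpaths, respectively, which is consistent with Fact 1.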


7.6.4 SPECIALIZED ALGORITHMS FOR CALCULATING RELIABILITY 

The general methods for calculating system reliability discussed in §7.6.3 can often be 
streamlined when the system has a special structure. This section describes algorithms 
for calculating the reliability of series-parallel, k-out-of-n, and certain network systems. 

Definitions: 

The two-terminal reliability of a network $G$ is the probability $R_{st}(G)$ that vertices $s$ 
and $t$ are connected in $G$. 

The all-terminal reliability of a network $G$ is the probability $R_V(G)$ that $G$ is 
connected. 

Algorithm 1: Two-terminal reliability for undirected networks. 

procedure Rst(G) 
  perform series and parallel reductions on G, producing network H 
  if H consists of the single edge (s, t) then return the reliability of (s, t) 
  else 
    select an edge e from H 
    let p_e be the reliability of e in H 
    return p_e Rst(H/e) + (1 - p_e) Rst(H - e) 


If $G$ is a two-terminal network with specified vertices $s$ and $t$, then an irrelevant edge 
is one not appearing in any simple s-t path of $G$. 

Let $G_S = (V_S, E_S)$ be the subgraph of a directed graph $G = (V, E)$ induced by edges 
$S \subseteq E$. If $G_S$ is acyclic and has no irrelevant edges, the domination of $G_S$ is 
$d_S = (-1)^{|S| - |V_S| + 1}$; in all other cases, define $d_S = 0$. 

Define $f_k(n)$ to be the reliability of a k-out-of-n success system having the components 
$N = \{1, 2, \ldots, n\}$ and corresponding reliabilities $p_1, p_2, \ldots, p_n$. 


Facts: 

1. A parallel system has reliability $f_1(n) = 1 - (1 - p_1)(1 - p_2) \cdots (1 - p_n)$. 

2. A series system has reliability $f_n(n) = p_1 p_2 \cdots p_n$. 

3. Series-parallel system: If a system is constructed from series and parallel subsystems 
(with no component appearing in more than one subsystem), then the reliability of the 
overall system is calculated by successively applying Facts 1 and 2. 

4. Applying the factoring approach of §7.6.3 to a k-out-of-n success system gives 
$f_k(n) = p_n f_{k-1}(n - 1) + (1 - p_n) f_k(n - 1)$. 

5. k-out-of-n system: Repeated application of Facts 1, 2, and 4 produces the reliability 
of any k-out-of-n success system. Since a k-out-of-n failure system is the same as an 
$(n - k + 1)$-out-of-n success system, the reliability of any k-out-of-n failure system is 
found in a similar way. 

6. For a two-terminal undirected network $G$, the system $(N, \phi_{-e})$ corresponds to the 
two-terminal network $G - e$ with edge $e$ deleted. 

7. For a two-terminal undirected network $G$, the system $(N, \phi_{+e})$ corresponds to the 
two-terminal network $G/e$ in which edge $e$ is contracted. 

8. Two-terminal undirected network: Algorithm 1 is a recursive procedure that 
calculates $R_{st}(G)$. Based on the factoring approach of §7.6.3, it splits the initial reliability 
calculation for $G$ into calculations for the smaller networks $G - e$ and $G/e$. 

9. For the sake of efficiency, Algorithm 1 carries out any applicable series and parallel 
reductions before selecting an edge on which to factor. 

10. To avoid redundant calculations in Algorithm 1, edge $e$ should be chosen from $H$ 
so that $H/e$ and $H - e$ do not contain irrelevant edges. 

11. If $G$ is a two-terminal directed network, then $R_{st}(G) = \sum_S d_S \prod_{i \in S} p_i$. 

12. The expression in Fact 11 is obtained by using the inclusion-exclusion expansion 
(§7.6.3 Fact 8) applied to the simple s-t paths of $G$. Remarkably, a number of terms in 
this expansion cancel one another and the remaining coefficients are either +1 or -1. 






Algorithm 2: All-terminal reliability for undirected networks. 

input: undirected network G 
output: R_V(G) 

let T_1, T_2, ..., T_m be the spanning trees of G, listed in lexicographic order 
for k := 1 to m do 
  S_k := T_k 
  F_k := ∅ 
  for j := 1 to k - 1 do 
    F_k := F_k ∪ min{ r | r ∈ T_j - T_k } 
  g_k := (product of p_i over i ∈ S_k) · (product of q_i over i ∈ F_k) 
R_V(G) := g_1 + g_2 + ... + g_m 


13. In an undirected network $G = (V, E)$, the minpaths for the all-terminal problem 
are the spanning trees of $G$. 

14. All-terminal reliability: Algorithm 2 calculates $R_V(G)$ using the disjoint-products 
expansion (§7.6.3 Fact 11), applied to the spanning trees in lexicographic (dictionary) 
order. Here each term of the expansion reduces to a single product involving $p_i$ and 
$q_i = 1 - p_i$. 
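
The recursion of Fact 4 translates directly into a short memoized routine. The Python sketch below is an illustration under stated assumptions: the function name is hypothetical, and instead of stopping at the parallel and series cases of Facts 1 and 2 it recurses down to the simpler boundary cases $f_0(n) = 1$ and $f_k(n) = 0$ for $k > n$, which produce the same values.

```python
from fractions import Fraction
from functools import lru_cache

def k_out_of_n_reliability(k, probs):
    """Reliability f_k(n) of a k-out-of-n success system.

    probs[i-1] is the reliability p_i of component i; components are
    assumed to fail independently.
    """
    @lru_cache(maxsize=None)
    def f(k, n):
        if k <= 0:
            return Fraction(1)      # no further operating components needed
        if k > n:
            return Fraction(0)      # too few components remain
        p_n = probs[n - 1]
        # Fact 4: component n works (need k-1 more) or fails (still need k).
        return p_n * f(k - 1, n - 1) + (1 - p_n) * f(k, n - 1)
    return f(k, len(probs))

# 2-out-of-3 success system with arbitrarily chosen reliabilities.
p1, p2, p3 = Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)
r_2_of_3 = k_out_of_n_reliability(2, (p1, p2, p3))
```

Memoization keeps the number of distinct subproblems at most $(k+1)(n+1)$. A k-out-of-n failure system can be handled by calling the same routine with $n - k + 1$ in place of $k$ (Fact 5).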

Examples: 

1. A system on four components is built up from series and parallel subsystems. 
Subsystem A has components 1 and 2 in series and subsystem B consists of 
component 3. Subsystem C has these two subsystems in parallel and its reliability is 
$1 - (1 - p_1 p_2)(1 - p_3) = p_1 p_2 + p_3 - p_1 p_2 p_3$. The entire system is constructed from 
subsystem D (component 4 alone) in series with subsystem C, so it has reliability 
$p_4 (p_1 p_2 + p_3 - p_1 p_2 p_3) = p_1 p_2 p_4 + p_3 p_4 - p_1 p_2 p_3 p_4$. 

2. To calculate the reliability of a 2-out-of-3 success system with components 1, 2, 3, 
Facts 1 and 2 are first used to obtain 

$$f_1(2) = 1 - (1 - p_1)(1 - p_2) = p_1 + p_2 - p_1 p_2, \qquad f_2(2) = p_1 p_2.$$ 

Fact 4 then gives the system reliability 

$$R_{N,\phi} = f_2(3) = p_3 f_1(2) + (1 - p_3) f_2(2) = p_1 p_2 + p_1 p_3 + p_2 p_3 - 2 p_1 p_2 p_3.$$ 

3. The two-terminal bridge network $G$ is shown in the following figure, with $s = a$ and 
$t = d$. 



No series or parallel reductions can be performed on $G$, so factoring with respect to 
edge $e = 3$ produces the networks $G_1 = G/e$ and $G_2 = G - e$ shown in this figure: 

[Figure: the networks $G_1$ and $G_2$.] 

Since both $G_1$ and $G_2$ are series-parallel networks, 

$$R_{st}(G_1) = [1 - (1 - p_1)(1 - p_2)][1 - (1 - p_4)(1 - p_5)] = (p_1 + p_2 - p_1 p_2)(p_4 + p_5 - p_4 p_5),$$ 

$$R_{st}(G_2) = 1 - (1 - p_1 p_4)(1 - p_2 p_5) = p_1 p_4 + p_2 p_5 - p_1 p_2 p_4 p_5.$$ 

Algorithm 1 then produces 

$$R_{st}(G) = p_3 (p_1 + p_2 - p_1 p_2)(p_4 + p_5 - p_4 p_5) + (1 - p_3)(p_1 p_4 + p_2 p_5 - p_1 p_2 p_4 p_5)$$ 
$$= p_1 p_4 + p_2 p_5 + p_1 p_3 p_5 + p_2 p_3 p_4 - p_1 p_3 p_4 p_5 - p_2 p_3 p_4 p_5 - p_1 p_2 p_3 p_4$$ 
$$\quad - p_1 p_2 p_3 p_5 - p_1 p_2 p_4 p_5 + 2 p_1 p_2 p_3 p_4 p_5.$$ 

4. Consider a directed version of the two-terminal network in the figure of Example 3, 
in which there are oppositely directed edges $3 = (b, c)$ and $6 = (c, b)$. Edges 1 and 2 
are directed out of $s = a$, while edges 4 and 5 are directed into $t = d$. The cyclic 
subgraph defined by edges $S = \{1, 2, 3, 4, 5, 6\}$ has $d_S = 0$. Also the subgraph defined by 
$S = \{1, 2, 3, 4\}$ has the irrelevant edges 2 and 3, so that $d_S = 0$. On the other hand, 
$S = \{1, 2, 3, 5\}$ defines an acyclic network without irrelevant edges, giving $d_S = (-1)^{4-4+1} = -1$ 
and the term $-p_1 p_2 p_3 p_5$. Similarly, $S = \{1, 4\}$ produces $d_S = (-1)^{2-3+1} = +1$ and 
the term $+p_1 p_4$. 

After generating all acyclic networks without irrelevant edges, Fact 11 is applied to 
obtain 

$$R_{st}(G) = p_1 p_4 + p_2 p_5 + p_1 p_3 p_5 + p_2 p_4 p_6 - p_1 p_2 p_4 p_5 - p_1 p_3 p_4 p_5$$ 
$$\quad - p_1 p_2 p_4 p_6 - p_1 p_2 p_3 p_5 - p_2 p_4 p_5 p_6 + p_1 p_2 p_3 p_4 p_5 + p_1 p_2 p_4 p_5 p_6.$$ 


5. The bridge network $G$ in Example 3 has eight spanning trees, given in lexicographic 
order by $T_1 = \{1,2,4\}$, $T_2 = \{1,2,5\}$, $T_3 = \{1,3,4\}$, $T_4 = \{1,3,5\}$, $T_5 = \{1,4,5\}$, 
$T_6 = \{2,3,4\}$, $T_7 = \{2,3,5\}$, $T_8 = \{2,4,5\}$. Applying Algorithm 2 gives: 

$S_1 = \{1,2,4\}$,   $F_1 = \emptyset$,     $g_1 = p_1 p_2 p_4$ 
$S_2 = \{1,2,5\}$,   $F_2 = \{4\}$,     $g_2 = p_1 p_2 q_4 p_5$ 
$S_3 = \{1,3,4\}$,   $F_3 = \{2\}$,     $g_3 = p_1 q_2 p_3 p_4$ 
$S_4 = \{1,3,5\}$,   $F_4 = \{2,4\}$,   $g_4 = p_1 q_2 p_3 q_4 p_5$ 
   ... 
$S_8 = \{2,4,5\}$,   $F_8 = \{1,3\}$,   $g_8 = q_1 p_2 q_3 p_4 p_5$ 

Summing these eight terms then yields 

$$R_V(G) = p_1 p_2 p_4 + p_1 p_2 q_4 p_5 + p_1 q_2 p_3 p_4 + p_1 q_2 p_3 q_4 p_5 + \cdots + q_1 p_2 q_3 p_4 p_5.$$ 
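
The bridge network computations of Examples 3 and 5 can be reproduced with a short program. The Python sketch below is illustrative only. The edge list is an assumption inferred from the reductions in Example 3 (edges 1 and 2 leave $a$, edges 4 and 5 enter $d$, and edge 3 joins $b$ and $c$); the function names are hypothetical; and the factoring routine omits the series and parallel reductions of Algorithm 1, conditioning on edges one at a time instead.

```python
from fractions import Fraction
from itertools import combinations

# Assumed labeling of the bridge network (not stated explicitly in the text).
EDGES = {1: ("a", "b"), 2: ("a", "c"), 3: ("b", "c"), 4: ("b", "d"), 5: ("c", "d")}

def connected(edge_ids, vertices, edges=EDGES):
    """True if the given vertices are mutually joined using only edge_ids."""
    vertices = list(vertices)
    seen, stack = {vertices[0]}, [vertices[0]]
    while stack:
        v = stack.pop()
        for e in edge_ids:
            a, b = edges[e]
            for x, y in ((a, b), (b, a)):       # undirected edge
                if x == v and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return all(v in seen for v in vertices)

def rst_factoring(p, s="a", t="d", up=frozenset(), free=None):
    """Two-terminal reliability by plain factoring (no reductions)."""
    free = sorted(p) if free is None else free
    if connected(up, [s, t]):
        return Fraction(1)                      # s and t already joined
    if not connected(up | set(free), [s, t]):
        return Fraction(0)                      # s and t can no longer be joined
    e, rest = free[0], free[1:]
    return (p[e] * rst_factoring(p, s, t, up | {e}, rest)
            + (1 - p[e]) * rst_factoring(p, s, t, up, rest))

def all_terminal_disjoint_products(p, vertices="abcd"):
    """Algorithm 2: one product term per spanning tree, in lexicographic order."""
    tree_size = len(vertices) - 1
    trees = [set(T) for T in combinations(sorted(p), tree_size)
             if connected(T, vertices)]
    total = Fraction(0)
    for k, T_k in enumerate(trees):
        F_k = {min(T_j - T_k) for T_j in trees[:k]}   # edges forced to fail
        g_k = Fraction(1)
        for i in T_k:
            g_k *= p[i]
        for i in F_k:
            g_k *= 1 - p[i]
        total += g_k
    return total

half = {e: Fraction(1, 2) for e in EDGES}
r_st = rst_factoring(half)
r_all = all_terminal_disjoint_products(half)
```

With every edge reliability equal to 1/2, the two-terminal reliability is 1/2 and the all-terminal reliability is 7/16 (14 of the 32 edge subsets are connected spanning subgraphs). Enumerating spanning trees by testing all edge subsets of size $|V| - 1$ is adequate only for small networks.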


7.7 DISCRETE-TIME MARKOV CHAINS 

Many physical systems evolve randomly in time, e.g., the population of a country, the 
value of a company’s stock, the number of customers waiting at a checkout counter, 
and the functional state of a machine subject to failures and repairs. A discrete-time 
Markov chain can be used to model such situations when the set of possible states of 
the system is finite (or countable) and the system changes state at discrete time points. 
Such Markov chain models find applications in diverse fields, such as biology, inventory, 
production, queueing systems, and demography. In addition, many recursive algorithms 
can be viewed as a manifestation of an underlying discrete-time Markov chain. 

7.7.1 MARKOV CHAINS 


Definitions: 

A sequence of random variables $\{ X_n \mid n \ge 0 \}$ is a (discrete-time) Markov chain 
(DTMC) on a (countable) state-space $S$ if $X_n \in S$ for all $n \ge 0$ and $X_{n+1}$ depends 
(probabilistically) on the previous states of the system only via $X_n$: 

$$\Pr(X_{n+1} = j \mid X_n = i, X_{n-1} = i_{n-1}, \ldots, X_0 = i_0) = \Pr(X_{n+1} = j \mid X_n = i),$$ 

for all $i_0, i_1, \ldots, i_{n-1}, i, j \in S$. 

A Markov chain $\{ X_n \mid n \ge 0 \}$ is time-homogeneous if 

$$\Pr(X_{n+1} = j \mid X_n = i) = p_{ij}, \quad \text{for all } n \ge 0.$$ 

Note: Only time-homogeneous discrete-time Markov chains will be considered in this 
and later sections. 

The matrix $P = (p_{ij})$ is the (one-step) transition probability matrix of the 
discrete-time Markov chain. 

The initial distribution for a DTMC is the vector $a = (a_i)$, where $a_i = \Pr(X_0 = i)$ 
for $i \in S$. 

The transition diagram of a DTMC is the directed graph (§8.3.1) $G = (V, E)$, where 
$V = S$ is the state-space and $E = \{ (i, j) \in S \times S \mid p_{ij} > 0 \}$. 

A stochastic matrix $M = (m_{ij})$ has $m_{ij} \ge 0$ for all $i, j$ and $\sum_j m_{ij} = 1$ for all $i$. 

Facts: 

1. The first systematic study of Markov chains was carried out by Andrei Andreevich 
Markov (1856-1922); this work initiated the study of stochastic processes (sequences of 
random variables). 

2. A DTMC on state-space S is completely described by the initial distribution a and 
the transition probability matrix P. 

3. The transition probability matrix P is a stochastic matrix. 

Examples: 


1. Consider a DTMC on the set $S = \{1, 2, 3, 4, 5, 6\}$ with the following transition 
probability matrix: 

$$P = \begin{pmatrix} 
0.4 & 0.6 & 0 & 0 & 0 & 0 \\ 
0.7 & 0.3 & 0 & 0 & 0 & 0 \\ 
0 & 0 & 0 & 1 & 0 & 0 \\ 
0 & 0 & 1 & 0 & 0 & 0 \\ 
0 & 0 & 0 & 0 & 1 & 0 \\ 
0.1 & 0.1 & 0.1 & 0.1 & 0.1 & 0.5 
\end{pmatrix}$$ 

To completely describe this DTMC, it is also necessary to specify the initial distribution. 
For example, $a = (0, 0, 0, 0, 0, 1)$ means that the system starts off in state 6. 

2. Simple random walk with absorbing states: This is a DTMC on $S = \{0, 1, 2, \ldots, N\}$ 
with transition probabilities $p_{i,i+1} = p$, $p_{i,i-1} = 1 - p = q$, $1 \le i \le N - 1$, where 
$0 < p < 1$ is a given number. Also, $p_{0,0} = p_{N,N} = 1$, meaning that states 0 and N are 
absorbing: once the DTMC visits these states it cannot leave them. The transition 
diagram of this DTMC is given in the following figure. 

This Markov model is also (more colorfully) known as the gambler's ruin problem 
(§7.5.1, Example 2). 

[Figure: transition diagram of the random walk with absorbing states, with arcs labeled p and q.] 

3. Simple random walk with reflecting states: This is a variant of Example 2, in which 
the boundary states 0 and N are reflecting: namely, $p_{0,1} = p_{N,N-1} = 1$. The transition 
diagram of this DTMC is given here. 

[Figure: transition diagram of the random walk with reflecting states, with arcs labeled 1, p, and q.] 

4. Weather: A simplified model of the daily weather results in a DTMC. Suppose 
that each day is either sunny (0) or rainy (1) and that tomorrow's weather depends 
only on today's weather. Specifically, suppose that a rainy day follows a sunny day 
with probability 0.3 and a sunny day follows a rainy day with probability 0.4. This is 
a DTMC with state-space $S = \{0, 1\}$ and transition probability matrix 

$$P = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix}.$$ 

5. Urns: Urn B contains 9 black balls and 1 white ball, while Urn R contains 6 red and 4 
white balls. Balls are successively drawn with replacement from an urn. If the ball 
drawn is colored, the drawing continues from the same urn. If the ball drawn is white, 
the drawing continues from the other urn. Define the state of the system to be the urn 
being sampled, so $S = \{B, R\}$. This is a DTMC with transition probabilities $p_{BB} = 0.9$, 
$p_{BR} = 0.1$, $p_{RR} = 0.6$, $p_{RB} = 0.4$. 

6. Ehrenfest diffusion model: Suppose that there are $M$ molecules in a vessel, 
separated into two chambers by a membrane, across which molecules can pass. A state 
of the system at any instant is given by $(k_1, k_2)$, where there are $k_1$ molecules in the 
first chamber and $k_2 = M - k_1$ in the second chamber. Transitions from the current 
state $(k_1, k_2)$ occur by the movement of a single molecule from the first chamber to 
the second, resulting in state $(k_1 - 1, k_2 + 1)$, or from the second chamber to the first, 
resulting in state $(k_1 + 1, k_2 - 1)$. 

In the Ehrenfest model of this process, the probability of transition from $(k_1, k_2)$ 
to $(k_1 - 1, k_2 + 1)$ is given by $k_1/M$, whereas the probability of transition to $(k_1 + 1, k_2 - 1)$ 
is $k_2/M = 1 - k_1/M$. This quantifies the idea that if more molecules are present in (say) 
chamber 1, then it is more likely for some molecule to transfer next from chamber 1 to 
chamber 2. This is a DTMC with state-space $S = \{(0, M), (1, M - 1), \ldots, (M, 0)\}$ and 
the transition probabilities specified. 

7.7.2 TRANSIENT ANALYSIS 


Transient analysis of a DTMC involves the computation of $\Pr(X_n = j)$, the probability 
of the Markov chain being in state $j$ after $n$ steps. 

Definitions: 

For $i, j \in S$ the n-step transition probability $p_{ij}^{(n)}$ is the probability of being in state $j$ 
after $n \ge 0$ steps, if the Markov chain starts in state $i$: $p_{ij}^{(n)} = \Pr(X_n = j \mid X_0 = i)$. 

The n-step transition probability matrix is given by $P^{(n)} = (p_{ij}^{(n)})$. 

Facts: 

1. $\Pr(X_n = j) = \sum_{i \in S} \Pr(X_0 = i) \Pr(X_n = j \mid X_0 = i) = \sum_{i \in S} a_i \, p_{ij}^{(n)}$. 

2. Chapman-Kolmogorov equations: $P^{(n+m)} = P^{(n)} P^{(m)} = P^{(m)} P^{(n)}$ for all $m, n \ge 0$. 

3. If $P^n$ denotes the nth power of $P$, then $P^{(n)} = P^n$, $n \ge 0$. 

4. If $a$ is the initial distribution of a DTMC, the (absolute) probabilities $\Pr(X_n = j)$ 
are the entries of the vector $a P^n$. 


Examples: 

1. For §7.7.1 Example 4, the two-step transition probability matrix is 

$$P^{(2)} = P^2 = \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix} \begin{pmatrix} 0.7 & 0.3 \\ 0.4 & 0.6 \end{pmatrix} = \begin{pmatrix} 0.61 & 0.39 \\ 0.52 & 0.48 \end{pmatrix}.$$ 

Note that $P^{(2)}$ is again a stochastic matrix. To illustrate, if Friday is sunny then the 
conditional probability that Sunday is sunny is given by $p_{00}^{(2)} = 0.61$. 

2. A general two-state DTMC on $S = \{0, 1\}$ can be represented by the stochastic 
transition probability matrix 

$$P = \begin{pmatrix} 1 - p & p \\ q & 1 - q \end{pmatrix}.$$ 

Direct calculation gives the two-step transition probability matrix 

$$P^{(2)} = \begin{pmatrix} (1 - p)^2 + pq & p(2 - p - q) \\ q(2 - p - q) & (1 - q)^2 + pq \end{pmatrix},$$ 

which (for $p + q > 0$) can be rewritten as 

$$P^{(2)} = \frac{1}{p + q} \begin{pmatrix} q + p(1 - p - q)^2 & p - p(1 - p - q)^2 \\ q - q(1 - p - q)^2 & p + q(1 - p - q)^2 \end{pmatrix}.$$ 

In general, 

$$P^{(n)} = \frac{1}{p + q} \begin{pmatrix} q + p(1 - p - q)^n & p - p(1 - p - q)^n \\ q - q(1 - p - q)^n & p + q(1 - p - q)^n \end{pmatrix}.$$ 

3. Limiting probabilities: Suppose in Example 2 that $0 < p < 1$ and $0 < q < 1$, so that 
$|1 - p - q| < 1$. From the final expression obtained in Example 2, it is seen that $P^{(n)}$ 
tends to the limiting matrix 

$$\frac{1}{p + q} \begin{pmatrix} q & p \\ q & p \end{pmatrix}.$$ 

Consequently, if $a = (a_1, a_2)$ is any initial distribution, then the limiting 
probabilities $a P^n$ approach 

$$\frac{1}{p + q} (a_1, a_2) \begin{pmatrix} q & p \\ q & p \end{pmatrix} = \left( \frac{q}{p + q}, \frac{p}{p + q} \right)$$ 

since $a_1 + a_2 = 1$. For example, the limiting probability of being in state 0 is $q/(p + q)$, 
independent of the initial state of the DTMC. (See §7.7.4.) 
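
The transient quantities in these examples can be computed directly from Facts 3 and 4. The Python sketch below is illustrative only (the helper names are hypothetical). It forms $P^n$ by repeated multiplication with exact rational arithmetic and compares the result with the closed form of Example 2 for the weather chain, where $p = 0.3$ and $q = 0.4$.

```python
from fractions import Fraction as F

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_pow(P, n):
    """n-step transition matrix P^(n) = P^n (Fact 3), for n >= 0."""
    size = len(P)
    result = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, P)
    return result

def two_state_closed_form(p, q, n):
    """Closed form of Example 2 for P = [[1-p, p], [q, 1-q]], with p + q > 0."""
    r = (1 - p - q) ** n
    s = p + q
    return [[(q + p * r) / s, (p - p * r) / s],
            [(q - q * r) / s, (p + q * r) / s]]

def distribution_after(a, P, n):
    """Absolute probabilities Pr(X_n = j): the vector a P^n (Fact 4)."""
    return mat_mul([a], mat_pow(P, n))[0]

# Weather chain of Section 7.7.1, Example 4.
P = [[F(7, 10), F(3, 10)], [F(4, 10), F(6, 10)]]
P2 = mat_pow(P, 2)                                  # two-step matrix of Example 1
sunday = distribution_after([F(1), F(0)], P, 2)     # start from a sunny day
```

Repeated multiplication takes $n$ matrix products; repeated squaring would reduce this to about $\log_2 n$ products if large powers are needed.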

7.7.3 CLASSIFICATION OF STATES 


Definitions: 

State $j \in S$ is accessible from state $i \in S$ (written $i \to j$) if it is possible to make a 
sequence of transitions leading from state $i$ to state $j$: that is, $p_{ij}^{(n)} > 0$ for some $n \ge 0$. 

States $i, j \in S$ communicate (written $i \leftrightarrow j$) if they are mutually accessible from one 
another: $i \to j$ and $j \to i$. 

Set $C \subseteq S$ is a (maximal) communicating class if 

• $i, j \in C \Rightarrow i \leftrightarrow j$; 

• $i \in C$, $i \leftrightarrow j \Rightarrow j \in C$. 

A communicating class $C$ is closed if transitions from the states of $C$ never lead to 
states outside $C$: $i \in C$, $j \notin C \Rightarrow j$ is not accessible from $i$. 

A DTMC is irreducible if $i \leftrightarrow j$ for all $i, j \in S$; otherwise it is reducible. 

For $j \in S$ define: 

• $T_j = \min\{ n > 0 \mid X_n = j \}$; 

• $f_j = \Pr(T_j < \infty \mid X_0 = j)$; 

• $f_j(n) = \Pr(T_j = n \mid X_0 = j)$; 

• $m_j = E(T_j \mid X_0 = j)$. 

State $j \in S$ is recurrent if return to that state is certain: $f_j = 1$; if $f_j < 1$ then state $j$ 
is transient. A recurrent state $j \in S$ is positive recurrent if $m_j < \infty$ and null 
recurrent if $m_j = \infty$. 

A recurrent state $j$ has period $d$ if $d$ is the largest integer satisfying $\sum_{n \ge 1} f_j(nd) = 1$; 
that is, returns to $j$ occur only at multiples of $d$. 

If $d = 1$ state $j$ is aperiodic. 

Facts: 

1. Generally, all classes that are not closed can be lumped into a single set $T$ of 
transient states. Thus, the state-space of a DTMC can be partitioned into closed classes 
$C_1, C_2, \ldots, C_K$ and the set $T$. 

2. State $j$ is accessible from state $i$ if and only if there is a directed path from vertex $i$ 
to vertex $j$ in the transition diagram of the DTMC. 

3. A set of states $C$ is a communicating class if and only if the corresponding set of 
vertices forms a strongly connected component (§8.3.2) in the transition diagram. 

4. Tarjan [Ta72] describes an algorithm to find the strongly connected components, 
which runs in time linear in the number of arcs in the transition diagram (i.e., the 
number of nonzero entries of $P$). 

5. Transience, positive recurrence, and null recurrence are class properties: 

• if $i$ is transient and $i \leftrightarrow j$ then $j$ is transient; 

• if $i$ is positive recurrent and $i \leftrightarrow j$ then $j$ is positive recurrent; 

• if $i$ is null recurrent and $i \leftrightarrow j$ then $j$ is null recurrent. 

In other words, states in a communicating class are all simultaneously transient or null 
recurrent or positive recurrent. 

6. By Fact 5, a communicating class or an irreducible DTMC can be termed positive 
recurrent, null recurrent, or transient if all of its states are positive recurrent, null 
recurrent, or transient. 

7. A finite communicating class is positive recurrent if it is closed, and transient 
otherwise. 

8. A finite state irreducible DTMC is positive recurrent. 

9. Null recurrent states do not occur in a finite state DTMC. 

10. Establishing recurrence or transience in an infinite state DTMC is a difficult task 
and has to be done on a case-by-case basis. 

11. We have not defined period for a transient state since for such a state the concept 
is not needed. Some references do however define period for all states. 

12. The period of state $j$ is the greatest common divisor of all integers $n > 0$ such that 
$p_{jj}^{(n)} > 0$. 

13. The period of state $j$ is the greatest common divisor of all the lengths of the 
directed cycles in the transition diagram that contain state $j$. 

14. Periodicity is a class property: if $i$ has period $d$ and $i \leftrightarrow j$, then $j$ has period $d$. 

15. The period of a state in a finite irreducible DTMC is at most equal to the number 
of states in the DTMC. 

16. By Fact 14, a recurrent communicating class or a recurrent irreducible DTMC can 
be termed periodic if all states in it are periodic with $d > 1$; otherwise it is termed aperiodic. 

Examples: 

1. For the DTMC in §7.7.1 Example 1, it is seen that $1 \to 2$, $2 \to 1$, and $1 \leftrightarrow 2$. 
However, 3 is not accessible from 1. The communicating classes are $C_1 = \{1, 2\}$, 
$C_2 = \{3, 4\}$, $C_3 = \{5\}$, $C_4 = \{6\}$. This DTMC is reducible. Classes $C_1, C_2, C_3$ are closed, 
but $C_4$ is not. States 1, 2, 3, 4, 5 are positive recurrent and state 6 is transient. Classes 
$C_1, C_2, C_3$ are positive recurrent. 

2. Consider the random walk in §7.7.1 Example 2 with $0 < p < 1$. There are three 
communicating classes: $C_1 = \{0\}$, $C_2 = \{1, 2, \ldots, N - 1\}$, and $C_3 = \{N\}$. This DTMC 
is reducible. Here $C_1$ and $C_3$ are closed, while $C_2$ is not. States 0 and N are positive 
recurrent. The rest are transient. 

3. For the DTMC in §7.7.1 Example 1, states 3 and 4 have period 2; states 1, 2, and 5 
are aperiodic. A period is not associated with state 6 since it is transient. Classes $\{1, 2\}$ 
and $\{5\}$ are aperiodic, while the class $\{3, 4\}$ is periodic with period 2. 

4. For the DTMC in §7.7.1 Example 2, states 0 and N are aperiodic. Period is not 
defined for the rest of the states as they are transient. 

5. §7.7.1 Example 3 is an irreducible chain. All states are positive recurrent and have 
period 2. 
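
The classification in Examples 1 and 3 can be reproduced from the transition matrix alone. The Python sketch below is illustrative only: the function names are hypothetical, states are numbered from 0 rather than 1, and communicating classes are found by simple reachability searches instead of the linear-time algorithm cited in Fact 4. The period is computed from Fact 12 using return times up to $3|S|$ steps, a bound assumed here to be sufficient for a finite chain.

```python
from math import gcd

def reachable(P, i):
    """Set of states accessible from i (i itself counts, via n = 0)."""
    seen, stack = {i}, [i]
    while stack:
        u = stack.pop()
        for v, puv in enumerate(P[u]):
            if puv > 0 and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def communicating_classes(P):
    """Return (classes, reach), where reach[i] is the set accessible from i."""
    reach = [reachable(P, i) for i in range(len(P))]
    classes = []
    for i in range(len(P)):
        cls = frozenset(j for j in reach[i] if i in reach[j])
        if cls not in classes:
            classes.append(cls)
    return classes, reach

def is_closed(cls, reach):
    """A class is closed if no state outside it is accessible from it."""
    return all(reach[i] <= cls for i in cls)

def period(P, j):
    """gcd of the return times n <= 3|S| with p_jj^(n) > 0 (Fact 12).

    Returns 0 if no return is possible. The value is also computed for
    transient states, for which the text leaves the period undefined.
    """
    size = len(P)
    current, d = {j}, 0          # current = states reachable in exactly n steps
    for n in range(1, 3 * size + 1):
        current = {v for u in current for v in range(size) if P[u][v] > 0}
        if j in current:
            d = gcd(d, n)
    return d

# Transition matrix of Section 7.7.1, Example 1 (state k in the text is index k-1).
P = [[0.4, 0.6, 0, 0, 0, 0],
     [0.7, 0.3, 0, 0, 0, 0],
     [0, 0, 0, 1, 0, 0],
     [0, 0, 1, 0, 0, 0],
     [0, 0, 0, 0, 1, 0],
     [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]]
classes, reach = communicating_classes(P)
closed = [c for c in classes if is_closed(c, reach)]
```

By Fact 7, the closed classes found this way are exactly the positive recurrent classes of a finite chain, and the remaining states are transient.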


7.7.4 LIMITING BEHAVIOR 

To establish possible equilibrium configurations of DTMCs, it is necessary to study the 
behavior of the n-step transition probabilities $\Pr(X_n = j \mid X_0 = i)$ as $n \to \infty$. 

Facts: 

1. Let $\{ X_n \mid n \ge 0 \}$ be an irreducible DTMC with transition probability matrix $P$ 
and finite state-space $S$. Then there exists a unique solution $\pi = (\pi_j)$ to the equations 

$$\pi = \pi P, \qquad \sum_{j \in S} \pi_j = 1.$$ 

2. The long run fraction of the visits to state $j$ is given by $\pi_j$, regardless of the initial 
state. Also, $m_j$, the expected time between two consecutive visits to state $j$, is $1/\pi_j$. 

3. If the DTMC is aperiodic, then $\lim_{n \to \infty} \Pr(X_n = j \mid X_0 = i) = \pi_j$ for all $i \in S$. 

4. Let $\{ X_n \mid n \ge 0 \}$ be a finite state reducible DTMC with $K$ closed communicating 
classes $C_1, C_2, \ldots, C_K$ and the set of transient states $T$. Then 
$\lim_{n \to \infty} \Pr(X_n = j \mid X_0 = i) = \pi_{ij}$, where $\{\pi_{ij}\}$ are the following: 

(a) If $j \in T$, then $\pi_{ij} = 0$. 

(b) If $i$ and $j$ belong to different closed classes, then $\pi_{ij} = 0$. 

(c) If $i$ and $j$ belong to the same closed class $C_r$, then $\pi_{ij} = \pi_j$, where $\{\pi_j\}$ are 
the limiting probabilities calculated by using Fact 1 for the irreducible DTMC 
formed by the states in $C_r$. 

(d) If $i \in T$ and $j \in C_r$, then $\pi_{ij} = \alpha_{ir} \pi_j$, where $\{\pi_j\}$ are as in (c) and $\alpha_{ir}$ is the 
probability that the DTMC eventually visits the class $C_r$ starting from state $i$. 

5. In Fact 4(c), if $C_r$ is periodic then limiting probabilities do not exist and $\pi_j$ is 
interpreted as the long run fraction of the time the DTMC spends in state $j$ starting 
from state $i$. 

6. The $\{\alpha_{ir}\}$ in Fact 4(d) are given by the unique solution to 

$$\alpha_{ir} = \sum_{j \in C_r} p_{ij} + \sum_{j \in T} p_{ij} \alpha_{jr}, \qquad i \in T.$$ 

Examples: 

1. The DTMC in §7.7.1 Example 3 is irreducible and periodic with $d = 2$. Using Fact 1,
its limiting behavior is described by the equations

$$\pi_0 = q\pi_1$$
$$\pi_1 = \pi_0 + q\pi_2$$
$$\pi_i = p\pi_{i-1} + q\pi_{i+1} \quad \text{for } 2 \le i \le N-2$$
$$\pi_{N-1} = p\pi_{N-2} + \pi_N$$
$$\pi_N = p\pi_{N-1}$$

and $\sum_{j=0}^{N} \pi_j = 1$. Solving these equations gives $\pi_j = \rho_j / c$, where

$$\rho_0 = 1$$
$$\rho_j = \frac{(p/q)^j}{p} \quad \text{for } 1 \le j \le N-1$$
$$\rho_N = \left(\frac{p}{q}\right)^{N-1}$$

and the normalizing constant is $c = \sum_{j=0}^{N} \rho_j$. This DTMC is periodic and hence
these $\pi_j$ represent the long run fraction of the time the DTMC spends in state $j$.
Here $\lim_{n\to\infty} \Pr(X_n = j \mid X_0 = i)$ does not exist since the probabilities in question
keep oscillating with period 2.
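
The closed form above can be checked against a direct solution of the equations of Fact 1. The following is a minimal sketch, assuming the chain of §7.7.1 Example 3 is the random walk on $\{0, \ldots, N\}$ that steps up with probability $p$, down with probability $q = 1 - p$, and is reflected at both ends (this is what the displayed balance equations imply). The function names are illustrative and not from the Handbook.

```python
from fractions import Fraction

def reflecting_walk_matrix(N, p):
    """Transition matrix on states 0..N: up w.p. p, down w.p. q = 1 - p,
    reflected at 0 and N (assumed form of the chain in the example)."""
    q = 1 - p
    P = [[Fraction(0)] * (N + 1) for _ in range(N + 1)]
    P[0][1] = Fraction(1)
    P[N][N - 1] = Fraction(1)
    for i in range(1, N):
        P[i][i + 1] = p
        P[i][i - 1] = q
    return P

def stationary(P):
    """Solve pi = pi P with sum(pi) = 1 exactly, by Gauss-Jordan elimination."""
    n = len(P)
    # One balance equation per state j = 0..n-2 (the last one is redundant),
    # written as sum_i pi_i * (P[i][j] - delta_ij) = 0.
    A = [[P[i][j] - (1 if i == j else 0) for i in range(n)] + [Fraction(0)]
         for j in range(n - 1)]
    A.append([Fraction(1)] * n + [Fraction(1)])  # normalization row
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [x / A[col][col] for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]

def closed_form(N, p):
    """pi_j = rho_j / c with rho_j as given in the example (requires N >= 2)."""
    q = 1 - p
    rho = ([Fraction(1)]
           + [(p / q) ** j / p for j in range(1, N)]
           + [(p / q) ** (N - 1)])
    c = sum(rho)
    return [r / c for r in rho]

if __name__ == "__main__":
    p = Fraction(1, 3)
    print(stationary(reflecting_walk_matrix(5, p)))
    print(closed_form(5, p))
```

Exact rational arithmetic is used so that the two results can be compared with `==` rather than within a tolerance.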


2. For the DTMC in §7.7.1 Example 1, $C_1 = \{1,2\}$, $C_2 = \{3,4\}$, $C_3 = \{5\}$, and
T = {6}. Therefore, 77i = 7 t 2 = 7r 3 = |, 774 = 77 5 = 1, a 6 i = |, a 62 = |, 

063 = 5 ■ By Fact 4 the limiting matrix ( 77 ij) is given by 

(h h 0 0 0 0 \ 

h u 0 0 0 0 

0 0 I \ 0 0 

0 0 \ 4 0 0 

0 0 0 0 1 0 
States 3 and 4 are periodic, and hence the third and fourth columns need to be
interpreted as the long run fraction of the time the discrete-time Markov chain spends in
those states.

3. The Ehrenfest diffusion model (§7.7.1 Example 6) is an irreducible DTMC with
period 2. The solution $\pi$ (Fact 1) is given by $\pi_j = \binom{M}{j} 2^{-M}$ $(0 \le j \le M)$. The binomial
distribution (§7.3.1) describes the long run fraction of time the system spends in each
state.


7.7.5 FIRST PASSAGE TIMES 


Definitions: 

Let $\{X_n \mid n \ge 0\}$ be a DTMC on state-space $S$ with transition probability matrix $P$.
Let $A \subset S$ be a given subset of states.

The first passage time $T_A$ into set $A$ is the first time at which the Markov chain
reaches some state in set $A$; i.e., $T_A = \min\{\, n \ge 0 \mid X_n \in A \,\}$.

For $i \in S$, let $\alpha_i = \Pr(T_A < \infty \mid X_0 = i)$ and let $\tau_i = E(T_A \mid X_0 = i)$.

Facts: 

1. The $\{\alpha_i\}$ are given by the unique solution to

$$\alpha_i = \sum_{j \in S} p_{ij}\,\alpha_j$$

with the boundary conditions $\alpha_i = 1$ if $i \in A$ and $\alpha_i = 0$ if no state in $A$ is accessible
from $i$.

2. If $\alpha_i = 1$ for all $i \in S$, then $\{\tau_i\}$ are given by the unique solution to

$$\tau_i = 1 + \sum_{j \in S} p_{ij}\,\tau_j$$

with the boundary condition $\tau_i = 0$ if $i \in A$.


Examples: 

1. Consider the DTMC in §7.7.1 Example 2 with $A = \{0\}$. The equations of Fact 1 are
$\alpha_i = q\alpha_{i-1} + p\alpha_{i+1}$, $1 \le i \le N-1$, with the boundary conditions $\alpha_0 = 1$ and $\alpha_N = 0$.
If $q \ne p$, the solution is given by

$$\alpha_i = \frac{(q/p)^i - (q/p)^N}{1 - (q/p)^N}, \qquad 0 \le i \le N.$$

If $q = p$, the solution is $\alpha_i = 1 - \frac{i}{N}$.


2. Consider the DTMC in §7.7.1 Example 2 with $A = \{0, N\}$. In this case $\alpha_i = 1$ for
all $i$. The equations of Fact 2 are $\tau_i = 1 + q\tau_{i-1} + p\tau_{i+1}$, $1 \le i \le N-1$, with the
boundary conditions $\tau_0 = 0$ and $\tau_N = 0$. If $q \ne p$, the solution is given by

$$\tau_i = \frac{i}{q-p} - \frac{N}{q-p}\cdot\frac{1 - (q/p)^i}{1 - (q/p)^N}, \qquad 0 \le i \le N.$$

If $q = p$, the solution is given by $\tau_i = i(N - i)$.
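
The two examples can be verified by solving the linear systems of Facts 1 and 2 directly. The sketch below assumes, as the displayed equations imply, that the chain moves from $i$ to $i+1$ with probability $p$ and to $i-1$ with probability $q = 1 - p$ for $1 \le i \le N-1$. The helper names (`solve`, `ruin_probabilities`, `expected_durations`) are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination over Fractions."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(y)] for row, y in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def interior_system(N, p):
    """Coefficient matrix of x_i - q x_{i-1} - p x_{i+1} for i = 1..N-1,
    with the boundary unknowns x_0 and x_N moved to the right-hand side."""
    q = 1 - p
    A = [[0] * (N - 1) for _ in range(N - 1)]
    for i in range(1, N):
        A[i - 1][i - 1] = 1
        if i >= 2:
            A[i - 1][i - 2] = -q
        if i <= N - 2:
            A[i - 1][i] = -p
    return A

def ruin_probabilities(N, p):
    """alpha_i = Pr(T_A < infinity | X_0 = i) for A = {0}: alpha_0 = 1, alpha_N = 0."""
    b = [0] * (N - 1)
    b[0] = 1 - p  # the term q * alpha_0 with alpha_0 = 1
    return [Fraction(1)] + solve(interior_system(N, p), b) + [Fraction(0)]

def expected_durations(N, p):
    """tau_i = E(T_A | X_0 = i) for A = {0, N}: tau_0 = tau_N = 0."""
    return [Fraction(0)] + solve(interior_system(N, p), [1] * (N - 1)) + [Fraction(0)]

if __name__ == "__main__":
    print(ruin_probabilities(6, Fraction(1, 3)))
    print(expected_durations(6, Fraction(1, 2)))
```

For $q = p = \frac12$ the second call returns the values $i(N - i)$, matching the closed form in Example 2.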


7.7.6 BRANCHING PROCESSES


Branching processes are a special type of Markov chain used to study the growth (and 
possible extinction) of populations in biology and sociology as well as particles in physics. 

Definitions: 

Suppose $\{Y_{n,i} \mid n, i \ge 1\}$ are independent and identically distributed random variables
having common probability distribution function $p_k = \Pr(Y_{n,i} = k)$, $k \ge 0$, with mean $m$
and variance $\sigma^2$. Then the DTMC $\{X_n \mid n \ge 0\}$ is a branching process if $X_0 = 1$ and

$$X_{n+1} = \sum_{i=1}^{X_n} Y_{n,i}.$$

A branching process is stable if m < 1, critical if m = 1, and unstable if m > 1. 

The extinction probability of a branching process is the probability that the
population becomes extinct, where $X_0 = 1$.


Facts: 

1. It is convenient to think of the random variable $X_n$ as the number of individuals in
the $n$th generation and the random variable $Y_{n,i}$ as the number of offspring of the $i$th
individual in the $n$th generation.

2. The transient behavior of the branching process is given by:

$$E(X_n) = m^n,$$

$$\mathrm{Var}(X_n) = \begin{cases} n\sigma^2, & \text{if } m = 1 \\[4pt] \sigma^2 m^{n-1}\,\dfrac{m^n - 1}{m - 1}, & \text{if } m \ne 1. \end{cases}$$

3. State 0 is absorbing for a branching process. Absorption in state 0 is certain if and
only if $m \le 1$, while the expected time until extinction (i.e., absorption in state 0) is
finite if and only if $m < 1$.


4. The probability of extinction $\rho$ in an unstable branching process is given by the
unique solution in $(0, 1)$ to the equation

$$\rho = \sum_{n=0}^{\infty} p_n \rho^n.$$


5. The expected total number of individuals ever born in a stable branching process
until it becomes extinct is given by

$$E\left(\sum_{n=0}^{\infty} X_n\right) = \frac{1}{1 - m}.$$


6. There is no simple expression for the expected time until absorption for a general 
stable branching process. 
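
The fixed-point equation of Fact 4 can be solved numerically by iterating the offspring generating function from 0, which converges to its smallest nonnegative root. The sketch below is illustrative only: the offspring distribution `pmf` is a hypothetical choice made for this example, not one taken from the Handbook.

```python
def mean_offspring(pmf):
    """Mean m of the offspring distribution, where pmf[k] = p_k."""
    return sum(k * p for k, p in enumerate(pmf))

def extinction_probability(pmf, tol=1e-12, max_iter=100_000):
    """Smallest nonnegative root of rho = sum_k p_k * rho**k.

    Iterating rho <- G(rho) from rho = 0 increases monotonically to that root;
    it is 1 when m <= 1 and lies in (0, 1) for an unstable process with p_0 > 0.
    """
    rho = 0.0
    for _ in range(max_iter):
        nxt = sum(p * rho**k for k, p in enumerate(pmf))
        if abs(nxt - rho) < tol:
            return nxt
        rho = nxt
    return rho

if __name__ == "__main__":
    pmf = [0.25, 0.25, 0.5]  # hypothetical p_0, p_1, p_2 (mean 1.25, unstable)
    rho = extinction_probability(pmf)
    print(mean_offspring(pmf), rho)
    # With X_0 = 10 independent founders, all lines must die out:
    print(rho ** 10)
```

For this assumed distribution the equation is $\rho = \frac14 + \frac14\rho + \frac12\rho^2$, whose roots are $\frac12$ and 1, so the iteration settles near $\frac12$.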


Examples: 

1. The branching process with po = Pi = P 2 = § has mean to = ^ < 1 and is 
stable. With probability 1, the population will die out. 

2 . The branching process with Po = \, Pi = \, P 2 = \ has mean m = | > 1 and is 
unstable. The probability of extinction, p 0 , is found as the smallest positive root of the 
equation p = \ + \p + \p 2 ■ The roots of this equation are | and 1, so the probability
of extinction is po = r;. 

If the initial population is Xo = 10 instead of Xo = 1, then the probability that 
the initial population eventually becomes extinct is pj 0 = 


7.8 QUEUEING THEORY 

Queueing theory provides a set of tools for the analysis of systems in which customers 
arrive at a service facility. It has its origins in the works of A. K. Erlang (starting 
in 1908) in telephony. Since then it has found many applications in diverse areas such 
as manufacturing, inventory systems, computer science, analysis of algorithms, and 
telecommunications. Although queueing theory uses the terminology of servers pro- 
viding service to customers, in actual applications the customers may be people, jobs, 
computational steps, or messages, and the servers may be human beings, machines, 
telephone circuits, communication channels, or computers. 


7.8.1 SINGLE-STATION QUEUES 

The simplest queueing system is a single-station queue in which customers arrive, wait 
for service, and depart after service completion. In this and subsequent sections we 
restrict ourselves to single-station queues. 

Definitions: 

A queueing system consists of a set of customers, who arrive at a service facility
according to a specified arrival process. If a server is available then the customer is
served immediately, with the length of time required to carry out the service determined
by a service-time distribution. If a server is not free, the customer joins the queue
and is later served according to a service discipline, which specifies the order in which
customers are selected for service from the queue. Throughout, the service discipline is
assumed to be First-Come-First-Served (FCFS). Alternative service disciplines include
Last-Come-First-Served, service in random order, and service according to a tiered
priority scheme.

The queue capacity is the maximum number of customers allowed in the system, 
either being served or awaiting service. Unless otherwise specified, the queue capacity 
is assumed to be infinite. 

In a single-station queueing system, customers arrive, wait for service, and depart 
after service completion. 

An exponential distribution with parameter $\lambda$ is a density function (§7.3.1) having
the form $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$.

An arrival process is Poisson if the interarrival times (times between successive arrivals) 
are independent and identically distributed exponential random variables. 

A random variable has an Erlang distribution with phase parameter k if it is the sum 
of k > 1 independent and identically distributed exponential random variables. 

Facts: 

1. If the random variable $X$ has an exponential distribution with parameter $\lambda$, then
$E(X) = \frac{1}{\lambda}$ and $\mathrm{Var}(X) = \frac{1}{\lambda^2}$.

2. If the arrival process is Poisson with parameter $\lambda$, then the number of customers
arriving in an interval of time of length $x$ is a Poisson random variable (§7.3.1) with
parameter $\lambda x$.

3. The Erlang distribution is a special type of gamma distribution (§7.3.1). 

4. Kendall's notation: A single-station queueing system is described by the 5-tuple:
interarrival-time distribution/service-time distribution/number of servers/waiting room
capacity/service discipline.

5. The following symbols are standard in describing queueing systems according to the 
scheme in Fact 4: 

• M — exponential (M for Memoryless); 

• Ek — Erlang with phase parameter k ; 

• D — deterministic (constant); 

• G — general. 

6. More complicated queueing systems can consist of networks of queues, multiple types 
of customers, and priority schemes. 

7. A World Wide Web site that provides a list of over 100 books on queueing theory
can be found at the site:

http://supernova.uwindsor.ca/people/hlynka/qbook.html

8. A compilation of queueing theory software can be found at the site:

http://supernova.uwindsor.ca/people/hlynka/qsoft.html


Examples: 

1. A single-station queueing system is depicted by the schematic diagram in the follow- 
ing figure. Here customers randomly join the system (according to the arrival process), 
wait for service in the waiting room, are served (which takes a random amount of time), 
and then depart from the system. 


[Figure: schematic of a single-station queue, showing arrivals, the waiting room, the server, and departures.]



2. An M/G/3/10/LCFS system has Poisson arrivals (exponential interarrival times), 
general service times, three servers, room for ten customers (including those in service), 
and Last-Come-First-Served service discipline. 

3. An M/M/1 queue has Poisson arrivals, exponential service times, a single server, 
infinite waiting room, and FCFS service discipline. 

4. Airplane landings: The landing of aircraft at an airport can be viewed as a queueing 
system in which the aircraft are the customers and the runways are the servers. Aircraft 
arrive according to a certain stochastic arrival process, and the length of time to land 
follows a certain service-time distribution. Those aircraft that are unable to land must 
join the queue of circling aircraft, awaiting service (normally according to a FCFS 
discipline, except in the case of an emergency landing, which would be a type of priority 
scheme) . 

5. Communication network: Messages arrive according to a Poisson process at rate $\lambda$
per second and are to be transmitted over a particular data link. The time required to
transmit a single message is exponentially distributed, with average duration $1/\mu$ seconds.
Messages waiting to be sent are stored in an input buffer. If the buffer has infinite
capacity, then this system is an M/M/1 queue. If the input buffer has finite capacity $c$,
then this system is an M/M/1/c queue.

6. Banking: Customers arriving at a bank form a single common queue and are served 
by the s available tellers in FCFS order. If arrivals are Poisson and the length of time 
to service a customer is exponential, this system can be modeled as an M/M/s queue. 

7. Remote computing: A computer center has c dial-up telephone lines. Users can 
dial into the central computer from their remote devices using any of the lines. (Calls 
roll over to an available line if one is free.) If arrivals are Poisson and the length of time 
spent online follows an arbitrary distribution, then this is an M/G/c/c queue. It is also 
known as a loss system, since any calls to the central computer receive a busy signal 
when all servers (lines) are occupied, and hence these calls are “lost” to the system. 


7.8.2 GENERAL SYSTEMS 

This section presents results applicable to single-station queues with general arrival 
patterns, service-time distributions, and queue disciplines. 

Definitions: 

For a single-station queueing system, define: 

• $A_n$ = the arrival time of the $n$th customer to the system;

• $S_n$ = the service time of the $n$th customer;

• $D_n$ = the departure time of the $n$th customer;

• $A(t)$ = the total number of arrivals up to and including time $t$;

• $D(t)$ = the total number of departures up to and including time $t$;

• $X(t)$ = the total number of customers waiting in the system at time $t$.

The stochastic process $\{X(t) \mid t \ge 0\}$ is the queue-length process.

The state distribution following an arbitrary departure is
$\pi_j = \lim_{n\to\infty} \Pr(X(D_n^+) = j)$, for $j \ge 0$.

The state distribution prior to an arbitrary arrival is
$\pi_j^* = \lim_{n\to\infty} \Pr(X(A_n^-) = j)$, for $j \ge 0$.

The state distribution at an arbitrary time point, or steady-state distribution, is
$p_j = \lim_{t\to\infty} \Pr(X(t) = j)$, for $j \ge 0$.

The queue-length process (or the queueing system) is stable if the steady-state
distribution $\{p_j \mid j \ge 0\}$ exists and $\sum_{j=0}^{\infty} p_j = 1$.

The waiting time of the $n$th customer is $W_n = D_n - A_n$; it includes the service time.

The steady-state expected waiting time is $W = \lim_{n\to\infty} \frac{1}{n}\sum_{k=1}^{n} W_k$, if the limit exists.

The long-run arrival rate is $\lambda = \lim_{n\to\infty} \frac{n}{A_n}$, if the limit exists.

The steady-state expected number in the system is $L = \lim_{t\to\infty} \frac{1}{t}\int_0^t X(u)\,du$, if the
limit exists.

Facts:

1. The number of customers in the system at any time equals the total number of
arrivals up to that time minus the number of departures up to that time: that is,
$X(t) = A(t) - D(t)$ for $t \ge 0$.

2. A sample path of the queue-length process $\{X(t) \mid t \ge 0\}$ is piecewise constant,
with upward jumps at points of arrival and downward jumps at points of departure.

3. Suppose all the jumps in the sample paths of $\{X(t) \mid t \ge 0\}$ are of size $\pm 1$, with
probability 1. If either $\pi_j$ or $\pi_j^*$ exists, then $\pi_j = \pi_j^*$ for all $j \ge 0$.

4. PASTA (Poisson Arrivals See Time Averages): If $\{A(t) \mid t \ge 0\}$ is Poisson and,
for every $s \ge 0$, $\{A(t) \mid t > s\}$ is independent of $\{X(u) \mid 0 \le u \le s\}$, then $p_j = \pi_j^*$ for
all $j \ge 0$.

5. Little's Law (J. D. C. Little, 1961): $L = \lambda W$.

Examples: 

1. Suppose arrivals to a system occur deterministically, every 3 minutes, and service 
times are deterministic, each taking 2 minutes. Since every arriving customer is served 
immediately, either X(t) = 0 (no customers) or X(t) = 1 (a single customer). Every 
arrival finds an empty system and every departure leaves an empty system: $\pi_0 = 1 = \pi_0^*$,
$\pi_1 = 0 = \pi_1^*$, as required by Fact 3. (The steady-state distribution does not exist.)

2. On average $\lambda = 24$ customers per hour arrive at a copy shop. Typically, there are
$L = 9$ customers in the store at any time. Using Little's law, $W = \frac{L}{\lambda} = 0.375$ hour, so
that each customer spends on average 0.375 hours (or 22.5 minutes) in the shop.

3. The steady-state queue length or waiting time in a queueing system can be reduced
by increasing the service rate. Suppose the long-run arrival rate doubles, but the service
rate is increased so that the steady-state expected waiting time remains the same. Then
by Little's law the steady-state expected queue length will also double.

4. Machine repair: A single machine is either working or being repaired. Suppose that
the time between breakdowns is exponentially distributed with mean $\frac{1}{\lambda}$ and the
time to repair the machine is exponentially distributed with mean $\frac{1}{\mu}$. This is then
an M/M/1/1 queueing system with a single customer, corresponding to a broken down
machine. $X(t) = 0$ signifies that the machine is working, and $X(t) = 1$ signifies that the
machine is being repaired. A sample path of this system is shown in the following figure,
with the machine initially working. Over a long period of time, after $N$ breakdowns and
subsequent repairs, the machine is working for $N(\frac{1}{\lambda})$ units of time and is being repaired
for $N(\frac{1}{\mu})$ units of time. The long run proportion of time the machine is working is then

$$\frac{N(\frac{1}{\lambda})}{N(\frac{1}{\lambda}) + N(\frac{1}{\mu})} = \frac{\mu}{\lambda + \mu}.$$

This value also turns out to be the steady-state probability of finding the system in
state 0, with the machine working.


7.8.3 SPECIAL QUEUEING SYSTEMS

This section summarizes analytical results about special types of single-station queues. 

Notation: 

• $\frac{1}{\lambda}$ = expected interarrival time;

• $A(\cdot)$ = Laplace transform of the interarrival-time density;

• $\frac{1}{\mu}$ = expected service time;

• $B(\cdot)$ = Laplace transform of the service-time density;

• $\sigma^2$ = variance of the service time;

• $s$ = number of servers;

• $\rho = \frac{\lambda}{s\mu}$ = traffic intensity.

The probability generating function (§7.3.3) for the steady-state distribution $\{p_j\}$ of a
queueing system is $\phi(z) = \sum_{j=0}^{\infty} p_j z^j$, $|z| \le 1$.

The Laplace transform for the waiting-time distribution $f(w)$ of a queueing system is
$\psi(s) = \int_0^{\infty} e^{-sw} f(w)\,dw$, $\mathrm{Re}(s) \ge 0$.

Facts: 

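1. The M/M/1 queue is stable if $\rho < 1$. The following results hold when the queue is
stable:

$$p_j = (1 - \rho)\rho^j = \pi_j = \pi_j^*, \quad j \ge 0$$
$$L = \frac{\rho}{1 - \rho}$$
$$W = \frac{1}{\mu - \lambda}.$$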


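2. The M/M/1/K queue is always stable. Assume $\rho \ne 1$.

$$p_j = \frac{1 - \rho}{1 - \rho^{K+1}}\,\rho^j, \quad 0 \le j \le K$$
$$\pi_j = \frac{p_j}{1 - p_K} = \pi_j^*, \quad 0 \le j \le K-1$$
$$L = \frac{\rho}{1 - \rho}\left(\frac{1 - \rho^K}{1 - \rho^{K+1}} - K p_K\right)$$
$$W = \frac{L}{\lambda(1 - p_K)}.$$

If $\rho = 1$, the above formulas reduce to:

$$p_j = \frac{1}{K+1}, \quad 0 \le j \le K$$
$$\pi_j = \pi_j^* = \frac{1}{K}, \quad 0 \le j \le K-1$$
$$L = \frac{K}{2}$$
$$W = \frac{K+1}{2\mu}.$$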
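
3. The M/M/s queue is stable if $\rho < 1$. The following results hold when the queue is
stable:

$$p_0 = \left[\sum_{j=0}^{s-1} \frac{(s\rho)^j}{j!} + \frac{(s\rho)^s}{s!\,(1 - \rho)}\right]^{-1}$$
$$p_j = \begin{cases} p_0\,\dfrac{(s\rho)^j}{j!}, & 0 \le j < s \\[6pt] p_0\,\dfrac{s^s \rho^j}{s!}, & j \ge s \end{cases}$$
$$\pi_j = \pi_j^* = p_j, \quad j \ge 0$$
$$L = \rho\left(s + \frac{p_s}{(1 - \rho)^2}\right)$$
$$W = \frac{1}{s\mu}\left(s + \frac{p_s}{(1 - \rho)^2}\right)$$
$$E(\text{number of busy servers}) = \frac{\lambda}{\mu}.$$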

4. The M/M/$\infty$ queue is always stable:

$$p_j = e^{-\lambda/\mu}\,\frac{(\lambda/\mu)^j}{j!}, \quad j \ge 0$$
$$\pi_j = \pi_j^* = p_j, \quad j \ge 0$$
$$L = \frac{\lambda}{\mu}.$$

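5. The M/G/1 queue is stable if $\rho < 1$. The following results hold when the queue is
stable:

$$p_0 = 1 - \rho$$
$$\phi(z) = \frac{(1 - \rho)(1 - z)\,B(\lambda - \lambda z)}{B(\lambda - \lambda z) - z}$$
$$\psi(s) = \frac{(1 - \rho)\,s\,B(s)}{s - \lambda(1 - B(s))}$$
$$L = \rho + \frac{\rho^2 + \lambda^2\sigma^2}{2(1 - \rho)}$$
$$W = \frac{1}{\mu} + \frac{\lambda\left((1/\mu)^2 + \sigma^2\right)}{2(1 - \rho)}.$$

The last four equations are the various forms of the Pollaczek-Khintchine formula. No
closed form results are available for M/G/c queues for $2 \le c < \infty$.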

6. The M/G/c/c queue, also called a loss system, is always stable. The main result is:

$$p_j = \frac{\rho^j / j!}{\sum_{i=0}^{c} \rho^i / i!}, \quad 0 \le j \le c,$$

where here $\rho = \frac{\lambda}{\mu}$.


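7. The M/G/$\infty$ queue is always stable:

$$p_j = e^{-\lambda/\mu}\,\frac{(\lambda/\mu)^j}{j!}, \quad j \ge 0$$
$$\pi_j = \pi_j^* = p_j, \quad j \ge 0$$
$$L = \frac{\lambda}{\mu}.$$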

8. The G/M/1 queue is stable if $\rho < 1$. When the queue is stable there is a unique
solution $\alpha \in (0, 1)$ to $\alpha = A(\mu(1 - \alpha))$. The following results hold when the queue is
stable:

$$\pi_j = (1 - \alpha)\alpha^j = \pi_j^*, \quad j \ge 0$$
$$p_0 = 1 - \rho$$
$$p_j = \rho(1 - \alpha)\alpha^{j-1}, \quad j \ge 1$$
$$L = \frac{\rho}{1 - \alpha}$$
$$W = \frac{1}{\mu}\cdot\frac{1}{1 - \alpha}.$$

The G/M/c queue can be analytically solved, but the results are complicated.


9. The G/M/$\infty$ queue is always stable:

$$L = \frac{\lambda}{\mu}$$
$$\text{variance of number in system} = \frac{\lambda}{\mu} - \frac{\lambda^2}{\mu^2} + \frac{\lambda}{\mu}\cdot\frac{A(\mu)}{1 - A(\mu)}.$$
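
The root $\alpha$ in Fact 8 rarely has a closed form, but it can be approximated by iterating $\alpha \leftarrow A(\mu(1 - \alpha))$ from 0, which increases to the smallest fixed point. The following sketch assumes the transform $A(\cdot)$ is supplied as a Python function; the two interarrival distributions used (exponential and deterministic) and all names are illustrative choices, not part of the Handbook.

```python
import math

def gm1_alpha(A, mu, tol=1e-13, max_iter=100_000):
    """Approximate the root in (0, 1) of alpha = A(mu * (1 - alpha)).

    A  : Laplace transform of the interarrival-time density, as a callable
    mu : service rate
    Starting from 0, the iterates increase to the smallest fixed point,
    which is the required root when the queue is stable (rho < 1).
    """
    alpha = 0.0
    for _ in range(max_iter):
        nxt = A(mu * (1.0 - alpha))
        if abs(nxt - alpha) < tol:
            return nxt
        alpha = nxt
    return alpha

def gm1_wait(A, mu):
    """Steady-state expected waiting time W = 1 / (mu * (1 - alpha))."""
    return 1.0 / (mu * (1.0 - gm1_alpha(A, mu)))

if __name__ == "__main__":
    lam, mu = 0.5, 2.0 / 3.0  # rho = 0.75

    def exponential(s):       # interarrival times exponential with rate lam
        return lam / (lam + s)

    def deterministic(s):     # interarrival times constant, equal to 1/lam
        return math.exp(-s / lam)

    print(gm1_alpha(exponential, mu), gm1_wait(exponential, mu))
    print(gm1_alpha(deterministic, mu), gm1_wait(deterministic, mu))
```

With exponential interarrival times the queue is M/M/1, so the computed $\alpha$ should agree with $\rho$ and $W$ with $\frac{1}{\mu - \lambda}$, which gives a convenient sanity check on the iteration.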


Examples: 

1. At a drop-in legal clinic, the lawyer sees four clients during a typical (eight hour)
day. Each client's case requires on average 1.5 hours of the lawyer's time. If arrivals are
Poisson and service times are exponentially distributed, then this is an M/M/1 queue
with $\lambda = \frac{4}{8} = \frac{1}{2}$ customers per hour and $\mu = \frac{1}{1.5} = \frac{2}{3}$. Here $\rho = \frac{\lambda}{\mu} = \frac{3}{4} < 1$, so the queue is
stable. Using Fact 1, $p_0 = 1 - \rho = \frac{1}{4}$, so there is probability $\frac{1}{4}$ that the lawyer is idle.
The expected number of clients in the clinic is $L = \frac{\rho}{1 - \rho} = 3$ and the average wait of a
client is $W = \frac{1}{\mu - \lambda} = 6$ hours.

2. Customers arrive at a service station according to a Poisson process with rate 10 
per hour. The manager has two options: (a) employ a single fast server who can service 
a customer in 5 minutes on average, or (b) employ two slow servers each taking 10 
minutes on average to serve a customer. Assume that the service times are exponential. 
Which option should the manager implement to minimize the expected waiting time in 
steady state? 

Under (a) the system is an M/M/1 queue with $\lambda = 10$, $\mu = 12$. Since $\rho = \frac{10}{12} < 1$,
the system is stable. By Fact 1, $W = \frac{1}{12 - 10} = 0.5$ hours. Under (b) the system is an
M/M/2 queue with $\lambda = 10$ and $\mu = 6$. The system is stable since $\rho = \frac{\lambda}{2\mu} = \frac{10}{12} < 1$. From
Fact 3, $p_0 = \frac{1}{11}$, $p_2 = \frac{25}{198}$, and $W = \frac{1}{12}\left(2 + \frac{p_2}{(1 - \rho)^2}\right) = 0.55$ hours. Thus option (a) is better. In
general, it is better to employ a few fast servers than many slow servers with the same
overall service capacity.
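
The comparison in this example can be reproduced numerically from the M/M/s formulas of Fact 3 (with $s = 1$ they reduce to Fact 1). This is a minimal sketch; the function name `mms_metrics` is illustrative.

```python
from math import factorial

def mms_metrics(lam, mu, s):
    """Return (p0, ps, L, W) for a stable M/M/s queue.

    lam : arrival rate, mu : service rate per server, s : number of servers.
    """
    rho = lam / (s * mu)
    if rho >= 1:
        raise ValueError("queue is not stable (rho >= 1)")
    a = lam / mu  # offered load, equal to s * rho
    p0 = 1.0 / (sum(a**j / factorial(j) for j in range(s))
                + a**s / (factorial(s) * (1.0 - rho)))
    ps = p0 * a**s / factorial(s)
    L = rho * (s + ps / (1.0 - rho) ** 2)
    W = L / lam  # Little's law; equals (1/(s*mu)) * (s + ps/(1-rho)**2)
    return p0, ps, L, W

if __name__ == "__main__":
    # Option (a): one fast server; option (b): two slow servers.
    print(mms_metrics(10, 12, 1))
    print(mms_metrics(10, 6, 2))
```

The second call gives $p_0 = 1/11$ and $W = 6/11 \approx 0.55$ hours, against $W = 0.5$ hours for the single fast server.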

3. A system manager has the option of using one of three possible servers in a single- 
server system. The service times under the first server are exponential with mean of 6 
minutes. Under the second server they are uniformly distributed over [4,8] minutes. 
Under the third they are constant, equal to 6 minutes. The customers arrive according 
to a Poisson process with rate 8 per hour. Which server should be chosen to minimize 
the expected waiting time in steady state? 

The mean service time is 6 minutes; i.e., $\mu = 10$ per hour, for all three servers.
However, the variances $\sigma^2$ are different. This M/G/1 system is stable under all three
servers since $\rho = \frac{8}{10} < 1$. For server one, $\sigma^2 = 0.01$ (hours)$^2$ and $W = 0.5$ hours. For
the second server, $\sigma^2 = \frac{(1/15)^2}{12} = 0.000370$ (hours)$^2$ and
$W = \frac{1}{10} + \frac{8(0.01 + 0.000370)}{2(1 - 0.8)} = 0.31$ hours. For the
third server, $\sigma^2 = 0.0$ (hours)$^2$ and $W = \frac{1}{10} + \frac{8(0.01)}{2(1 - 0.8)} = 0.3$ hours. Thus, it is best to use the
server with constant service times. In general, reducing the variance of the service times
has a beneficial effect on the system.
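
The three waiting times follow directly from the Pollaczek-Khintchine expression for $W$ in Fact 5. A short sketch, with an illustrative function name and time measured in hours:

```python
def mg1_wait(lam, mean_service, var_service):
    """Steady-state expected waiting time W (including service) in an M/G/1 queue."""
    rho = lam * mean_service
    if rho >= 1:
        raise ValueError("queue is not stable (rho >= 1)")
    return mean_service + lam * (mean_service**2 + var_service) / (2.0 * (1.0 - rho))

if __name__ == "__main__":
    lam = 8.0    # arrivals per hour
    mean = 0.1   # 6 minutes
    variances = {
        "exponential": mean**2,                    # variance = mean squared
        "uniform [4, 8] min": (4.0 / 60.0) ** 2 / 12.0,
        "constant": 0.0,
    }
    for name, var in variances.items():
        print(name, round(mg1_wait(lam, mean, var), 2))
```

Only the variance changes between the three calls, which isolates its effect on $W$.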

4. A small business wants to install a telephone system with multiple lines, though
without any capacity for call queueing. This is to be done to ensure that 95% of the
calls made to the business get answered. Suppose that the arrival process is Poisson with
$\lambda = 10$ calls per hour, and that the average call lasts 5 minutes. This is an M/G/c/c loss
system (Fact 6), and it is necessary to find the smallest value of $c$ such that $p_c \le 0.05$.
Using Fact 6 with $\rho = \frac{\lambda}{\mu} = \frac{5}{6}$ and $c = 1$ gives $p_1 = \frac{5/6}{1 + 5/6} = 0.45$. Similar calculations
give $p_2 = 0.16$ for $c = 2$ and $p_3 = 0.042$ for $c = 3$. Consequently, three lines are needed
to ensure the stipulated grade-of-service requirement.
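
The search for the smallest adequate number of lines can be automated with the loss formula of Fact 6. The sketch below uses illustrative names (`blocking_probability`, `lines_needed`) and takes the offered load $\rho = \lambda/\mu$ as input.

```python
from math import factorial

def blocking_probability(load, c):
    """p_c for an M/G/c/c loss system with offered load = lambda / mu."""
    terms = [load**i / factorial(i) for i in range(c + 1)]
    return terms[c] / sum(terms)

def lines_needed(load, max_blocking):
    """Smallest c such that the blocking probability p_c is at most max_blocking."""
    c = 0
    while blocking_probability(load, c) > max_blocking:
        c += 1
    return c

if __name__ == "__main__":
    load = 10.0 / 12.0  # lambda = 10 per hour, mu = 12 per hour
    for c in (1, 2, 3):
        print(c, round(blocking_probability(load, c), 3))
    print(lines_needed(load, 0.05))
```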


7.9 SIMULATION 

Simulation is a technique for numerically estimating the performance of a complex 
stochastic system when analytic solution is not feasible. This section discusses both 
discrete-event and Monte Carlo simulation. In discrete-event simulation models, the 
passage of time plays a key role, as changes to the state of the system occur only at 
certain points in simulated time. For example, queueing and inventory systems can be 
studied by discrete-event simulation models. Monte Carlo simulation models do not, 
however, require the passage of time. Such models are useful in estimating eigenvalues, 
estimating $\pi$, and estimating the quantiles of a mathematically intractable test statistic
in hypothesis testing. Simulation has been described [BrEtal87] as “driving a model of 
a system with suitable inputs and observing the corresponding outputs.” Accordingly, 
the following three subsections discuss input modeling, output analysis, and simulation 
programming languages. 


7.9.1 INPUT MODELING 

This section addresses three key issues in constructing a simulation model: 

• determining a source of randomness to drive the probabilistic aspects of the model;

• input model selection to determine the appropriate probabilistic models to drive the
simulation;

• random variate generation algorithms that transform random numbers to random variates.

Definitions: 

Random numbers are real numbers generated uniformly over the interval (0, 1). 

A random number generator is any mechanism or algorithm for generating random 
numbers. 

Pseudo-random numbers are values generated deterministically, but that appear to 
behave like independent and identically distributed random numbers. 


© 2000 by CRC Press LLC 



Let $m$ be a large prime integer. A purely multiplicative linear congruential
generator (§4.3.1) produces a stream of pseudo-random numbers $\{X_i/m \mid i \ge 1\}$ based on the
recursive relationship $X_{i+1} = aX_i \bmod m$, where $a$ is an integer multiplier between 1
and $m - 1$, and $X_0$ is an integer seed between 1 and $m - 1$.

An input model characterizes the stochastic elements of a discrete-event simulation 
model. 

A trace-driven input model generates a process that is identical to the collected data 
values without relying on a parametric model. 

A random variate is a realization of a random variable. 

Facts: 

1. Stochastic simulations typically derive their source of randomness from random 
numbers. That is, inputs to the simulated system need to be generated according to a 
specified probability model, a task that can be accomplished by suitably transforming 
(uniform) random numbers. 

2. Desirable properties for random number generators include: uniformity, indepen- 
dence, speed, minimal memory requirements, ease of implementation, portability across 
various computer systems, reproducibility, and multiple stream capability. 

3. Although numerous methods have been proposed for generating random numbers, 
multiplicative linear congruential generators are typically used to produce a stream of 
pseudo-random numbers. 

4. Due to the prevalence of 32-bit computer architecture, $m$ is often chosen to be
$2^{31} - 1$, which is prime.

5. A full period generator, which cycles through all $m - 1$ possible $X_i$ values prior to
repeating, is obtained by selecting $a$ to be a primitive root modulo $m$. (See §4.7.1.)

6. Software for pseudo-random number generators can be found at the sites:

• http://random.mat.sbg.ac.at/others/#MCSoftware

• http://www.taygeta.com/random.html

• http://www.isye.gatech.edu/informs-sim/#ware

7. Additional information on the theoretical and empirical performance of a variety of
pseudo-random number generators is available at:

• http://random.mat.sbg.ac.at/generators/

8. If the multiplier $a$ is chosen so that it is "modulus-compatible" with $m$, then potential
overflow can be averted for large values of $m$. Two values of $a$ that are often used with
$m = 2^{31} - 1$ are $a = 7^5 = 16807$ and $a = 48271$ [PaMi88].

9. Successful input modeling for a discrete-event simulation requires a close match 
between the input model and the true underlying probabilistic mechanism associated 
with the system. 

10. One of the first steps in determining an appropriate input model for an element of 
a discrete-event simulation is to assess whether the observations are independent and 
identically distributed. 

11. An input model can be specified in several ways: e.g., using a cumulative distribu- 
tion function, joint probability density function, hazard function, intensity function, or 
variate-generation algorithm. 

12. Many input models rely on parametric probabilistic models such as the binomial, 
normal, and Weibull distributions. Maximum likelihood is typically used to estimate 
parameters of these models. 




Algorithm 1: Inverse transformation method.

input: cumulative distribution function F
output: random variates X from this distribution

generate U uniformly over (0, 1)

X := F^{-1}(U)


13. Bézier curves [FlWi93] offer a unique combination of the parametric and nonparametric
approaches. After an initial distribution is fitted to the data set, the modeler
decides whether differences between the empirical and fitted models represent sampling
variability (chance variation) or an aspect of the distribution that should be included
in the input model.

14. Multivariate distributions (e.g., the multivariate normal distribution with mean $\mu$
and variance-covariance matrix $\Sigma$) are considered by [Jo87].

15. Once an input model has been chosen, random variate generation algorithms are
used to transform random numbers to variates from the input model.

16. Devroye [De86] gives algorithms for converting random numbers to random variates
associated with input models chosen to drive the simulation.

17. Techniques commonly used for generating random variates from univariate
probability distributions are: inverse transformation, composition, acceptance/rejection, and
special properties.

18. Algorithm 1, which shows the inversion method, is based on the probability integral
transformation. It is assumed that the cumulative distribution function $F(x)$ for the
input model of interest has the inverse $F^{-1}(U)$.

19. Other topics in variate generation include table methods, generating from
multivariate distributions, random sampling, estimating integrals, and generating processes
correlated in time.
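
The multiplicative generator described in the definitions and in Facts 4, 5, and 8 can be sketched in a few lines. The code below uses $m = 2^{31} - 1$ and $a = 16807$ from Fact 8. The second function shows one common way to avoid intermediate overflow in fixed-width arithmetic (often attributed to Schrage); presenting it as the meaning of "modulus-compatible" is an interpretation, and Python's unbounded integers do not actually need it. Function names are illustrative.

```python
M = 2**31 - 1   # modulus, a prime
A = 16807       # multiplier 7**5

def lcg_next(x):
    """One step of X_{i+1} = a * X_i mod m, using unbounded integers."""
    return (A * x) % M

def lcg_next_no_overflow(x):
    """Same step with every intermediate value below m in magnitude.

    Uses m = a*q + r with q = m // a and r = m % a, which works when r < q.
    """
    q, r = M // A, M % A
    x = A * (x % q) - r * (x // q)
    return x if x > 0 else x + M

def uniforms(seed, count):
    """Return `count` pseudo-random numbers X_i / m in (0, 1)."""
    x = seed
    out = []
    for _ in range(count):
        x = lcg_next(x)
        out.append(x / M)
    return out

if __name__ == "__main__":
    print(uniforms(seed=1, count=5))
```

Because the seed lies between 1 and $m - 1$ and $m$ is prime, the state never reaches 0, so every returned value lies strictly between 0 and 1.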


Examples: 

1. Suppose that a sequence of arrival times (e.g., of customers at a bank) is collected 
over a 24-hour time period. A trace-driven input model for the arrival process is gener- 
ated by having arrivals occur at the same times as the observed values. 

2. Let $t_1, t_2, \ldots, t_n$ be the arrival times to a queue collected on the time interval $(0, c]$.
If the times between arrivals are independent and identically distributed, a parametric
or nonparametric model can be fitted to the data. In the former case, parameters are
often estimated by maximizing the likelihood function [LaKe91]

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta)$$

where $x_i = t_i - t_{i-1}$ for $i = 1, 2, \ldots, n$, $t_0 = 0$, $\theta = (\theta_1, \theta_2, \ldots, \theta_p)$ is a vector of unknown
parameters, and $f(x_i; \theta)$ is the probability density function of the interarrival times.

3. If the interarrival times to a queue (as in Example 2) are not independent and
identically distributed, then a nonstationary point process might be considered, such as a
nonhomogeneous Poisson process, where the arrival rate $\lambda(t; \theta)$ varies over time. One
parametric model is the power law process, with intensity function $\lambda(t; \lambda, \kappa) = \lambda^{\kappa} \kappa t^{\kappa - 1}$
for $t > 0$, where $\lambda$ and $\kappa$ are positive parameters. The likelihood function for the single
realization on $(0, c]$ is

$$L(\lambda, \kappa) = \left(\prod_{i=1}^{n} \lambda(t_i; \lambda, \kappa)\right) \exp\left(-\int_0^c \lambda(t; \lambda, \kappa)\,dt\right).$$

Maximum likelihood estimators can be determined by maximizing $L(\lambda, \kappa)$ or its logarithm with
respect to $\lambda$ and $\kappa$. Confidence regions for the unknown parameters can be found by
using asymptotic properties of the likelihood ratio statistic or the observed information
matrix [LaKe91]. As with all statistical modeling, goodness-of-fit tests should be
performed in order to assess the model adequacy.


4. Weibull distribution: The Weibull distribution has cumulative distribution function
$F(x) = 1 - e^{-(\lambda x)^{\kappa}}$ for $x > 0$. The inverse cumulative distribution function is

$$F^{-1}(y) = \frac{[-\ln(1 - y)]^{1/\kappa}}{\lambda}$$

for $0 < y < 1$. Algorithm 1 can be used to generate a Weibull variate according to

$$X := \frac{[-\ln(1 - U)]^{1/\kappa}}{\lambda}.$$
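
Algorithm 1 applied to the Weibull distribution can be sketched as follows. The parameterization $F(x) = 1 - e^{-(\lambda x)^{\kappa}}$ is the one used in this example; the function names are illustrative, and Python's standard `random` module stands in here for the source of uniform random numbers, which is an assumption of the sketch rather than a generator the Handbook prescribes.

```python
import math
import random

def weibull_cdf(x, lam, kappa):
    """F(x) = 1 - exp(-(lam * x) ** kappa) for x > 0."""
    return 1.0 - math.exp(-((lam * x) ** kappa))

def weibull_inverse_cdf(y, lam, kappa):
    """F^{-1}(y) = (-ln(1 - y)) ** (1 / kappa) / lam for 0 <= y < 1."""
    return (-math.log(1.0 - y)) ** (1.0 / kappa) / lam

def weibull_variates(n, lam, kappa, seed=12345):
    """Inverse transformation method: X := F^{-1}(U) with U uniform on [0, 1)."""
    rng = random.Random(seed)
    return [weibull_inverse_cdf(rng.random(), lam, kappa) for _ in range(n)]

if __name__ == "__main__":
    sample = weibull_variates(5, lam=0.5, kappa=2.0)
    print(sample)
    # kappa = 1 reduces to the exponential distribution with rate lam.
    print(weibull_variates(5, lam=0.5, kappa=1.0))
```

Setting `kappa = 1.0` yields exponential variates, which is the special case needed for simulating Poisson arrivals and exponential service times.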


5. M/M/1 queue: To simulate the operation of a single-server queue (§7.8.1) with
Poisson arrivals at rate $\lambda$ and exponential service times with mean $\frac{1}{\mu}$, exponentially
generated variates are needed for the interarrival times $\{I_n\}$ and the service times $\{S_n\}$.
These are available as a special case of the Weibull distribution (Example 4) with $\kappa = 1$
and can be generated using

$$I_n = \frac{-\ln(1 - U_n)}{\lambda} \qquad \text{and} \qquad S_n = \frac{-\ln(1 - V_n)}{\mu},$$

with the $\{U_n\}$, $\{V_n\}$ generated uniformly over $(0, 1)$.

A concrete example is provided in the following table, which shows one simulated
run of an M/M/1 queue with $\lambda = 0.5$ and $\mu = 0.7$. The table shows, in successive
columns, the following values for each customer $n$: the interarrival time $I_n$, the arrival
time $A_n$, the service time $S_n$, the beginning time of service $B_n$, the departure time $D_n$,
and the waiting time $W_n = D_n - A_n$. Notice that customers 1, 4, 7, 8, 9 are served
immediately and incur no waiting time in the queue.


customer     I_n      A_n      S_n      B_n      D_n      W_n
    1       5.44     5.44     0.78     5.44     6.22     0.78
    2       0.61     6.05     2.77     6.22     8.99     2.94
    3       0.35     6.40     0.96     8.99     9.95     3.55
    4       4.12    10.52     2.42    10.52    12.94     2.42
    5       0.54    11.06     0.88    12.94    13.82     2.76
    6       2.07    13.13     0.87    13.82    14.69     1.56
    7       6.82    19.95     0.86    19.95    20.81     0.86
    8       2.19    22.14     0.76    22.14    22.90     0.76
    9       4.09    26.23     3.31    26.23    29.54     3.31
   10       0.02    26.25     0.01    29.54    29.55     3.30

7.9.2 OUTPUT ANALYSIS 

Once a verified and validated simulation model has been developed, a modeler typi- 
cally wants to estimate measures of performance associated with outputs of the model. 
Although there are often several performance measures of interest, a single measure of
performance $\theta$ (e.g., the mean waiting time in a queue) is studied here.

This section discusses using point estimation to compute an estimate for $\theta$,
determining a confidence interval for the point estimate, and using variance reduction
techniques to obtain more precise point estimates.
Definitions:


Suppose $\{Y_i\}$ is the output stochastic process. If the output stochastic process
consists of independent observations obtained from a population with cumulative
distribution function $F_Y$, the pth quantile of $F_Y$ is the value $y_p$ such that $F_Y(y_p) = p$.
The median of $F_Y$ corresponds to $p = 0.5$.

The sample mean of the observations $Y_1, Y_2, \ldots, Y_n$ is given by
$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$.

If the values $Y_1, Y_2, \ldots, Y_n$ are rearranged so that $Y_{(1)} \le Y_{(2)} \le \cdots \le Y_{(n)}$, then $Y_{(i)}$ is
the ith order statistic.

The mean $\mu_Y$ of the process is the asymptotic mean of the output process $\{Y_i\}$.

The variance $\sigma_Y^2$ of the process is the asymptotic mean of the output process
$\{(Y_i - \bar{Y})^2\}$.

The probability $\Pr(A)$ of event $A$ is the asymptotic mean of the output process $\{I(A)\}$,
where $I$ is the 0-1 indicator function for event $A$.

The output process $Y_1, Y_2, \ldots$ is covariance stationary if, for finite mean $\mu$ and finite
variance $\sigma^2 > 0$, $E(Y_i) = \mu$, $i = 1, 2, \ldots$, $\mathrm{Var}(Y_i) = \sigma^2$, $i = 1, 2, \ldots$, and
$\mathrm{Cov}(Y_i, Y_{i+j})$ is independent of $i$, for $j = 1, 2, \ldots$.

Variance reduction techniques are strategies for obtaining greater precision for a 
fixed amount of sampling. 
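
The sample mean and order statistics defined above translate directly into code. The following Python sketch also shows one way to estimate a quantile from order statistics, interpolating between $Y_{(s)}$ and $Y_{(s+1)}$ with $s = \lfloor p(n+1) \rfloor$, which is the estimator described in Fact 3 of this section. The function names are hypothetical, and clamping to the smallest or largest observation when $s$ falls outside $1, \ldots, n-1$ is an assumption of this sketch, not something the text prescribes.

```python
import math


def sample_mean(ys):
    """Arithmetic mean of the observations Y_1, ..., Y_n."""
    return sum(ys) / len(ys)


def quantile_estimate(ys, p):
    """Interpolated order-statistic estimate of the pth quantile."""
    ordered = sorted(ys)           # ordered[k - 1] is the kth order statistic
    n = len(ordered)
    pos = p * (n + 1)
    s = math.floor(pos)            # s = floor(p(n + 1))
    alpha = pos - s                # fractional part used as interpolation weight
    if s < 1:                      # assumption: clamp below the first order statistic
        return ordered[0]
    if s >= n:                     # assumption: clamp at the last order statistic
        return ordered[-1]
    return (1 - alpha) * ordered[s - 1] + alpha * ordered[s]
```

For the four observations 10, 20, 30, 40 and $p = 0.5$, this gives $s = 2$ and $\alpha = 0.5$, so the estimate is the midpoint of the second and third order statistics.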


Facts: 

1. The two most common measures of performance to be estimated are means and 
quantiles. 

2. Point estimates for $\mu_Y$, $\sigma_Y^2$, and $\Pr(A)$ are typically given by the associated
sample means.

3. A simple estimator of $y_p = F_Y^{-1}(p)$ is $Y_{(s)}$, where $s = \lfloor p(n + 1) \rfloor$. This estimator can
be improved (with respect to bias) by estimating $F_Y^{-1}(p)$ with the linear combination
$(1 - \alpha)Y_{(s)} + \alpha Y_{(s+1)}$, where $\alpha = p(n + 1) - \lfloor p(n + 1) \rfloor$.

4. Replication: This is one of the simplest methods of interval estimation, in which 
several runs of a simulation model are used. Classical confidence intervals based on the 
central limit theorem for the measures of interest can then be applied to the output. 

5. The presence of autocorrelation among observations (e.g., the waiting times of adja- 
cent customers in a queue) significantly complicates the statistical analysis of simulation 
output from a single run. 

6. To analyze a single simulation run with autocorrelation present, techniques have 
been developed for determining interval estimates whose actual coverage is close to the 
stated coverage. For many of these techniques, the output is assumed to be covariance 
stationary. These techniques include batch means, overlapping batch means, standard- 
ized time series, regeneration, spectral analysis, and autoregression. 

7. Common random numbers: This is a variance reduction technique in which two 
or more alternative system configurations are analyzed using the same set of random 
numbers for particular purposes (e.g., generating service times). Using common random 
numbers ensures that the output differences are due to the configurations rather than
the sampling variability in the random numbers. 
8. Antithetic variates: This is a second variance reduction technique, applicable to
the analysis of a single system. If the random numbers {Ui} are used for a particular 
purpose in one simulation run, then using {1 — Ui} in a second run typically induces 
a negative correlation between the outputs of the two runs. Thus, the average of the 
output measures from the two runs will have a reduced variance. 

9. There are a variety of variance reduction techniques. See Wilson [Wi84] for a detailed 
discussion. 

10. Other topics in output analysis include initialization bias detection, ranking and 
selection, comparing alternative system designs, experimental design, and optimization. 
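
The effect of antithetic variates can be seen in a small experiment. The Python sketch below is an illustration under stated assumptions, not a method taken from the cited references: the output of each "run" is a single exponential variate obtained by inverse transform, and the function names are hypothetical. It averages the outputs of two runs, driving the second run either with $1 - U$ (antithetic) or with a fresh uniform (independent).

```python
import math
import random
import statistics


def open_uniform(rng):
    """Uniform draw from the open interval (0, 1), so that 1 - u is also in (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def exp_from_uniform(u, rate):
    """Inverse-transform exponential variate."""
    return -math.log(1.0 - u) / rate


def pair_averages(n_pairs, rate, antithetic, seed=1):
    """Average the outputs of two runs, n_pairs times.

    With antithetic=True the second run reuses 1 - u; otherwise it
    draws an independent uniform.
    """
    rng = random.Random(seed)
    averages = []
    for _ in range(n_pairs):
        u1 = open_uniform(rng)
        u2 = 1.0 - u1 if antithetic else open_uniform(rng)
        y1 = exp_from_uniform(u1, rate)
        y2 = exp_from_uniform(u2, rate)
        averages.append((y1 + y2) / 2.0)
    return averages
```

Comparing `statistics.variance(pair_averages(2000, 1.0, True))` with the same call using `False` shows the reduction: both sets of averages estimate the same mean $1/\text{rate}$, but the antithetic pairs are negatively correlated and so their averages vary less.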

Examples: 

1. Confidence intervals for expected waiting times: Let $X_1, X_2, \ldots, X_n$ be the
averages of the waiting times of customers in a single-server queue from $n$ independent
replications of a discrete-event simulation model. A $100(1 - \alpha)\%$ confidence interval
for $\mu$, the steady-state mean waiting time, is

$$\bar{X} - t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}},$$

where $\bar{X}$ is the sample mean, $s$ is the sample standard deviation, and $t_{\alpha/2,\,n-1}$ is the
$1 - \alpha/2$ fractile of the $t$ distribution with $n - 1$ degrees of freedom. The replications must be
“warmed up” to avoid initialization bias. The asymptotic normality of $X_1, X_2, \ldots, X_n$
is assured by the central limit theorem and independence is based on the use of
independent random number streams.

2. M/M/1 queue: The simulation of Example 5 of §7.9.1 was executed so that the
first 200 customer wait times were collected. The measure of performance $\theta$ for the
system is the steady-state expected customer wait time. The initial conditions for
each replication are an empty system and an idle server. The stopping time for each
replication is when the 200th customer departs. Running this simulation experiment
for $n = 100$ replications gave $\bar{X} = 4.72$ and for $n = 500$ replications gave $\bar{X} = 4.76$.
For this simple queueing system, the steady-state analytical solution is
$W = \frac{1}{\mu - \lambda} = 5.0$ (§7.8.3, Fact 1). These averages are biased low since the early waiting
times have a lower expected value than the subsequent waiting times as a result of the
initial conditions. To improve these point estimates, the system was permitted to warm up
for the first 100 customers and the average waiting time was then calculated for the last
100 customers. In this case, rerunning the simulation gave the improved estimates
$\bar{X} = 5.20$ for $n = 100$ and $\bar{X} = 4.93$ for $n = 500$.

3. Common random numbers: Law and Kelton [LaKe91, pp. 620-621] compare the
M/M/1 and M/M/2 queueing models with a utilization of $\rho = 0.9$ using the waiting
times in the queue of the first 100 customers. With $n = 100$ independent replications
of each system, they compare the two models in four ways:

• independent runs (I);

• arrival streams using common random numbers (A);

• service times using common random numbers (S);

• arrival streams and service times using common random numbers (A & S).

Common random numbers is a variance reduction technique that feeds identical
interarrival and/or service times into the two different queueing models to increase the
likelihood that observed differences in the waiting times are due to the system
configurations (M/M/1 versus M/M/2) rather than sampling error. The mean half-widths of the
confidence intervals ($\alpha = 0.10$) reported for their example are 0.70 (I), 0.49 (A), 0.49 (S),
and 0.04 (A & S).
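
A scaled-down version of such a comparison can be sketched in Python. This is an illustration only and does not reproduce the Law and Kelton experiment: for brevity it compares two single-server queues that differ in service rate (0.7 versus 0.9, chosen arbitrarily) instead of M/M/1 versus M/M/2, it measures time in system, and it uses a normal approximation in place of the $t$ fractile, which is reasonable only for a large number of replications. All function names are hypothetical.

```python
import math
import random
import statistics


def exp_variate(u, rate):
    """Inverse-transform exponential variate."""
    return -math.log(1.0 - u) / rate


def mean_time_in_system(arrival_us, service_us, lam, mu):
    """Average of D_n - A_n over one run of a FIFO single-server queue."""
    arrival = prev_departure = total = 0.0
    for ua, us in zip(arrival_us, service_us):
        arrival += exp_variate(ua, lam)
        begin = max(arrival, prev_departure)
        prev_departure = begin + exp_variate(us, mu)
        total += prev_departure - arrival
    return total / len(arrival_us)


def differences(mu_slow, mu_fast, lam=0.5, customers=100, reps=200,
                common=True, seed=7):
    """Per-replication difference in mean time in system between two configurations.

    With common=True both configurations are driven by the same uniforms;
    otherwise the second configuration gets fresh, independent uniforms.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        ua = [rng.random() for _ in range(customers)]
        us = [rng.random() for _ in range(customers)]
        slow = mean_time_in_system(ua, us, lam, mu_slow)
        if not common:
            ua = [rng.random() for _ in range(customers)]
            us = [rng.random() for _ in range(customers)]
        fast = mean_time_in_system(ua, us, lam, mu_fast)
        diffs.append(slow - fast)
    return diffs


def half_width(diffs, alpha=0.10):
    """Confidence-interval half-width using a normal approximation to the t fractile."""
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * statistics.stdev(diffs) / math.sqrt(len(diffs))
```

Calling `half_width(differences(0.7, 0.9, common=True))` and the same expression with `common=False` gives two half-widths for the estimated difference; the one based on common random numbers should be noticeably narrower, mirroring the qualitative pattern in the reported figures.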
7.9.3 SIMULATION LANGUAGES


This section considers the history and features of simulation programming languages 
developed over the years. 

Facts: 

1. The use of a general-purpose simulation programming language (SPL) expedites 
model development, input modeling, output analysis, and animation. In addition, SPLs 
have accelerated the use of simulation as an analysis tool by bringing down the cost of 
developing a simulation model. 

2. In a history of the development of SPLs from 1955 to 1986, Nance [Na93] defines
six requirements that an SPL must meet:

• random number generation; 

• variate generation; 

• list processing capabilities so that objects can be created, altered, and deleted; 

• statistical analysis routines; 

• summary report generators; 

• a timing executive or event calendar to model the passage of time. 

3. SPLs may take the form of:

• a set of subprograms in a general purpose language (GPL) such as Fortran or C
that can be called to meet these six requirements;

• a preprocessor that converts statements or symbols to lines of code in a GPL; 

• a conventional programming language. 

4. The following table shows a division of the historical record into five distinct periods,
including the names of several languages that came into existence in each period.

period       characteristics                   languages

1955-1960    period of search                  GSP
1961-1965    the advent                        CLP, CSL, DYNAMO, GASP, GPSS, MILITRAN,
                                               OPS, QUIKSCRIPT, SIMSCRIPT, SIMULA, SOL
1966-1970    formative period                  AS, BOSS, Q-GERT, SLANG, SPL
1971-1978    expansion period                  DRAFT, HOCUS, PBQ, SIMPL
1979-1986    consolidation and regeneration    INS, SIMAN, SLAM


5. The General Purpose System Simulator (GPSS) was first developed on various IBM
computers in the early 1960s. Algol-based SIMULA was also developed in the 1960s 
and had features that were ahead of its time. These included abstract data types, 
inheritance, the co-routine concept, and quasi-parallel execution. 

6. SIMSCRIPT was developed by the RAND Corporation with the purpose of decreas- 
ing model and program development times. SIMSCRIPT models are described in terms 
of entities, attributes, and sets. The syntax and program organization were influenced 
by Fortran. 
7. The Control and Simulation Language (CSL) takes an “activity scanning” approach
to language design, where the activity is the basic descriptive unit. 

8. The General Activity Simulation Program (GASP), in common with several other 
languages, used flow-chart symbols to bridge the gap between personnel unfamiliar 
with programming and programmers unfamiliar with the application area. Although 
originally written in Algol, GASP provided Fortran subroutines for list-processing ca- 
pabilities (e.g., queue insertion). 

9. GASP was a forerunner to both the Simulation Language for Alternative Modeling 
(SLAM) and SIMulation ANalysis (SIMAN) languages. 

10. SLAM was the first language to include three modeling perspectives in one
language: network (process orientation), discrete-event, and continuous (state variables).

11. SIMAN was the first major SPL executable on an IBM PC.

12. Simulation software in the 1990s has mushroomed, with numerous packages and
languages available both for general purpose and application-specific simulations. Spe- 
cial purpose and integrated packages are widespread and available on desktop comput- 
ers. The 1997 survey [SW97] compares 46 products, having a wide range of features 
and capabilities, and the 1997 review [1197] compares 65 products. 

13. A recent trend has been the addition of animation to intelligently view simulation
output. Surveys of web-based simulations can be found at the sites:

• http://ms.ie.org/websim/survey/survey.html

• http://www.cise.ufl.edu/~fishwick/websim.html

14. Software for carrying out Monte Carlo simulation can be found at the site:

• http://random.mat.sbg.ac.at/others/#MCSoftware

15. A number of commercial and freeware/shareware simulation packages are listed at
the sites:

• http://www.isye.gatech.edu/informs-sim/#ware

• http://www.isye.gatech.edu/informs-sim/comm.html

• http://ws3.atv.tuwien.ac.at/eurosim/
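
Of the six requirements listed in Fact 2, the timing executive or event calendar is the one that most distinguishes a simulation language from a general-purpose one. A minimal Python sketch of the idea follows; it is not modeled on any particular SPL named above, and the class `EventCalendar` and its methods are hypothetical names. Pending events are held in a priority queue keyed by event time, and the simulation clock jumps from one event to the next.

```python
import heapq
import itertools


class EventCalendar:
    """A minimal future-event list: events are executed in time order."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker for equal event times
        self.now = 0.0                       # current simulation clock

    def schedule(self, delay, action):
        """Schedule action(calendar) to occur `delay` time units from now."""
        heapq.heappush(self._heap, (self.now + delay, next(self._counter), action))

    def run(self, until=float("inf")):
        """Advance the clock event by event until the list is empty or `until` is passed."""
        while self._heap and self._heap[0][0] <= until:
            time, _, action = heapq.heappop(self._heap)
            self.now = time
            action(self)                     # an action may schedule further events
```

An arrival event would typically schedule the next arrival and, if the server is idle, a departure; the list-processing, variate-generation, and statistics requirements then attach to the actions themselves.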
INTRODUCTION

A graph is conceptually a spatial configuration with a finite set of points and a finite 
set of lines (possibly curved) joining one point to another (or to itself). Graph theory 
has its origins in many disciplines. Graphs are natural mathematical models of physical 
situations in which the points represent either objects or locations and the lines repre- 
sent connections. Graphs are also used to model sociological and abstract situations in 
which each line represents a relationship between the entities represented by the points. 
Applications of graphs are wide-ranging — in areas such as circuit design, communi- 
cations networks, ecology, engineering, operations research, counting, probability, set 
theory, information theory, and sociology. 

This chapter contains an extensive treatment of the various properties of graphs. 
Further topics in graph theory are covered in Chapter 9 Trees and in Chapter 10 Net- 
works and Flows. 


GLOSSARY 

acyclic digraph : a digraph containing no directed cycles. 
acyclic graph: a graph containing no cycles. 

adding a crosscap to a surface: an operation that increases the crosscap number 
of a nonorientable surface by 1. 

adding a handle to a surface: an operation that increases the genus of an orientable
surface by 1.
adjacency matrix (of a digraph): for a digraph $D$, the square matrix $A_D$ with
$A_D[i, j]$ = the number of edges from vertex $v_i$ to vertex $v_j$.

adjacency matrix (of a graph): for a graph $G$, the square matrix $A_G$ with $A_G[i, j]$ =
the number of edges between vertices $v_i$ and $v_j$.

adjacent edges : two edges with a common endpoint. 

adjacent vertex (in a digraph) from [to] a vertex u: a vertex v such that there is an 
arc from u to v [to u from v]. 

adjacent vertices : two vertices that are endpoints of the same edge. 

admittance matrix: given a graph $G$, the matrix $D - A$ where $D$ is the diagonal
matrix with the degree sequence of $G$ on the diagonal and where $A$ is the adjacency
matrix; synonym for Laplacian.

algebraic specification (of a graph): a form of specification that uses group elements 
(see Chapter 5) in the vertex and edge names and uses the group operation in the 
incidence rule; a highly condensed form of specification because a single entry can 
specify the endpoints of all the edges in a class as large as the size of the group. 

almost every (a.e.) graph has property P: the statement that the probability
that a random $n$-vertex graph has property $P$ approaches 1 as $n \to \infty$.

antichain : a hypergraph in which no edge contains any other edge. 

arc: another name for a directed edge of a graph. 

articulation point: synonym for cutpoint. 

attachment of a bridge of a subgraph: given a bridge $B$ of a subgraph $H$, a vertex
of $B \cap H$.

attribute (of the edge-set or vertex-set): any additional feature, such as length, cost, 
or color, that enables a graph to model a real problem. 

automorphism: given a graph or digraph, an isomorphism from the graph or digraph 
to itself. 

automorphism group : the collection Aut(G) of all automorphisms of a graph or 
digraph G under the operation of composition. 

basis (for a digraph): a set of vertices V' of the digraph such that every vertex not 
in V' is reachable from V' and no proper subset of V' has this property. 

bipartite : property of a graph that its vertices can be partitioned into two subsets, 
called the “parts”, so that no two vertices within the same part are adjacent. 

block: in a graph, a maximal nonseparable subgraph. 

bond: a minimal disconnecting set of edges. 

boundary (of a region of a graph imbedded in a surface): given a region $R$ of a graph $G$
imbedded in a surface, the subgraph of $G$ containing all vertices and edges incident
on $R$; it is denoted $\partial R$.

bouquet: a graph $B_n$ with one vertex and $n$ self-loops.

branch: synonym for arc (i.e., a directed edge).

bridge (edge): a cut-edge. 

bridge (of a subgraph H): a maximal connected subgraph in which no vertex of H 
has degree greater than one. 

cactus: a connected graph in which every block is either an edge or a cycle. 
cartesian product: given graphs $G$ and $H$, the graph $G \times H$ whose vertex set is the
cartesian product $V_G \times V_H$ and whose edge set is $(V_G \times E_H) \cup (E_G \times V_H)$.

caterpillar : a tree that contains a path such that every edge has one or both endpoints 
in that path. 

Cayley graph (or digraph): a graph that depicts a group with a prescribed set of 
generators; the vertices represent group elements, and the edges or arcs (often said 
to be “color-coded” for the generators) represent the product rule. 

cellular imbedding: an imbedding such that every region is equivalent to the interior
of the unit disk (the region $\{ (x, y) \mid x^2 + y^2 < 1 \}$ of the plane).

center: in a connected graph, the set of vertices of minimum eccentricity. 

chain: a simple hypergraph in which, given any pair of edges, one edge contains the 
other. 

characteristic polynomial (of a graph): the characteristic polynomial of its adja- 
cency matrix. 

characteristic value: See eigenvalue. 

characteristic vector: See eigenvector. 

chromatic index (of a graph or hypergraph): See edge chromatic number. 

chromatic number (of a graph): the minimum number $\chi(G)$ of colors needed to color
the vertices of a graph $G$ so that no vertex is adjacent to a vertex of the same color;
alternate notation cr(G).

chromatic number (of a hypergraph): the smallest number $\chi(H)$ of independent sets
required to partition the vertex set of $H$.

chromatic number (of a map): the minimum number $\chi(M)$ of colors needed to color
the regions of the map $M$ so that no color meets itself across an edge; alternate
notation cr(M).

chromatic number (of a surface): the largest map chromatic number $\chi(S)$ taken over
all maps on the surface $S$; alternate notation cr(S).

chromatically n-critical graph: an $n$-chromatic graph $G$ such that $\chi(G - e) = n - 1$
no matter what edge $e$ is removed.

circuit: synonym for a closed walk, a closed trail, or a cycle, depending on the context. 

clique (in a graph): in a graph G, a complete subgraph of G contained in no larger 
complete subgraph of G. 

clique (in a hypergraph): a simple hypergraph such that every pair of edges has 
nonempty intersection. 

clique number (of a graph): the number $\omega(G)$ of vertices of a largest clique in the
graph $G$.

clique number (of a hypergraph): the largest number $\omega(H)$ of edges of any partial
clique in the hypergraph $H$.

clique partition number: for a hypergraph H, the smallest number cp(H) of cliques 
required to partition the edge set. 

closed walk (trail or path): a walk, trail, or path whose origin and terminus are the 
same. 

n-colorable graph: a graph having a vertex coloring using at most n colors. 
n-colorable map: a map having a coloring using at most n colors.
comparability graph: a graph that admits a transitive orientation. 
complement (of a graph): See edge-complement. 

complete bipartite graph: a bipartite graph $K_{r,s}$ whose vertex set has two parts,
of sizes $r$ and $s$, respectively, such that every vertex in one part is adjacent to every
vertex in the other part.

complete graph: the simple graph $K_n$ with $n$ vertices in which every pair of vertices
is adjacent.

complete hypergraph: the simple $n$-vertex hypergraph $K_n^*$ in which every subset of
vertices is an edge.

complete multipartite (or k-partite) graph: a $k$-partite simple graph such that
every pair of vertices from different parts is joined by an edge. Such a graph is
denoted by $K_{n_1, \ldots, n_k}$, where $n_1, \ldots, n_k$ denote the sizes of the parts.

complete r-uniform hypergraph: the simple $n$-vertex hypergraph $K_n^r$ in which
every $r$-element subset is an edge.

complete set of invariants : a set of invariants that determine a graph or digraph 
up to isomorphism. 

component: given a graph, a maximal connected subgraph; the number of components
of a graph $G$ is denoted $\beta_0(G)$.

connected: property of a graph that each pair of vertices is joined by a path. 
connectivity: See vertex connectivity. 

contraction, elementary (of a graph): the operation of shrinking an edge to a point, 
so that its endpoints are merged, without otherwise changing the graph. 

contraction, elementary (of a simple graph): replacing two adjacent vertices u and v 
by one vertex adjacent to all other vertices to which u or v were adjacent. 

contraction: for a graph, the composition of a sequence of elementary contractions. 
converse: for a digraph, the digraph obtained by reversing the direction of every arc. 
crosscap: a subportion of a surface that forms a Möbius band.

crosscap number (of a nonorientable surface): for a nonorientable surface $S$, the
maximum number $\bar{\gamma}(S)$ of disjoint crosscaps one can find on the surface. The
nonorientable surface of crosscap number $k$ is denoted $N_k$.

crossing number: for a graph G, the minimum number v(G) of edge-crossings taken 
over all normalized planar drawings of G. 

cube graph: See hypercube graph. 

cut-edge: given a graph $G$, an edge $e$ such that $G - e$ has more components than $G$.

cut-vertex (or cutpoint): given a graph $G$, a vertex $v$ such that $G - v$ has more
components than $G$.

cycle: a closed path of positive length. See also k-cycle. 

cycle, directed: a closed directed walk in which all the vertices except the first and 
last are distinct. 

cycle graph: a graph C n with n vertices and n edges that form a simple circuit. 

cycle rank: given a connected graph $G$, the number $\beta_1(G)$ of edges in the complement
of a spanning tree for $G$, that is, $|E_G| - |V_G| + 1$.
DAG: an acronym for directed acyclic graph.

degree (of a vertex in a graph): given a vertex v, the number deg(v) of instances of v 
as an endpoint; that is, the number of proper edges incident on v plus twice the 
number of loops at v. 

degree (of a hypergraph vertex): given a vertex x, the number deg(x) of hypergraph 
edges containing x. 

degree sequence of a graph: the sequence of the degrees of its vertices, most often 
sorted into size order, ascending or descending. 

deleting an edge from a graph: given a graph $G$ and an edge $e$ of $G$, an operation
that results in the subgraph $G - e$, which contains all the vertices of $G$ and all edges
except $e$.

deleting a vertex from a graph: given a graph $G$ and a vertex $v$ of $G$, an operation
that results in the subgraph $G - v$, which contains all vertices of $G$ except $v$ and all
the edges of $G$ except those incident with $v$.

diameter: given a connected graph, the maximum distance between two of its vertices. 

diconnected digraph: See strongly connected digraph. 

digraph (or directed graph): a graph in which every edge is directed. 

dipole: the graph $D_n$ with two vertices and a multi-edge of multiplicity $n$ joining them.

directed cycle, path, trail, walk: See cycle, directed, etc.

directed graph: See digraph. 

direction (on an edge): a sense of forward progression from one end to the other, 
usually marked either by ordering its endpoints or by an arrowhead. 

disconnected (digraph): a digraph whose underlying graph is disconnected. 

disconnecting set of edges (in a connected graph): a set whose removal yields a 
nonconnected graph. 

disconnecting set of vertices (in a connected graph) : a set whose removal yields a 
nonconnected graph. 

distance (between two vertices of a connected graph): given two vertices $v$ and $w$, the
length $d(v, w)$ of a shortest path between them.

distance (between two vertices of a connected digraph): given two vertices v and w, 
the length d(v, w) of a shortest directed path between them. 

dodecahedral graph: the 1-skeleton of the dodecahedron, which is a 3-dimensional 
polyhedron whose 12 faces are all pentagons; this graph has 20 vertices, each of 
degree 3, and 30 edges. 

downset: a simple hypergraph in which every subset of every edge is also an edge of 
the hypergraph. 

dual graph imbedding: a new graph imbedding obtained by placing a dual vertex in
the interior of each existing (“primal”) region and by drawing a dual edge through
each existing (“primal”) edge connecting the dual vertices on its opposite sides.

dual (of a hypergraph): given a hypergraph H, the hypergraph H* whose incidence 
matrix is the transpose of the incidence matrix M(H). 

eccentricity (of a vertex): given a vertex v in a connected graph, the greatest distance 
from v to another vertex. 
edge: a line, either joining one vertex to another or joining a vertex to itself; an
element of the second constituent set of a graph. 

edge chromatic number (of a graph): given a graph $G$, the smallest number $n$ such
that $G$ is $n$-edge colorable, written $\chi_1(G)$ or ecr(G).

edge chromatic number (of a hypergraph): given a hypergraph H , the smallest 
number q(H) of matchings required to partition the edge set of H. 

n-edge colorable: property of a graph that it has an edge coloring using at most n 
colors. 

edge coloring: an assignment of colors to the edges of a graph so that adjacent edges 
receive different colors. See also n-edge colorable. 

edge connectivity: the cardinality $\kappa'(G)$ of the smallest disconnecting set of edges
in graph $G$. See also k-edge connected.

edge cut: See disconnecting set.

edge independence number: the cardinality $\alpha_1(G)$ of a largest independent set of
edges in graph $G$.

edge-complement: given a graph $G$, the graph $\bar{G}$ with the same vertex set as $G$, but
in which two vertices are adjacent if and only if they are not adjacent in $G$.

edge-deleted subgraph: any subgraph obtained from a graph by removing a single 
edge. 

edge-reconstructible graph: a graph which is uniquely determined by its collection 
of edge-deleted subgraphs. 

edge-reconstructible invariant: an invariant which is uniquely determined by the 
collection of edge-deleted subgraphs of a graph. 

eigenvalue (of a matrix): given a matrix $A$, a number $\lambda$ such that $Ax = \lambda x$ for some
vector $x \ne 0$.

eigenvector (of a matrix): given a matrix $A$, a nonzero vector $x$ such that $Ax = \lambda x$.

embedding: See imbedding. 

empty graph: sometimes, a graph with no edges; other times, a graph with no vertices 
or edges. See null graph. 

endpoints: the vertices that are joined by the edge. 

Euler characteristic: given a surface $S$, an invariant $\chi(S)$ of the surface itself, given
by the formula $\chi(S) = |V| - |E| + |F|$ where $V$, $E$, and $F$ are the vertices, edges
and faces of any cellular drawing of any graph on that surface; equivalently, $2 - 2g$
for the orientable surface $S_g$ of genus $g$, and $2 - k$ for the nonorientable surface $N_k$
of crosscap number $k$.

Euler tour: a closed Euler trail. 

Euler trail: a trail that contains all the edges of the graph. 

Eulerian graph: a graph that has an Euler tour.

exterior region: in a planar graph drawing, the region that extends to infinity. 

extremal graph: given a set $\mathcal{G}$ of graphs and an integer $n$, an $n$-vertex graph with
$\mathrm{ex}(\mathcal{G}; n)$ edges that contains no member of $\mathcal{G}$.

extremal number: given a set $\mathcal{G}$ of graphs, the greatest number $\mathrm{ex}(\mathcal{G}; n)$ of edges in
any $n$-vertex simple graph that does not contain any member of $\mathcal{G}$ as a subgraph.
face: given an imbedding of a graph in a surface, a region plus its boundary.

forest : any graph without cycles. 

four color theorem: the fact that every planar map can be properly colored with at 
most four colors, proved in 1976 by Haken and Appel after over a century of active 
investigation. 

general graph: another name for a graph that might have loops. 

generating set (for a group) : a subset of group elements such that every group element 
is a product of generators. (Note: the identity of a group is the empty product). 

genus (of an orientable surface): for a surface $S$, the maximum number $\gamma(S)$ of disjoint
handles one can find on the surface; or equivalently, the maximum number of disjoint
closed curves one can cut open without disconnecting the surface. The orientable
surface of genus $g$ is denoted $S_g$.

genus (of a graph) : the minimum genus of a surface in which the graph can be cellularly 
imbedded. 

girth : given a graph, the number of edges in a shortest cycle, if there is at least one 
cycle; undefined if the graph has no cycles. 

graph: a set $V$ of vertices and a set $E$ of edges such that all the endpoints of edges
in $E$ are contained in $V$, written $G = (V, E)$, $(V_G, E_G)$, or $(V(G), E(G))$.

graph model : any configuration with underlying graph structure, and possibly some 
additional attributes on its edges and/or vertices, such as length, direction, or cost. 

graph sum: given graphs $G$ and $H$, the graph $G + H$ whose vertex set and edge set
are the disjoint unions, respectively, of the vertex sets and edge sets of $G$ and $H$.

graphical sequence: a sequence of nonnegative integers such that there is a simple 
graph for which it is the degree sequence. 

Gray code: a cyclic ordering of all $2^k$ bitstrings of length $k$, such that each bitstring
differs from the next in exactly one bit entry.

Hamilton cycle: a spanning cycle, that is, a cycle including all the vertices of a graph. 

Hamiltonian graph: a graph that contains a Hamilton cycle.

Hamilton path: a path that includes all the vertices of a graph. 

head (of an arc): the vertex the arc goes to. 

Hoffman polynomial (of a graph): a polynomial $p(x)$ of minimum degree such that
$p(A) = J$, where $A$ is the adjacency matrix and $J$ is the matrix with every entry
equal to 1.

homeomorphic graphs: two graphs that can both be obtained from the same graph 
by a sequence of edge subdivisions. 

hypercube graph: a graph $Q_d$ whose $2^d$ vertices could be labeled bijectively with the
bitstrings of length $d$, so that two vertices are adjacent if and only if their labels
differ in exactly one bit.

hypergraph: a finite set V of “vertices” together with a finite collection E of “edges” 
(sometimes, “hyperedges”), which are arbitrary subsets of V, written H = (V,E). 

icosahedral graph: the 1-skeleton of the icosahedron, which is a 3-dimensional poly- 
hedron whose 20 faces are triangles; this graph has 12 vertices, each of degree 5, and 
30 edges. 
imbedding (of a graph in a surface): a drawing of the graph onto some surface so that
there are no edge-crossings; also embedding. 

incidence rule : any rule specifying the endpoints of every edge of a graph. 

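incidence matrix (of a digraph with no self-loops): given a digraph $D$ with no
self-loops, a matrix $M_I$ (or $M_{I:D}$) with

$$M_I[i, j] = \begin{cases} 0, & \text{if vertex } v_i \text{ is not an endpoint of arc } e_j \\ +1, & \text{if } v_i \text{ is the head of arc } e_j \\ -1, & \text{if } v_i \text{ is the tail of arc } e_j. \end{cases}$$

incidence matrix (of a graph): given a graph $G$, a matrix $M_I$ (or $M_{I:G}$) with

$$M_I[i, j] = \begin{cases} 0, & \text{if vertex } v_i \text{ is not an endpoint of edge } e_j \\ 1, & \text{if } e_j \text{ is a proper edge with endpoint } v_i \\ 2, & \text{if } e_j \text{ is a loop at } v_i. \end{cases}$$

incidence matrix (of a hypergraph): given a hypergraph $H = (V, E)$ with $E =
\{ e_1, e_2, \ldots, e_m \}$ and $V = \{ x_1, x_2, \ldots, x_n \}$, the matrix $[m_{ij}]$ where

$$m_{ij} = \begin{cases} 1, & \text{if } x_j \in e_i \\ 0, & \text{otherwise.} \end{cases}$$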

incident edge (from [to] a digraph vertex): given a vertex u in a digraph, a directed 
edge e such that u is the tail [head] of e. 

incident edge (in a graph): given a vertex u in a graph, an edge e such that u is an 
endpoint of e. 

incident-edge table (for a graph): a table that lists, for each vertex, the edges having 
that vertex as an endpoint. 

in-degree: given a vertex v, the number of arcs with head v. 

independent subset (in a graph): given a graph G, a subset of either V(G) or E(G) 
such that no two elements are adjacent in G. 

independent subset (of hypergraph vertices): a set of vertices which does not (com- 
pletely) contain any edge of the hypergraph. 

independence number (of a graph): the number $\alpha(G)$ of vertices in the largest
independent subset in $G$.

independence number (of a hypergraph): the maximum number $\alpha(H)$ of vertices
that form an independent set in $H$.

induced subgraph (on a vertex subset): the subgraph of a graph G containing every 
edge of G that joins two vertices of the prescribed vertex subset. 
invariant : a parameter or property of graphs that is preserved by isomorphisms. 

intersection graph (for a family of subsets): given a family T — {S'.,} of subsets of a 
set S, the graph whose vertex set is T and such that there is an edge between each 
pair of subsets S, and Sj whose intersection is nonempty. 

intersection graph (of a hypergraph): given a hypergraph El, the simple graph 1 (H) 
whose vertices are the edges of H, such that two vertices of 1 (H) are adjacent if and 
only if the corresponding edges of H have nonempty intersection. 

interval graph: the intersection graph of a family of subintervals of [0, 1]. 

irreducible tournament: a tournament with no bipartition $V_1, V_2$ of the vertices such 
that all arcs between $V_1$ and $V_2$ go from $V_1$ to $V_2$. 

isolated point: a vertex of a graph that is not the endpoint of any edge. 

isomorphic (pair of graphs): a pair of graphs with identical mathematical structure; 
formally, a pair of graphs such that there is an isomorphism from one to the other. 

isomorphism (of digraphs): an isomorphism of the underlying graphs of two digraphs 
such that the edge correspondence preserves direction. 

isomorphism (of graphs): given graphs G and H, a pair of bijections $f_V: V_G \to V_H$ 
and $f_E: E_G \to E_H$ such that for every edge $e \in E_G$, the endpoints of e are mapped 
onto the endpoints of $f_E(e)$; $f$ is usually used for both the vertex function $f_V$ and 
the edge function $f_E$. 

isomorphism (of simple graphs): a one-to-one correspondence between the vertices of 
two graphs such that a pair of vertices are adjacent in one graph if and only if the 
corresponding pair of vertices are adjacent in the other graph. 

isomorphism type: given a graph [digraph] G, the class of all graphs [digraphs] 
isomorphic to G. 

join: given graphs G and H, the graph $G * H$ obtained by adding to the disjoint union 
$G + H$ an edge from each vertex in G to each vertex in H. 

k-connected: property of a graph G that the smallest size of a disconnecting set of 
vertices is at least k; that is, $\kappa(G) \ge k$. 

k-cycle: a cycle of length k. 

k-edge connected: property of a graph G that $\kappa'(G) \ge k$. 

k-partite graph: a graph whose vertex set can be partitioned into at most k parts in 
such a way that each edge joins different parts, never the same part. Equivalent to 
a k-colorable graph. 

k-regular: property of a graph or hypergraph that all its vertices have degree k. 

king: a vertex in a digraph that can reach all other vertices by paths of length 1 or 2. 

Klein bottle: the nonorientable surface $N_2$ with two crosscaps. 

Kuratowski graphs: the complete graph $K_5$ and the complete bipartite graph $K_{3,3}$. 

labeled graph: in applied graph theory, any graph in which the vertices and/or edges 
have been assigned labels; in pure graph theory, a graph in which standard labels 
$v_1, v_2, \ldots, v_n$ have been assigned to the vertices. 

Laplacian (of a graph): given a graph G, the matrix $D - A$ where D is the diagonal 
matrix with the degree sequence of G on the diagonal and where A is the adjacency 
matrix. 

length: given a walk, the number of edge-steps in the sequence that specifies the walk. 

line: synonym for edge, or refers to what is modeled by an edge. 

line graph: given a graph G, the graph $L(G)$ whose vertices correspond to the edges 
of G, with two vertices being adjacent in $L(G)$ whenever the corresponding edges 
have a common endpoint in G. 

linear extension ordering: a consecutive labeling $v_1, v_2, \ldots, v_n$ of the vertices of a 
digraph such that, if there is an arc from $v_i$ to $v_j$, then $i < j$. 

link: See proper edge. 

loop (or self-loop): an edge joining a vertex to itself. 

map: an imbedding of some graph on a surface. 

map chromatic number: See chromatic number of a map. 

map coloring: an assignment of colors to the regions of a map so that adjacent regions 
receive different colors. 

mapping: given graphs G and H, a vertex function $f: V_G \to V_H$ and an edge function 
$f: E_G \to E_H$ that correspond together to a continuous function from a spatial 
representation of G in Euclidean space to a spatial representation of H. 

matching: a set of pairwise disjoint edges in a graph or hypergraph. 

matching number: in a graph, the maximum number of pairwise disjoint edges of 
the graph; in a hypergraph H, the maximum number $\nu(H)$ of pairwise disjoint edges 
of H, that is, the cardinality of the largest partial of H which forms a matching. 

mesh (of trees): a graph obtained by construing each row and each column of a $2^d \times 2^d$ 
array of vertices as the leaves of a complete binary tree. 

minor: given a graph G, any graph that can be obtained from G by a sequence of edge 
deletions and contractions. 

Möbius band: the surface obtained from a rectangular sheet by pasting the left side 
to the right with a half-twist. 

multi-arc: two or more arcs, all of which have the same head and all of which have 
the same tail. 

multi-edge: a set of at least two edges, all of which have the same endpoints. 

multi-graph: a graph with multi-edges. 

neighbor: given a vertex v, any vertex adjacent to v. 

node: a vertex, or refers to what is modeled by a vertex. 

nonorientable surface: a surface such that some subportion forms a Möbius band. 

nonorientable surface of crosscap number k: the surface $N_k$ obtained by adding k 
crosscaps to a sphere. 

nonplanar: property of a graph that it cannot be drawn in the plane without crossings. 

nonseparable: property of a connected graph that it has no cut-vertices. 

normal: property of a hypergraph H that $q(H) = \Delta(H)$. 

normalized drawing: the usual way a graph is drawn, avoiding pathological con- 
trivances such as overloaded crossings (i.e., more than two edges). 

null graph: a graph with no vertices or edges. 

obstruction to n-coloring: synonym for chromatically (n + 1)-critical graph, since 
a chromatically (n + 1)-critical subgraph prevents n-chromaticity. 

octahedral graph: the 1-skeleton of the 3-dimensional octahedron, or sometimes, a 
generalization of this graph. 

1-skeleton (of a polyhedron): the graph whose vertices and edges are, respectively, 
the vertices and edges of that polyhedron. 

open: property of a walk, trail, or path that its final vertex is different from its initial 
vertex. 

order (of a graph): given a graph G, the cardinality $|V_G|$ of the vertex set. 

order (of a hypergraph edge): the number of vertices in the edge. 

orientable surface: any surface obtainable from a sphere by adding handles, or 
(alternatively) any surface that does not contain a Möbius band. 

orientable surface of genus g: the surface $S_g$ obtained by attaching g handles to a 
sphere. 

orientation: an assignment of a direction to every edge of a graph, making it a digraph. 

origin (of a walk): the initial vertex of the walk. 

out-degree: given a vertex v, the number of arcs with tail v. 

partial: for a hypergraph $H = (V, E)$, a hypergraph $H' = (V', E')$ such that $E' \subseteq E$. 

path: a trail in which all of its vertices are different, except that the initial and final 
vertices may be the same. See also u-v-path. 

path, directed: a directed trail in which no vertex is repeated. 

perfect graph: a graph such that every induced subgraph has vertex chromatic num- 
ber equal to its clique number. 

permutation graph: a graph whose vertices represent the objects of a permutation 
group and whose edges represent the action of a generating set of permutations. 

Petersen graph: a 3-regular 10-vertex graph that looks like a 5-cycle joined by its 
vertices to the vertices of a 5-pointed star drawn in its interior. 

planar: property of a graph that it can be drawn in the plane without crossings. 

Platonic graph: the 1-skeleton of a Platonic solid. 

Platonic solid: any of five 3-dimensional polyhedra whose sides are all identical reg- 
ular polygons. 

polyhedron: a generalization of a polygon to higher dimensions; usually a solid 3- 
dimensional figure subtended by planes. 

product graph: See cartesian product. 

projective plane: the nonorientable surface $N_1$ with one crosscap. 

proper edge (or link): an edge with two distinct endpoints. 

pseudo-graph: synonym for a graph with loops. 

quotient: given a graph G , any graph H such that there exists a graph mapping of G 
onto H . 

r-partite hypergraph: an r-uniform hypergraph whose vertex set can be partitioned 
into r blocks so that each edge intersects each block in exactly one vertex. 

r-uniform: property of a uniform hypergraph that r is the common edge-order. 

radius: for a connected graph G, the minimum eccentricity among the vertices of G. 

Ramsey number (classical): the number $r(m, n)$, which is the smallest positive 
integer k such that every simple graph with k vertices either contains $K_m$ as a 
subgraph or has a set of n independent vertices. 

Ramsey number: the number $R(G, H)$, which is the smallest positive integer k such 
that, if the edges of $K_k$ are bipartitioned into red and blue classes, then either the 
red subgraph contains a copy of G or else the blue subgraph contains a copy of H. 

random graph on n vertices: an n-vertex graph generated by a probability distri- 
bution, in which each edge is as likely to occur as any of the others. 

reachable vertex (from vertex u): a vertex v such that there is a u,v-path in G. 

reconstructible: property of a graph that it is uniquely determined by its collection 
of vertex-deleted subgraphs. 

reconstructible invariant: an invariant which is uniquely determined by the collection 
of vertex-deleted subgraphs of a graph. 

reducible: property of a digraph that its vertex set V can be partitioned into a disjoint 
union $V_1 \cup V_2$ so that all arcs joining $V_1$ and $V_2$ go from $V_1$ to $V_2$. 

region: given a graph imbedded in a surface, a maximal expanse of surface containing 
no vertex and no part of any edge of the graph; i.e., any of the pieces of surface 
subtended by the graph. 

regular: property of a graph or hypergraph that all its vertices have the same degree. 
See also k-regular. 

representation (of a graph): a description of the graph, possibly without names for 
the vertices and edges. 

rotation system (of an imbedding): a list of the cyclic orderings of the incidence of 
edges at each vertex. 

Schreier graph: a graph that depicts the cosets of a prescribed subgroup of a group, 
with a prescribed set of generators; the vertices represent cosets, and the edges (often 
said to be “color-coded” for the generators) represent the product rule. 

self-complementary: property of a graph that it is isomorphic to its edge-complement. 

self-loop: an edge that joins a vertex to itself; see loop. 

simple digraph: See strict digraph. 

simple (or simplicial) graph: a graph with no loops or multi-edges. 

simple hypergraph: a hypergraph with no repeated edges. 

sink: a digraph vertex with out-degree zero. 

source: a digraph vertex with in-degree zero. 

spanning subgraph: a subgraph of a given graph G that includes all vertices of G. 

specification (of a graph): a list of its vertices and a list of its edges, with an unam- 
biguous incidence rule for determining the endpoints of every edge. 

spectrum (of a graph): the spectrum of its adjacency matrix. 

strict: property of a digraph that it has no self-loops and no pair of arcs with the same 
tail and head. 

strong component: in a digraph, a maximal subdigraph that is strongly connected. 

strong orientation: given a graph, an assignment of a direction to every edge, making 
it a strongly connected digraph. 

strong tournament: a tournament in which there is a directed path from every vertex 
to every other vertex. 

strongly connected: property of a digraph that every vertex is reachable from every 
other vertex. 

strongly regular graph (with parameters (n, k, r, s)): an n-vertex, k-regular graph 
in which every adjacent pair of vertices is mutually adjacent to r other vertices, and 
in which every pair of nonadjacent vertices is mutually adjacent to s other vertices; 
by convention, strongly regular graphs are connected with at least one edge. 

subdivision (of an edge): the operation of inserting a new vertex into the interior of 
the edge, thereby splitting it into two edges. 

subdivision: given a graph, any new graph obtained by subdividing one or more 
edges of the original graph one or more times. 

subgraph: given a graph G, a graph H whose vertices and edges are all in G. 

tail (of an arc): the vertex the arc goes from. 

terminus (of a walk): the last vertex of the walk. 

tetrahedral graph: another name for the complete graph $K_4$, resulting from the 
fact that it is equivalent to the 1-skeleton of the 4-sided Platonic solid called a 
tetrahedron. 

thickness: given a graph G, the minimum number $\theta(G)$ of planar subgraphs whose 
union is G. 

topological sort (or topsort): an algorithm that assigns a linear extension ordering 
to a DAG (not quite a sort, in the usual sense of sorting, and not used by topologists, 
despite the name). 

torus: the surface of a doughnut; the orientable surface $S_1$ of genus 1. 

tough graph: a connected graph G such that for every nonempty set S of vertices, 
the number of components of the graph $G - S$ does not exceed $|S|$. 

tournament: a digraph with exactly one arc between each pair of distinct vertices. 

trail: a walk in which no edge occurs more than once. 

trail, directed: a directed walk in which no arc is repeated. 

transitive: property of a digraph that whenever it contains an arc from u to v and an 
arc from v to w , it also contains an arc from u to w. 

transitive orientation: given a graph, an assignment of a direction to every edge, 
making it a transitive digraph. 

transmitter: in a digraph, a vertex that has an arc to every other vertex. 

transversal: in a hypergraph, a set of vertices which has nonempty intersection with 
every edge of the hypergraph. 

transversal number: the minimum number t(H) of vertices taken over all transver- 
sals of H. 

tree: a connected graph without any cycles as subgraphs. 

trivial graph: the graph with one vertex and no edges. 

Turán graph: the n-vertex k-partite simple graph $T_k(n)$ with the maximum number 
of edges. 

u,v-path: a path whose origin is the vertex u and whose terminus is the vertex v. 

underlying graph: given a digraph, the graph obtained from the digraph by stripping 
the directions off all the arcs. 

uniform: property of a hypergraph that all edges have the same number of vertices. 
See also r-uniform. 

unilaterally connected (or unilateral): property of a digraph that for every pair of 
vertices u, v, there is either a u,v-path or a v,u-path. 

upset: a simple hypergraph in which every superset of every edge is also an edge of 
the hypergraph. 

valence: a synonym for degree, adapted from molecular bonds in chemistry. 

vertex: a point; an element of the first constituent set of a graph. 

vertex coloring: an assignment of colors to the vertices of a graph so that adjacent 
vertices receive different colors. 

(vertex) connectivity: the smallest number $\kappa(G)$ of vertices whose removal 
disconnects the graph; by convention, $\kappa(K_n) = n - 1$. 

vertex cut: See disconnecting set. 

vertex-deleted subgraph: any subgraph obtained from a graph by removing a single 
vertex and all of its incident edges. 

vertex invariant: a property at a vertex which is preserved by every isomorphism. 

walk: an alternating sequence $v_0, e_1, v_1, \ldots, e_r, v_r$ of vertices and edges where 
consecutive edges are adjacent, so that each edge $e_i$ joins vertices $v_{i-1}$ and $v_i$. 

walk, directed: an alternating sequence of vertices and arcs $v_0, e_1, v_1, e_2, \ldots, e_n, v_n$ 
where the arcs align head to tail, so that each vertex is the head of the preceding 
arc and the tail of the subsequent arc. 

weakly connected (or weak) digraph: a digraph whose underlying graph is con- 
nected. 

weighted graph: a graph model in which each edge is assigned a number called the 
weight or the cost. 

wheel graph: an (n + 1)-vertex graph $W_n$ that “looks like” a wheel whose rim is an 
n-cycle and whose hub vertex is joined by spokes to all the vertices on the rim. 


8.1 INTRODUCTION TO GRAPHS 

Graphs are highly adaptable mathematical structures, and they can be represented on 
a computer so that with each new application that arises, existing algorithms can be 
reused without rewriting. This section provides some of the basic terminology and 
operations needed for the study of graphs and lists several useful families of graphs. 


8.1.1 VARIETIES OF GRAPHS AND GRAPH MODELS 

Due to the vast breadth of the usefulness of graphs, the terminology varies widely, not 
only from one graph variety to another, but also from one application to another. The 
table in Fact 1 gives synonyms for several terms. 


Definitions: 

A vertex is usually conceptualized as a point. Abstractly, it is a member of the first 
of two sets that form a graph. 

An edge is usually conceptualized as a line, either joining one vertex to another or 
joining a vertex to itself. Abstractly, it is a member of the second of two sets that form 
a graph. 

A proper edge (or link) is an edge that joins one vertex to another. 

A loop (or self-loop) is an edge joining a vertex to itself. 

The endpoints of an edge are the vertices that it joins. A loop has only one endpoint. 
An incidence rule specifies the endpoints of every edge. 

An edge e is incident with a vertex v if v is an endpoint of e. 

A graph is a set V of vertices and a set E of edges (both sets finite unless declared 
otherwise) such that all the endpoints of edges in E are contained in V. It is often 
denoted $G = (V, E)$, or $(V_G, E_G)$, or $(V(G), E(G))$. Sometimes, each edge is regarded 
as a pair of vertices. 

An isolated point of a graph G = (V, E) is a vertex in V that is not the endpoint of 
any edge in E. 

Vertices u and v are adjacent if there is an edge whose endpoints are u and v. 

Two edges are adjacent if they have a common endpoint. 

A neighbor of a vertex is any vertex to which it is adjacent. 

An attribute of the edge-set or vertex-set is a feature such as length, cost, or color 
sometimes attached to graphs. 

A graph model is a graph which (quite frequently, in applications) may have addi- 
tional attributes on its edges and/or vertices. The vertices and edges of the model may 
represent arbitrary objects and relationships from the context of the application. 

A weighted graph is a graph model in which each edge is assigned a number called 
the weight or the cost. 

A node is sometimes a synonym for a vertex and sometimes refers to whatever is 
modeled by a vertex in a graph model. 

A line is sometimes a synonym for an edge and sometimes refers to whatever is modeled 
by an edge in a graph model. 

A multi-edge in an undirected graph is a set of more than one edge with the same 
endpoints, and in a digraph a set of more than one edge such that each edge in the set 
has the same head and each edge in the set has the same tail. A graph with a multi-edge 
is also said to have multiple edges or parallel edges. 

A multi-edge of multiplicity n is a set of n edges with the same endpoints. 

A set of parallel edges is a set of edges with the same endpoints, i.e., a multi-edge. A 
pair of anti-parallel arcs is a pair of oppositely directed arcs between the same two 
endpoints. 

A graph is simple if it has no loops or multi-edges. Topologists often say simplicial, 
because such a graph is a special case of a “simplicial complex”. 

A multi-graph is another name for a graph with multi-edges but no self-loops, used 
for emphasis when the context is largely restricted to simple graphs. 

A pseudo-graph (or general graph) is another name for a graph in which loops and 
multi-edges are permitted, used for emphasis when the context is largely restricted to 
loopless graphs. 

A direction on an edge is an ordering for its endpoints so that the edge goes from one 
endpoint to the other. Any edge, including a self-loop, can be directed by giving 
it a sense of forward progression, e.g., in a graph drawing, by placing an arrowhead to 
show which way is forward, or by ordering the endpoints. 

An arc is another name for a directed edge. 

The tail of an arc is the vertex at which the arc originates. 

The head of an arc is the vertex at which the arc terminates. 

A directed graph or digraph is a graph in which every edge is directed. (See §8.3.) 

A strict digraph has no self-loops and no pair of arcs with the same tail and head. 

The degree of vertex v, deg(v), is the number of proper edges plus twice the number 
of loops incident with v. Thus, in a drawing, it is the number of edge-endings at v. 

The valence of a vertex is a synonym for degree adapted from terminology in chemistry. 

The degree sequence of a graph is the sequence of the degrees of its vertices, most 
often sorted into size order, ascending or descending. 

A regular graph is a graph such that all vertices have the same degree. It is called 
k-regular if the vertices all have degree k. 

A graphical sequence is a sequence of nonnegative integers that is the degree sequence 
of some simple graph. 

The number of vertices of a graph is sometimes called the order. 

The number of edges of a graph is sometimes called the size. 

The empty graph is the graph whose vertex set and edge set are both empty. This is 
also called the null graph (and is sometimes not considered to be a graph). 

The trivial graph is the graph with one vertex and no edges. 
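
The distinctions among simple graphs, multi-graphs, and pseudo-graphs can be checked mechanically from an incidence rule. The following is a minimal sketch, assuming (hypothetically) that a graph is stored as a dictionary from edge names to endpoint pairs, with a loop listing its single endpoint twice; the function name `classify` is illustrative and not part of the handbook.

```python
from collections import Counter

def classify(edges):
    """Name the variety of an undirected graph given its incidence rule.

    `edges` maps an edge name to a pair (u, w) of endpoints; a loop has u == w.
    This encoding is an assumption made for the sketch.
    """
    has_loop = any(u == w for u, w in edges.values())
    # Count how many edges share each unordered set of endpoints.
    multiplicity = Counter(frozenset(ends) for ends in edges.values())
    has_multi_edge = any(m > 1 for m in multiplicity.values())
    if has_loop:
        return "pseudo-graph"      # loops (and possibly multi-edges) present
    if has_multi_edge:
        return "multi-graph"       # multi-edges but no loops
    return "simple graph"          # neither loops nor multi-edges
```

Checking loops first mirrors the definitions above: a multi-graph excludes self-loops, whereas a pseudo-graph permits both loops and multi-edges.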

Facts: 

1. The following table gives lists of some synonymous graph theory terms: 

vertex: point, node 
edge: line 
loop: self-loop 
neighbor: adjacent vertex 
arc: directed edge 
degree: valence 
number of vertices: order 
number of edges: size 

nonsimple graph: pseudograph, general graph 
loopless nonsimple graph: multi-graph 
empty graph: null graph 


2. The following table lists the varieties of graphs: 

graph variety      loops allowed?    multi-edges allowed? 
digraph            YES               YES 
general graph      YES               YES 
multi-graph        NO                YES 
pseudo-graph       YES               YES 
simple graph       NO                NO 
strict digraph     NO                NO* 

*at most one arc in each direction between two vertices 

3. In a drawing of a graph, the degree of a vertex v equals the number of edge-ends at v. 
The degree of v need not equal the number of edges at v, since each loop contributes 
both its ends toward the degree. 

4. Euler's theorem: In every graph, the sum of the degrees equals twice the number of 
edges. From this result it follows that in all graphs the sum of the degrees of all vertices 
is even. 

5. In every graph the number of vertices of odd degree is even. 

6. The name handshaking lemma is commonly applied to various elementary results 
about the degrees of simple graphs, especially Facts 4 and 5. 

7. In every simple graph with at least two vertices, there is a pair of vertices with the 
same degree. 

8. Havel's theorem: A sequence is a graphical sequence if and only if the sequence 
obtained by deleting the largest entry d and subtracting 1 from each of the d next largest 
entries is graphical (V. Havel, 1955). This operation on a sequence is called Havel’s 
reduction. 

9. A nonincreasing sequence of nonnegative integers $d_1, d_2, \ldots, d_n$ is graphical if and 
only if its sum is even, and for $k = 1, 2, \ldots, n$, 

$$\sum_{i=1}^{k} d_i \le k(k-1) + \sum_{i=k+1}^{n} \min\{k, d_i\}.$$ 

(P. Erdős and T. Gallai, 1960) 

10. In a computer, a graph is commonly represented as a structure with variable value. 
The empty graph is often used as the initial value of a graph variable, analogous to the 
way in which zero is used as the initial value of a numeric variable. 
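
Facts 3 to 5 can be illustrated with a small computation. The sketch below assumes a dictionary-based incidence rule (an illustrative encoding, not one prescribed by the handbook) and a hypothetical graph with seven edges, one loop, and one multi-edge of multiplicity 3; its incidences are invented for the illustration and are not taken from any figure in this section.

```python
# Incidence rule: edge name -> pair of endpoints (a loop repeats its endpoint).
# The particular graph is hypothetical, chosen only to exercise loops and multi-edges.
vertices = ["v1", "v2", "v3", "v4", "v5"]
edges = {
    "e1": ("v1", "v2"),
    "e2": ("v1", "v3"),
    "e3": ("v1", "v4"),
    "e4": ("v2", "v3"),   # e4, e5, e6 form a multi-edge
    "e5": ("v2", "v3"),
    "e6": ("v2", "v3"),
    "e7": ("v4", "v4"),   # a loop
}

def degrees(vertices, edges):
    """Return a dict vertex -> degree, counting each loop twice (Fact 3)."""
    deg = {v: 0 for v in vertices}   # every vertex starts with no edge-ends
    for u, w in edges.values():
        deg[u] += 1
        deg[w] += 1                  # when u == w this is the loop's second end
    return deg

deg = degrees(vertices, edges)
degree_sequence = sorted(deg.values())
odd_vertices = [v for v, d in deg.items() if d % 2 == 1]
```

Here `sum(deg.values())` equals `2 * len(edges)`, as Euler's theorem requires, and `odd_vertices` has even length, as Fact 5 requires. Starting `deg` from all zeros plays the same role as the empty initial value mentioned in Fact 10.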

Examples: 

1. The following figure gives examples of the various varieties of graphs. (See the table 
of Fact 2.) 



[Figure: one example each of a graph, a simple graph, a multigraph, a digraph, and a strict digraph.] 

2. Computer programming flowchart (always a digraph): Each vertex represents some 
programmed operation or decision, and each arc represents the flow of control to the 
next operation or decision. 



3. Model for social networks (usually undirected): Each vertex represents a person 
in the network, and each edge represents a form of interaction between the persons 
represented by its endpoints. This is illustrated by the following graph. 

[Figure: social network graph; the surviving vertex labels include Anthony and Margaret.] 

4. Model for road networks (most edges undirected): Each vertex represents either an 
intersection of two roads or the end of a dead-end street. The absence of an endpoint in 
the illustration indicates that the road continues beyond what is shown. Direction on 
an edge may be used to indicate a one-way road. Undirected edges are two-way roads. 



5. In the following graph with vertex set $V = \{v_1, v_2, v_3, v_4, v_5\}$ and edge set 
$E = \{e_1, e_2, e_3, e_4, e_5, e_6, e_7\}$, the vertex $v_5$ is an isolated point, and the degree sequence is 
(0, 3, 3, 4, 4). The edge $e_7$ is a loop, and the three edges $e_4$, $e_5$, and $e_6$ form a multi-edge. 

6. Deleting the isolated vertex and the self-loop in Example 5 and then reducing the 
multi-edge to a single edge yields the following simple graph, whose degree sequence is 
(1, 2, 2, 3). 

[Figure: the resulting simple graph.] 



7. One possible choice of edge directions for the graph of Example 5 yields this digraph. 

[Figure: the resulting digraph.] 



8. The sequence (1, 2, 2, 3, 4, 5) is not graphical, by Euler’s theorem (Fact 4), because 
its sum is odd. 

9. Havel’s reduction (Fact 8) of the sequence (2, 2, 2, 3, 4, 5) is (1, 1, 1, 2, 3). Havel’s 
reduction of that sequence is (0, 0, 1, 1). Since (0, 0, 1, 1) is the degree sequence of a 
graph with four vertices, two of which are isolated and two of which are joined by an 
edge, it follows from Havel’s theorem that the sequence (2, 2, 2, 3, 4, 5) is graphical. 
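
Examples 8 and 9 can be reproduced programmatically. The following is a minimal sketch of Havel's reduction (Fact 8) and of the Erdős–Gallai test (Fact 9); the function names are illustrative, and the choice to return `None` when a reduction step is impossible is an assumption of this sketch.

```python
def havel_reduction(seq):
    """Apply one step of Havel's reduction; return None if the step is impossible."""
    s = sorted(seq, reverse=True)
    d, rest = s[0], s[1:]
    if d > len(rest):
        return None                       # not enough remaining entries to reduce
    reduced = [x - 1 for x in rest[:d]] + rest[d:]
    if any(x < 0 for x in reduced):
        return None                       # a degree would become negative
    return sorted(reduced)

def is_graphical_havel(seq):
    """Repeat Havel's reduction until only zeros remain or a step fails."""
    s = list(seq)
    while s and max(s) > 0:
        s = havel_reduction(s)
        if s is None:
            return False
    return True

def is_graphical_erdos_gallai(seq):
    """Test the parity condition and the inequality of Fact 9 for k = 1..n."""
    d = sorted(seq, reverse=True)
    if sum(d) % 2 == 1:
        return False
    for k in range(1, len(d) + 1):
        if sum(d[:k]) > k * (k - 1) + sum(min(k, x) for x in d[k:]):
            return False
    return True
```

Applied to (2, 2, 2, 3, 4, 5), `havel_reduction` yields (1, 1, 1, 2, 3) and then (0, 0, 1, 1), matching Example 9, and both tests reject (1, 2, 2, 3, 4, 5), matching Example 8.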


8.1.2 GRAPH OPERATIONS 
Definitions: 

A subgraph of a graph $G = (V_G, E_G)$ is a graph $H = (V_H, E_H)$ whose vertex set and 
edge set are subsets of $V_G$ and $E_G$, respectively, such that for each edge e in $E_H$, the 
endpoints of e (as they occur in G) are in $V_H$. 

A spanning subgraph of a graph G is a subgraph that contains all the vertices of G. 

The induced subgraph on a vertex subset $S \subseteq V_G$ of a graph G is the subgraph 
whose vertex set is S and whose edge set contains every edge whose endpoints are in S. 

An induced subgraph of G is a subgraph H such that every edge of G that joins two 
vertices of H is also an edge of H. 

Deleting an edge e from a graph G results in the subgraph $G - e$ that contains all 
the vertices of G and all the edges of G except for e. 

Deleting a subset Y of edges from a graph G results in the subgraph $G - Y$ that 
contains all the vertices of G and all the edges of G except for those in Y. 

Deleting a vertex v from a graph G results in the subgraph $G - v$ that contains all 
the vertices of G except v and all the edges of G except those incident with v. 

Deleting a subset S of vertices from a graph G results in the subgraph $G - S$ that 
contains all the vertices of G except those in S and all the edges of G except those 
incident with vertices in S. 

Contracting an edge e in a graph G means shrinking the edge to a point, so that its 
endpoints are merged, without changing the rest of the graph. The resulting graph is 
denoted $G \downarrow e$ (or $G \cdot e$ or $G/e$). To construct $G \downarrow e$ from G, delete the edge e from the 
edge set and replace all instances of its endpoints in the vertex set and incidence rule 
by a new vertex. 

A minor of a graph G is any graph that can be obtained from G by a sequence of edge 
deletions and contractions. 

The graph union $G \cup H$ has as its vertices and edges those vertices and edges, 
respectively, that are either in G or in H. 

The graph intersection $G \cap H$ has as its vertices and edges those vertices and edges, 
respectively, that are both in G and in H. 

The graph sum (or disjoint union ) G + H has as its vertices and edges, respectively, 
the disjoint union of the vertex sets and the disjoint union of the edge sets of the 
graphs G and H. 

The iterated graph sum nG is the union of n disjoint copies of G. 

The join G * H is obtained by adding to G + H an edge from each vertex in G to each 
vertex in H . 

The (cartesian) product $G \times H$ has as its vertices the cartesian product $V_G \times V_H$ and 
as its edges this union of two products: $(V_G \times E_H) \cup (E_G \times V_H)$. The endpoints of the 
edge $(u, d)$ are the vertices $(u, x)$ and $(u, y)$, where x and y are the endpoints of d in H. 
The endpoints of the edge $(e, w)$ are $(u, w)$ and $(v, w)$, where u and v are the endpoints 
of e. 

An isomorphism $f: G \to H$ (of graphs) is a relationship between graphs that establishes 
their structural equivalence. It is given by a pair of set bijections $f_V: V_G \to V_H$ and 
$f_E: E_G \to E_H$ such that if u and v are the endpoints of edge e in graph G, then $f_V(u)$ 
and $f_V(v)$ are the endpoints of $f_E(e)$ in graph H. The vertex function and the edge 
function can both be denoted $f$ without the subscript. (See §8.5.) 

Two graphs are isomorphic if there is an isomorphism between them. This means that 
they are essentially the same graph except for the names of their vertices and edges. 

A graph mapping $f: G \to H$ (of graphs) is a pair of set functions $f_V: V_G \to V_H$ and 
$f_E: E_G \to E_H$ such that if u and v are the endpoints of edge e in G, then $f(u)$ and $f(v)$ 
are the endpoints of $f(e)$ in H. Such a pair of functions is said to preserve incidence. 
The vertex function and the edge function can both be denoted $f$ without the subscript. 

A quotient of a graph G is a graph H such that there is a graph mapping from G 
onto H . 

An automorphism of a graph G is an isomorphism of G to itself. 

The automorphism group Aut(G) is the group of all automorphisms of graph G. 

Subdivision of an edge e is the operation of inserting a new vertex in the interior of 
an edge. Combinatorially, this is achieved by joining a new vertex to the endpoints of 
edge e and then deleting e. 

Two graphs are homeomorphic if there is a graph from which they can both be 
obtained by a sequence of edge subdivisions. 

The edge-complement $\overline{G}$ of a simple graph G (often, complement) has the same 
vertex set as G, with every two distinct vertices being adjacent in $\overline{G}$ if and only if they 
are not adjacent in G. 

A self-complementary graph is a graph that is isomorphic to its complement. 

The line graph L(G) of a graph G has vertices corresponding to the edges of G, with 
two vertices being adjacent in L(G) whenever the corresponding edges are adjacent in G. 
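
Several of the operations defined above are straightforward to compute for simple graphs. The sketch below assumes a simple graph is stored as a pair `(V, E)` with `E` a set of two-element `frozenset`s; this encoding and all function names are illustrative choices, not notation from the handbook. The graph sum tags vertices with 0 or 1 to force the two vertex sets to be disjoint.

```python
from itertools import combinations

def complete(n):
    V = set(range(n))
    return V, {frozenset(p) for p in combinations(V, 2)}

def path(n):
    return set(range(n)), {frozenset((i, i + 1)) for i in range(n - 1)}

def cycle(n):
    # Simple only for n >= 3.
    return set(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def line_graph(G):
    V, E = G
    # Two edges of G are adjacent when they share an endpoint.
    return set(E), {frozenset((e, f)) for e, f in combinations(E, 2) if e & f}

def graph_sum(G, H):
    (VG, EG), (VH, EH) = G, H
    V = {(0, u) for u in VG} | {(1, x) for x in VH}
    E = {frozenset((0, u) for u in e) for e in EG}
    E |= {frozenset((1, x) for x in e) for e in EH}
    return V, E

def join(G, H):
    V, E = graph_sum(G, H)
    return V, E | {frozenset(((0, u), (1, x))) for u in G[0] for x in H[0]}

def product(G, H):
    (VG, EG), (VH, EH) = G, H
    V = {(u, x) for u in VG for x in VH}
    # Edges from V_G x E_H, then edges from E_G x V_H.
    E = {frozenset(((u, x), (u, y))) for u in VG for x, y in map(tuple, EH)}
    E |= {frozenset(((u, w), (v, w))) for u, v in map(tuple, EG) for w in VH}
    return V, E
```

For instance, `product(cycle(4), complete(2))` has 8 vertices and 12 edges, and `join(complete(2), path(3))` has 5 vertices and 9 edges.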

Facts: 

1. If a graph J is isomorphic to a subgraph of a graph G, then it is commonly said 
that J “is” a subgraph of G, even though $V_J$ and $E_J$ might not be subsets of $V_G$ and $E_G$, 
respectively. 

2. A graph is a subgraph of its union with any other graph. 

3. The intersection of two graphs is a subgraph of both of them. 

4. A graph mapping is the combinatorial counterpart of what is topologically a con- 
tinuous function from one graph to the other. 

5. A graph isomorphism is a graph mapping such that both the vertex function and 
the edge function are bijections. 

6. There is a self-complementary graph of order n if and only if $n \equiv 0$ or $1 \pmod 4$. 

7. The automorphism group $\mathrm{Aut}(G)$ of any simple graph is isomorphic to the 
automorphism group $\mathrm{Aut}(\overline{G})$ of its edge-complement. 

8. A connected graph G is isomorphic to its line graph if and only if G is a cycle 
(§8.1.3). 

9. If two connected graphs have isomorphic line graphs, then they are either isomorphic 
to each other or they are $K_3$ and $K_{1,3}$ (§8.1.3). 

10. $\mathrm{Aut}(K_n)$ is isomorphic to the symmetric group $S_n$ (§5.3.1). 

11. $\mathrm{Aut}(C_n)$ is isomorphic to the dihedral group $D_n$ (§5.3.2). 
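
For very small simple graphs, isomorphisms and automorphisms can be enumerated by brute force over all vertex bijections. The sketch below is only practical for a handful of vertices and is not an efficient isomorphism algorithm; the `(V, E)` encoding with `frozenset` edges and the function names are assumptions of the sketch.

```python
from itertools import combinations, permutations

def complete(n):
    V = set(range(n))
    return V, {frozenset(p) for p in combinations(V, 2)}

def cycle(n):
    return set(range(n)), {frozenset((i, (i + 1) % n)) for i in range(n)}

def complement(G):
    V, E = G
    return V, {frozenset(p) for p in combinations(V, 2)} - E

def isomorphisms(G, H):
    """Yield every vertex bijection G -> H that maps the edge set of G onto that of H."""
    (VG, EG), (VH, EH) = G, H
    if len(VG) != len(VH) or len(EG) != len(EH):
        return
    source = sorted(VG)
    for image in permutations(sorted(VH)):
        f = dict(zip(source, image))
        # f is a bijection, so the image has |EG| edges; equality means adjacency is preserved.
        if {frozenset(f[v] for v in e) for e in EG} == EH:
            yield f

def automorphism_count(G):
    return sum(1 for _ in isomorphisms(G, G))
```

Under these assumptions, `automorphism_count(complete(4))` is 24, the order of the symmetric group on four letters, and `automorphism_count(cycle(5))` is 10, the order of the dihedral group of the pentagon. A graph and its complement also give equal counts, consistent with Fact 7.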


Examples: 

1. The dark subgraph spans the following graph, because it contains every vertex. 

[Figure: a graph with a spanning subgraph drawn dark.] 

2. The cartesian product $C_4 \times K_2$ (§8.1.3) is illustrated as follows: 

[Figure: the cartesian product $C_4 \times K_2$.] 

3. The join $K_2 * P_3$ (§8.1.3) is illustrated as follows: 

4. The following two graphs are homeomorphic, but not isomorphic. 



5. The graphs $K_{3,3}$ and $K_3 + K_3$ (§8.1.3) are edge-complements of each other. 

6. The cycle graph $C_5$ (§8.1.3) is self-complementary. 

7. The line graph $L(K_4)$ (§8.1.3) is isomorphic to the octahedral graph $O_3$. 


8.1.3 SPECIAL GRAPHS AND GRAPH FAMILIES 
Definitions: 

Note: Many of the following graphs are drawn in Figures 1 and 2. 

A graph is null (sometimes, empty ) if both its vertex set and edge set are empty. 

The bouquet of n loops, denoted $B_n$, is a graph with one vertex and n self-loops. 

The dipole $D_n$ is the graph with two vertices and a multi-edge of multiplicity n joining 
them. 

The complete graph $K_n$ is a simple graph with n vertices in which every pair of 
vertices is adjacent. 

The n-path $P_n$ is a graph that “looks like” a path n − 1 edges long. It consists of a 
sequence of n vertices $V = \{v_1, v_2, \ldots, v_n\}$ and a sequence of n − 1 edges joining successive 
vertices in the sequence, that is, the n − 1 edges are $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{n-1}, v_n\}$. 

A path is a graph that is an n-path for some n > 0. 

The n-cycle $C_n$ is a graph that “looks like” a cycle. It has a “wraparound” sequence 
of vertices $V = \{v_1, v_2, \ldots, v_n\}$ and a sequence of edges joining successive vertices in 
the sequence, i.e., the n edges are $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{n-1}, v_n\}, \{v_n, v_1\}$. 

A cycle is a graph that is an n-cycle for some n > 0. 

The n-wheel is the join of $K_1$ and the n-cycle. 

A graph is bipartite if its vertices can be partitioned into two subsets (the parts, or 
partite sets) so that no two vertices in the same part are adjacent. 

The complete bipartite graph $K_{r,s}$ is the simple bipartite graph in which the two 
parts have respective cardinalities r and s, such that every vertex in one part is adjacent 
to every vertex in the other part. 

The complete r-partite graph $K_{n_1,n_2,\ldots,n_r}$ has r disjoint subsets of vertices of orders 
$n_1, n_2, \ldots, n_r$, with two vertices adjacent if and only if they lie in different subsets. If 
the r sets all have t vertices, this graph is sometimes denoted $K_{r(t)}$. 

The d-dimensional hypercube graph $Q_d$ is a graph with $2^d$ vertices that can be 
labeled with the $2^d$ bitstrings of length d so that two vertices are adjacent if and only 
if their labels differ in exactly one bit. 
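
The bitstring definition translates directly into a construction. The following is a minimal sketch, assuming vertices are stored as Python strings of '0' and '1'; the function name `hypercube` is chosen here for illustration.

```python
from itertools import product

def hypercube(d):
    """Return (vertices, edges) of Q_d; vertices are the bitstrings of length d."""
    vertices = ["".join(bits) for bits in product("01", repeat=d)]
    edges = set()
    for v in vertices:
        for i in range(d):
            # Flip bit i to obtain a neighbor that differs in exactly one position.
            flipped = v[:i] + ("1" if v[i] == "0" else "0") + v[i + 1:]
            edges.add(frozenset((v, flipped)))
    return vertices, edges

q3_vertices, q3_edges = hypercube(3)
```

Every edge joins a string with an even number of 1s to a string with an odd number of 1s, so the parity of the label gives a bipartition of $Q_d$.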

A graph G is connected if for each pair of vertices in G, there is a path in G that 
contains them both. 

A tree is a connected graph without any cycles as subgraphs. (See Chapter 9.) 

A forest is a graph without any cycles as subgraphs. 

The Kuratowski graphs are the graphs $K_5$ and $K_{3,3}$. 

The Petersen graph is the graph constructed from two disjoint 5-cycles $u_0, u_1, u_2, u_3, u_4$ 
and $v_0, v_1, v_2, v_3, v_4$ by adding an edge from $u_j$ to $v_{2j \bmod 5}$, for j = 0, 1, 2, 3, 4. 
It is usually drawn to look like a 5-pointed star inside a pentagon, so that each point of 
the star is joined to a corner of the pentagon. (See Figure 3.) 
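
The construction from two 5-cycles can be written out as follows. This is a sketch only: vertices are represented as pairs such as ("u", 3), a representation assumed here for convenience.

```python
def petersen():
    """Build the Petersen graph from two 5-cycles joined by the arcs u_j -- v_(2j mod 5)."""
    outer = [("u", j) for j in range(5)]
    inner = [("v", j) for j in range(5)]
    edges = set()
    for j in range(5):
        edges.add(frozenset((("u", j), ("u", (j + 1) % 5))))   # first 5-cycle
        edges.add(frozenset((("v", j), ("v", (j + 1) % 5))))   # second 5-cycle
        edges.add(frozenset((("u", j), ("v", (2 * j) % 5))))   # u_j to v_(2j mod 5)
    return outer + inner, edges

vertices, edges = petersen()
degree = {v: sum(1 for e in edges if v in e) for v in vertices}
```

The result has 10 vertices and 15 edges, and every vertex has degree 3.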

A polyhedron is the generalization of a polygon to higher dimensions. Whereas a 
polygon is the intersection in $\mathbb{R}^2$ of several half-planes, an n-dimensional polyhedron is 
the intersection in $\mathbb{R}^n$ of several half-spaces of dimension n. 

A Platonic solid is a regular 3-dimensional polyhedron. 

The 1-skeleton of a polyhedron is the graph that has as its vertices and edges the 
vertices and edges, respectively, of that polyhedron. 

The tetrahedral graph is the 1-skeleton of the 4-sided Platonic solid called a tetrahe- 
dron. The faces of the tetrahedron are triangles. 

The cube graph is the 1-skeleton of the 6-sided Platonic solid called a cube. The faces 
of the cube are squares. 

The octahedral graph is the 1-skeleton of the 8-sided Platonic solid called an octahe- 
dron. The faces of the octahedron are triangles. 

The generalized octahedral graph $O_n$ is the graph that can be obtained from the 
complete graph $K_{2n}$ by removing n mutually nonadjacent edges. 

The dodecahedral graph is the 1-skeleton of the 12-sided Platonic solid called a 
dodecahedron. It has 20 vertices, each of degree 3, and 30 edges. The dodecahedron 
has 12 regular pentagons as its faces. 

The icosahedral graph is the 1-skeleton of the 20-sided Platonic solid called an icosa- 
hedron. It has 12 vertices, each of degree 5, and 30 edges. The icosahedron has 20 
equilateral triangles as its faces. 

A Platonic graph is any graph isomorphic to the 1-skeleton of any Platonic solid. 

The intersection graph for a collection $F = \{S_j\}$ of subsets of the same set has as 
its nodes the subsets themselves. There is an edge between each pair of subsets whose 
intersection is not empty. 

An interval graph is any graph isomorphic to the intersection graph for a collection 
of intervals of the real line. 
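
As a small illustration of the intersection-graph definition, the sketch below builds the interval graph of a few closed intervals. The interval data and the function name `interval_graph` are hypothetical.

```python
from itertools import combinations

def interval_graph(intervals):
    """Intersection graph of closed intervals given as a dict name -> (left, right)."""
    edges = set()
    for (a, (a_lo, a_hi)), (b, (b_lo, b_hi)) in combinations(intervals.items(), 2):
        # Two closed intervals intersect exactly when neither lies strictly to one side.
        if a_lo <= b_hi and b_lo <= a_hi:
            edges.add(frozenset((a, b)))
    return edges

intervals = {"A": (0, 3), "B": (2, 5), "C": (4, 7), "D": (8, 9)}
edges = interval_graph(intervals)
```

Here A meets B and B meets C, while D is an isolated vertex, so the interval graph is a 3-vertex path together with one isolated vertex.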

Facts: 

1. In a computer program, the null graph is used as the initial value of a graph-valued 
variable, rather like the way that an integer-valued variable is initialized to zero. As the 
program runs, the graph-valued variable can be modified by adding vertices and edges. 

2. Bouquets and dipoles are fundamental building blocks for graphs constructed by 
topological techniques. 

3. The path graph P n is a tree. 

4. A graph is bipartite if and only if it has no cycles of odd length. 

5. Trees are bipartite. 

6. The hypercube graphs $Q_n$ can be defined recursively as follows: $Q_0 = K_1$, 
$Q_n = K_2 \times Q_{n-1}$ for n > 0. 

7. The hypercube graph $Q_n$ is bipartite and is isomorphic to the lattice of subsets of 
a set of n elements. (See §13.2.) 

8. The octahedral graphs $O_n$ can be defined recursively as follows: $O_1 = \overline{K_2}$, 
$O_n = \overline{K_2} * O_{n-1}$ for n > 1. 

9. There are exactly five Platonic solids: the tetrahedron, the cube, the octahedron, the 
dodecahedron, and the icosahedron. Their 1-skeletons are $K_4$, $Q_3$, $O_3$, the dodecahedral 
graph, and the icosahedral graph. 

Examples: 

1. Figure 1 shows some of the classes of graphs that occur most often in general con- 
structions. 

2. Figure 2 shows some of the graphs that occur most often as special examples. 


8.1.4 GRAPH REPRESENTATION AND COMPUTATION 


To apply a computer to graph-theoretic computations, it is necessary to specify the 
underlying graph completely and without ambiguity. Programming system designers 
use specifications that are efficient for practical computation and that can be reused in 
additional applications. 

Definitions: 

A specification of a graph is a list of its vertices, a list of its edges, and an unambiguous 
description of the incidence rule for determining the endpoints of every edge. 

An endpoint table for a graph is a tabular description of the incidence rule, that gives 
the endpoints of every edge. In a digraph or partially directed graph, the tail and head 
of each directed edge are distinguished, for instance, by marking the head, or by always 
giving the tail first. 

An incident-edge table for a graph is a tabular description of the incidence rule, that 
gives for each vertex v, a list of the edges having v as an endpoint. If the graph is 
directed, this list is partitioned into two sublists, according to whether v is tail or head. 

A representation of a graph G is a graph description, such as a drawing, from which 
a formal specification could be constructed and labeled with the vertex names and edge 
names from G, so as to obtain a graph that conforms to the incidence rule for G. 

The incidence matrix of a graph (without loops) G with vertices $v_1, v_2, \ldots, v_n$ and 
edges $e_1, e_2, \ldots, e_m$ is the n × m matrix $M_I$ (or $M_{I,G}$, in the context of more than one 
graph) with 

$$M_I[i,j] = \begin{cases} 1 & \text{if } v_i \text{ is an endpoint of } e_j \\ 0 & \text{otherwise.} \end{cases}$$

(Sometimes an incidence matrix is written with $M_I[i,j] = 1$ for a loop, even though 
this violates the properties that every column-sum equals 2 and every row-sum equals 
the degree of the corresponding vertex.) 

The adjacency matrix of a loopless graph G with vertices $v_1, v_2, \ldots, v_n$ and edges 
$e_1, e_2, \ldots, e_m$ is the n × n matrix $A_G$ with 

$A_G[i,j]$ = the number of edges between $v_i$ and $v_j$ if $i \ne j$. 

If there are self-loops, then $A_G[i,i]$ is usually defined to be the number of loops at $v_i$. 
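
Both matrices can be derived mechanically from an endpoint table. The sketch below assumes a loopless graph stored as a dict from edge names to endpoint pairs; the graph itself (three vertices, with a double edge) is a hypothetical example, not one of the Handbook's figures.

```python
def incidence_matrix(vertices, endpoints):
    """Rows are vertices, columns are edges; entry 1 when the vertex is an endpoint."""
    edge_names = list(endpoints)
    return [[1 if v in endpoints[e] else 0 for e in edge_names] for v in vertices]

def adjacency_matrix(vertices, endpoints):
    """Entry [i][j] counts the edges between vertex i and vertex j (loopless graph)."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v in endpoints.values():
        matrix[index[u]][index[v]] += 1
        matrix[index[v]][index[u]] += 1
    return matrix

vertices = ["v1", "v2", "v3"]
endpoints = {"e1": ("v1", "v2"), "e2": ("v1", "v2"), "e3": ("v2", "v3")}
M = incidence_matrix(vertices, endpoints)
A = adjacency_matrix(vertices, endpoints)
```

In this example every column of M sums to 2, and each row-sum of M (and of A) equals the degree of the corresponding vertex, namely 2, 3, and 1.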

Figure 1 Some fundamental infinite classes of graphs. 

A Cayley graph [Cayley digraph] for a group A with generating set $x_1, \ldots, x_r$ has 
as its vertices the elements of the group. For each group element b and each prescribed 
generator $x_j$, there is an edge between the vertices b and $bx_j$ [from b to $bx_j$]. 

A Schreier graph [Schreier digraph] for a group A with generating set $x_1, \ldots, x_r$ 
and subgroup B has as its vertices the cosets of B in A. For each coset Bb and each 
prescribed generator $x_j$, there is an edge between the vertices Bb and $Bbx_j$ [from Bb to 
$Bbx_j$]. 

A permutation graph [permutation digraph] for a permutation group Π with generating 
permutations $\pi_1, \ldots, \pi_r$ and object set B has as its vertices the object set B. For 
each object b and each prescribed generator permutation $\pi_j$, there is an edge between 
the vertices b and $\pi_j(b)$ [from b to $\pi_j(b)$]. 

An algebraic specification of a graph is a generalization of Cayley graphs and per- 
mutation graphs. It uses elements of a group or the objects of a permutation group 
as all or part of the names of the vertices and edges. It uses the group operation or 
permutation action in the incidence rule. 
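
To make the idea of an algebraic incidence rule concrete, here is a minimal sketch of a Cayley graph for the cyclic group $Z_n$ written additively, so that the product $bx_j$ becomes (b + x) mod n. The function name and the edge-triple format are assumptions made for this illustration.

```python
def cayley_graph_zn(n, generators):
    """Undirected Cayley graph of Z_n: b is joined to (b + x) mod n for each generator x."""
    vertices = list(range(n))
    edges = []                      # a list, so parallel edges would remain visible
    for x in generators:
        for b in vertices:
            edges.append((x, b, (b + x) % n))   # (generator, endpoint, endpoint)
    return vertices, edges

vertices, edges = cayley_graph_zn(5, [1, 2])
neighbors = {b: set() for b in vertices}
for _, b, c in edges:
    neighbors[b].add(c)
    neighbors[c].add(b)
```

With generators 1 and 2 in $Z_5$, the rule produces 10 edges and every vertex is adjacent to the other four, so two generators replace a ten-row endpoint table.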

A voltage graph is a form of algebraic specification in which the vertices and edges are 
specified as a set of one or more symbols with subscripts, ranging over group elements 
or permuted objects. Its usual form is a digraph drawing with vertex labels and edge 
labels. 

A normalized drawing of a graph represents each vertex as a distinct point in the 
plane and each edge as a possibly curved line between endpoints, obeying the following 
rules: 

• the interior of an edge may not cross through any vertex; 

• at most two edges cross at any point of the plane; 

• two edges cross each other at most once; 

• each edge crossing is normal, not a tangency. 

A complete set of operations on graphs is a set from which all other operations can 
be constructed. 

The operations in a complete set are primitive if none can be derived from the other 
operations in the set. 

A graph computation package is a computer software system that represents graphs 
and includes a complete set of operations. 

A display operation in a graph computation package manipulates the appearance of 
a graph image on a computer screen or in a drawing. 


Facts: 

1. Despite the redundancy of information, an incident-edge table is often used in com- 
bination with an endpoint table in computer software, because it facilitates the use of 
fast searching techniques at the cost of relatively little space. 

2. If a graph is simple, then its edges can be represented as endpoint pairs uv. Thus, 
the graph can be specified as a list of endpoint pairs and a list of isolated vertices. 

3. If a graph is simple, then its incident-edge table can be represented as a table that 
gives the list of neighbors of every vertex. 

4. If $A = A_G$ is the adjacency matrix of graph G, then the (i, j)-entry of $A^k$ is the 
number of walks (§8.4.1) of length k from $v_i$ to $v_j$ in G. 

5. Matrix-tree theorem (Kirchhoff, 1847): Let G be a graph, and let A be its adjacency 
matrix and D the diagonal matrix of the degrees of its vertices. Then the value of 
every cofactor of D − A equals the number of spanning trees of G. (Gustav R. Kirchhoff, 
1824-1887) 

6. Given the incidence matrix $M_{I,G}$, it is possible to obtain the incidence matrix $M_{I,H}$ 
for a subgraph H of G by deleting all rows and columns corresponding to vertices and 
edges, respectively, that are not in the subgraph. 

7. Algebraic specification is useful when the graph is highly symmetric. It replaces an 
arbitrarily large table of endpoints by a concise algebraic rule. 

8. Algebraic specification can be used to specify the graph model for a parallel archi- 
tecture for a computer. 

9. Every regular graph of even degree can be specified as a Schreier graph or as a 
permutation graph. (J. Gross, 1977) 

10. The graph specified by a voltage graph drawing is topologically a covering space 
of the voltage graph. (J. Gross, 1974) Thus, its relationship to the voltage graph is 
exactly like the relationship of a Riemann surface to the complex plane. 

11. The most commonly used complete set of operations is adding a vertex, deleting a 
vertex, adding an edge, and deleting an edge. 

12. Graph computation packages are built into mathematical computation systems 
such as Maple and Mathematica. 

13. Graph computation packages often include display operations. 
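
Facts 4 and 5 lend themselves to a quick numerical check. The sketch below uses only the Python standard library, with exact rational arithmetic for the determinant; the helper names and the choice of $K_4$ as test graph are assumptions made for illustration.

```python
from fractions import Fraction

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def mat_power(a, k):
    n = len(a)
    result = [[int(i == j) for j in range(n)] for i in range(n)]   # identity = A^0
    for _ in range(k):
        result = mat_mul(result, a)
    return result

def determinant(m):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in m]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return det

def spanning_tree_count(a):
    """(1,1)-cofactor of D - A: delete the first row and column, take the determinant."""
    n = len(a)
    laplacian = [[(sum(a[i]) if i == j else 0) - a[i][j] for j in range(n)] for i in range(n)]
    minor = [row[1:] for row in laplacian[1:]]
    return int(determinant(minor))

# K_4, the complete graph on four vertices (a hypothetical test case).
k4 = [[0 if i == j else 1 for j in range(4)] for i in range(4)]
walks3 = mat_power(k4, 3)
trees = spanning_tree_count(k4)
```

For $K_4$ the cube of the adjacency matrix has 6 on the diagonal and 7 elsewhere, and the cofactor equals 16, the number of spanning trees of $K_4$.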


Examples: 

1. The following normalized drawing, endpoint table, incident-edge table, incidence 
matrix and adjacency matrix all specify the same graph G. 

2. The following normalized drawing, list of endpoint pairs, lists-of-neighbors table, 
incidence matrix and adjacency matrix all specify the same simple graph H. It is a 
spanning subgraph of the graph G of Example 1, but it is not an induced subgraph. 
Compare the incidence matrix to Example 1 to see how the rows and columns are 
deleted. 

3. Squaring and cubing the adjacency matrix of Example 2 provides an illustration of 
Fact 4. 

For instance, the three walks of length 3 from $v_4$ to $v_3$ are as follows: 

$\langle v_4, e_3, v_2, e_3, v_4, e_4, v_3 \rangle$, $\langle v_4, e_4, v_3, e_4, v_4, e_4, v_3 \rangle$, $\langle v_4, e_4, v_3, e_2, v_2, e_2, v_3 \rangle$. 


4. As an illustration of the Kirchhoff matrix-tree theorem of Fact 5, observe that the 
graph of Example 2 has the following three spanning trees. 



The value of the (2,2)-cofactor of the matrix D − A is also equal to 3: 

5. (a) The following normalized graph drawing and algebraic specification using the 
group $Z_5$ of integers mod 5 both specify the same graph: 

algebraic specification using group $Z_5$ with generators x = 1 and y = 2: 

vertex set $V = \{ v_j \mid j \in Z_5 \}$; 

edge set $E = \{ x_j \text{ joining } v_j \text{ and } v_{(j+1) \bmod 5},\ y_j \text{ joining } v_j \text{ and } v_{(j+2) \bmod 5} \mid j \in Z_5 \}$. 

(b) The following voltage graph (see [GrTu87]) provides a highly compact visual form 
of algebraic specification of the same graph as in part (a). 



6. The Petersen graph has the following algebraic specification: 

$V = \{ u_j, v_j \mid j = 0, 1, 2, 3, 4 \}$; 

$E = \{ x_j\,(u_j, u_{(j+1) \bmod 5}),\ y_j\,(v_j, v_{(j+2) \bmod 5}),\ z_j\,(u_j, v_j) \mid j = 0, 1, 2, 3, 4 \}$. 

With appropriate labeling, the Petersen graph (left) corresponds to the following voltage 
graph specification (right). 



7. The Petersen graph as depicted in Example 6 is also an example of a permutation 
graph on the object set $V = \{ u_j, v_j \mid j = 0, 1, 2, 3, 4 \}$ with the two permutations: 

$(u_0, u_1, u_2, u_3, u_4)(v_0, v_2, v_4, v_1, v_3)$ and $(u_0, v_0)(u_1, v_1)(u_2, v_2)(u_3, v_3)(u_4, v_4)$. 


8.2 GRAPH MODELS 

Modeling with graphs is one of the main ways in which discrete mathematics has 
been applied to real world problems. This section gives a list of some of the ways 
in which graphs are used as mathematical models. Further information can be found in 
[HaNoCa65] and [Ro76]. 

8.2.1 ATTRIBUTES OF A GRAPH MODEL 


Definitions: 

A mathematical representation of a physical or behavioral phenomenon is a corre- 
spondence between the parts and processes of that phenomenon and a mathematical 
system of objects and functions. 

A model of a physical or behavioral phenomenon is the mathematical object or function 
assigned to that phenomenon under a mathematical representation. 

Modeling is the mathematical activity of designing models and comprehensive math- 
ematical representations of physical and behavioral phenomena. 

A graph model is a mathematical representation that involves a graph. 

Examples: 

1. Table 1 gives many examples of graph models. Each example states what the vertices 
and edges (or arcs) represent and where in the Handbook details on the application can 
be found. 


8.3 DIRECTED GRAPHS 

Assigning directions to the edges of a graph greatly enhances modeling capability, and is 
natural whenever order is important, e.g., in a hierarchical structure or a one-way road 
system. Also, any graph may be viewed as a digraph, by replacing each edge with two 
directed edges, one in each direction. Many graph problems are best solved as special 
cases of digraph problems, for instance, finding shortest paths, maximum flows, and 
connectivity. 


8.3.1 DIGRAPH MODELS AND REPRESENTATIONS 

Most graph terminology applies equally well to digraphs, e.g., subgraph, self-loop, bi- 
partite, isomorphic, empty. The definitions below are special to digraphs or take on 
a somewhat different meaning for digraphs. In context, where it is clear that only di- 
graphs are being discussed, “directedness” is often an implicit attribute of an “edge”, 
“path”, and other terms. 

Definitions: 

A directed graph , or digraph, consists of: 

• a set V , whose elements are called vertices, 

• a set E , whose elements are called directed edges or arcs, and 

• an incidence function that assigns to each edge a tail and a head. 

The tail of an arc is the vertex it leaves, and the head is the vertex it enters. 

A strict digraph has no self-loops or multi-arcs. 

The underlying graph of a digraph is the graph obtained from the digraph by 
replacing every directed edge by an undirected edge. 

Table 1 Directory of graph models. 

subject area                    vertex attributes and meaning             reference
and application                 edge/arc attributes and meaning

computer programming            vertex labels are program steps           §8.1.1
  flowcharts                    edge directions show flow

social organization             vertices are persons                      §8.1.1
  social networks               edges represent interactions

civil engineering               vertices are road intersections           §8.1.1,
  road networks                 edges are roads                           §8.3.1

operations research             vertices are activities                   §8.3.1
  scheduling                    arcs show operational precedence

sociology                       vertices are individuals                  §8.3.1
  hierarchical dominance        arcs show who reports to whom

computer programming            vertices are subprograms                  §8.3.1
  subprogram calling diagram    arcs show calling direction

ecology                         vertices are species                      §8.3.1
  food webs                     arcs show who eats whom

operations research             vertices are activities to be scheduled   §8.3.1,
  scheduling                    edges are activity conflicts              §8.6.1

genealogy                       vertices are family members               §8.3.1
  “family trees”                arcs show parenthood

set theory                      vertices are elements                     §8.3.1
  binary relations              arcs show relatedness

probabilistic analysis          vertices are process states               §8.3.2
  Markov models                 edges are state transitions

traffic control                 vertices are intersections                §8.3.3
  assigning one-way streets     edges are streets

partially ordered sets          vertices are elements                     §8.3.4
  Hasse diagrams                arcs show covering relation

computer engineering            vertices are computational nodes          §8.4.2
  communications networks       arcs are communications links

operations research             vertices are supply and demand nodes      §8.4.2
  transportation networks       arcs are supply lines

walking tours                   vertices are land masses                  §8.4.3
  Seven Bridges of Königsberg   edges are bridges

postal delivery routing         vertices are street intersections         §8.4.3
  Chinese Postman Problem       edges are streets

information theory              vertices are binary strings               §8.4.4
  Gray codes                    edges are single-bit changes

radio broadcasting              vertices are broadcast stations           §8.6.1
  assignment of frequencies     edges are potential interference

chemistry                       vertices are chemicals                    §8.6.1
  preventing explosions         edges are co-combustibility

cartography                     regions are countries                     §8.6.4
  map-coloring                  edges are borders

highway construction            vertices are road intersections           §8.7.1
  avoiding overcrossings        edges are roads

electrical network boards       vertices are circuit components           §8.7.1
  avoiding insulation           edges are wires

VLSI computer chips             vertices are circuit components           §8.7.4
  minimizing layering           edges are wires

information management          vertices are data records                 §17.1.4
  binary search trees           edges are decisions

computer operating systems      vertices are prioritized jobs             §17.1.5
  priority trees                edges are priority relations

physical chemistry              vertices are atoms                        §9.3.2
  counting isomers              edges are molecular bonds

network optimization            edges are connections                     §10.1.1
  min-cost spanning trees       edge-labels are costs

bipartite matching              parts are people and jobs                 §10.2.2
  personnel assignment          edges are job-capabilities

network optimization            vertices are locations                    §10.3.1
  shortest path                 edge-labels are distances

traveling salesman routing      vertices are locations                    §10.7.1
  shortest complete tour        edge-labels are distances



The out-degree of vertex v, denoted $\delta^+(v)$, is the number of arcs with tail at v. 

The in-degree of vertex v, denoted $\delta^-(v)$, is the number of arcs with head at v. 

A digraph D is transitive if whenever it contains an arc from u to v and an arc from v 
to w, it also contains an arc from u to w. 

The adjacency matrix $A_D$ of a digraph D is 

$A_D = [a_{ij}]$, where $a_{ij}$ = number of arcs from $v_i$ to $v_j$. 

The incidence matrix $M_D$ of a digraph D with no self-loops is $M_D = [b_{ij}]$, where 

$$b_{ij} = \begin{cases} +1 & \text{if } v_i \text{ is the tail of } e_j \text{ but not the head} \\ -1 & \text{if } v_i \text{ is the head of } e_j \text{ but not the tail} \\ 0 & \text{otherwise.} \end{cases}$$

There is no standard convention for self-loops. 

Facts: 

1. Strict-digraph terminology: In a context focusing primarily on strict digraphs, there 
is often a different terminological convention: 

• “digraph” refers to a strict digraph; 

• a directed graph with multi-arcs is called a multidigraph ; 

• a directed graph with self-loops is called a pseudodigraph ; 

• an arc with tail u and head v is designated uv. 

2. Alternative “path” terminology: There is an alternative convention in which a 
(directed) “path” may use vertices and arcs more than once, but an “elementary path” 
does not repeat arcs, and a “simple path” does not repeat vertices (and, hence, does 
not repeat arcs either). See §8.3.2. 

3. The incidence structure of a digraph is frequently represented by an arc list, in which 
each arc is represented by an ordered pair uv, where u is its tail and v is its head. For 
each arc with tail u and head v, there is a separate entry, so that uv occurs as often as 
the number of such arcs. A list of the isolated vertices plus such an arc list completely 
specifies a digraph. 

4. Another common specification of a digraph is the lists-of-neighbors representation. 
For each vertex u, there is a corresponding row, which has as an entry the head of each 
arc whose tail is u. Thus a vertex v occurs in that row as many times as there are arcs 
from u to v. 

5. The incidence matrix is another common way to represent a digraph. Since all but 
one or two of the entries in every column are zero, the incidence matrix is a highly 
inefficient form of representation. 

6. The adjacency matrix is also a common way to specify a digraph in some contexts 
when there is no reason to identify the arcs by name. 

7. A digraph can be represented by a 2 × |E| incidence table in which the tail and head 
of each arc e appear in column e. Direction on an arc can be indicated by a convention 
as to whether tail or head appears in the first row, which requires swapping the two 
column entries if the direction is changed. Alternatively, direction can be indicated by 
marking one of the two entries in each column as the head, and then moving the marker 
if the direction changes. 

8. A row-sum in a directed adjacency matrix equals the out-degree of the corresponding 
vertex. A column-sum equals the in-degree. 

9. In any digraph, the sum of the in-degrees, the sum of the out-degrees, and the 
number of edges are all equal to each other; i.e., $\sum_{v \in V} \delta^-(v) = \sum_{v \in V} \delta^+(v) = |E|$. 
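
Facts 4, 8, and 9 can be traced on a small digraph given by lists of neighbors. The sketch below uses the lists-of-neighbors data that appears in Example 1 below; the variable names are chosen here for illustration.

```python
# Lists-of-neighbors: each row lists the heads of the arcs leaving a vertex.
out_lists = {"u": ["v", "x"], "v": ["v", "w"], "w": [], "x": ["w", "w", "u"]}

order = list(out_lists)
index = {name: i for i, name in enumerate(order)}
A = [[0] * len(order) for _ in order]
for tail, heads in out_lists.items():
    for head in heads:
        A[index[tail]][index[head]] += 1      # multi-arcs accumulate

out_degree = {v: sum(A[index[v]]) for v in order}                 # row sums
in_degree = {v: sum(row[index[v]] for row in A) for v in order}   # column sums
arc_count = sum(len(heads) for heads in out_lists.values())
```

The row sums give out-degrees 2, 2, 0, 3 and the column sums give in-degrees 1, 2, 3, 1; both total 7, the number of arcs.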


Examples: 


1. The following arc list, incidence table, lists-of-neighbors, and adjacency matrix all 
represent the digraph G. 

lists-of-neighbors: 

u : v, x 
v : v, w 
w : ∅ 
x : w, w, u 

adjacency matrix: 

      u  v  w  x 
  u   0  1  0  1 
  v   0  1  1  0 
  w   0  0  0  0 
  x   1  0  2  0 

2. Civil Engineering: A road network in which at least some of the roads are one-way 
can be modeled by a digraph. The nodes are road junctures; each two-way road 
is represented by a pair of arcs, one in each direction. Loops are allowed, and they 
may represent “circles” that occur in housing developments and in industrial parks. 
Similarly, multiarcs may occur. 


3. Operations Research: A large project consists of many smaller tasks with a prece- 
dence relation — some tasks must be completed before certain others can begin. The 
vertices represent tasks, and there is an arc from u to v if task u must be completed 
before v can begin. For instance, in the following figure it is necessary both that food 
is loaded and the cabin is cleaned before passengers are loaded. 



[Figure: precedence digraph whose tasks include unload passengers, clean cabin, load passengers, unload luggage, and load luggage.] 
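
A precedence digraph of this kind is typically processed by computing an ordering of the tasks that respects every arc. The sketch below uses Kahn's algorithm, a standard technique that the text does not itself name; the arc list is hypothetical and only loosely based on the tasks mentioned above.

```python
from collections import deque

def topological_order(arcs):
    """Kahn's algorithm; each arc (u, v) means task u must finish before v starts."""
    vertices = {x for arc in arcs for x in arc}
    successors = {v: [] for v in vertices}
    in_degree = {v: 0 for v in vertices}
    for u, v in arcs:
        successors[u].append(v)
        in_degree[v] += 1
    ready = deque(sorted(v for v in vertices if in_degree[v] == 0))
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(v)
    if len(order) != len(vertices):
        raise ValueError("precedence digraph contains a directed cycle")
    return order

# Hypothetical precedence arcs for an aircraft turnaround.
arcs = [
    ("unload passengers", "clean cabin"),
    ("clean cabin", "load passengers"),
    ("load food", "load passengers"),
    ("unload luggage", "load luggage"),
]
order = topological_order(arcs)
```

Any ordering returned places the tail of each arc before its head; if the precedence relation contains a directed cycle, no such ordering exists and the function raises an error.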

4. Sociology and Sociobiology: A business (or army, or society, or ant colony) has a 
hierarchical dominance structure. The nodes are the employees (soldiers, citizens, ants) 
and there is an arc from u to v if u dominates v. If the chain of command is unique, 
with a single leader, and if only arcs representing immediate authority are included, 
then the result is a rooted tree. (See §9.1.2.) 

5. Computer Software Design: A large program consists of many subprograms, some 
of which can invoke others. Let the nodes of D be the subprograms, and let there be 
an arc from u to v if subprogram u can invoke subprogram v. Then the call graph D 
encapsulates all possible ways control can flow within the program. Directed cycles 
represent indirect recursion, and serve as a warning to the designer to ensure against 
infinite loops. See the following figure, where subprogram 2 can call itself indirectly. 
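
Detecting such directed cycles is a routine depth-first search. The following sketch assumes the call graph is stored as a dict that has every subprogram as a key; the particular call graph is hypothetical and merely mirrors the situation described, in which subprogram 2 can reach itself.

```python
def find_directed_cycle(calls):
    """Depth-first search; calls maps each subprogram to the subprograms it can invoke."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in calls}
    stack = []

    def visit(u):
        color[u] = GRAY
        stack.append(u)
        for v in calls[u]:
            if color[v] == GRAY:                 # back arc: v is on the current call chain
                return stack[stack.index(v):] + [v]
            if color[v] == WHITE:
                cycle = visit(v)
                if cycle:
                    return cycle
        stack.pop()
        color[u] = BLACK
        return None

    for start in calls:
        if color[start] == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None

# Hypothetical call graph: subprogram 2 can reach itself through 3 and 4.
calls = {1: [2], 2: [3], 3: [4, 5], 4: [2], 5: []}
cycle = find_directed_cycle(calls)
```

The search returns the closed walk 2, 3, 4, 2, and returns None for a call graph with no indirect recursion.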

6. Ecology: A food web is a strict digraph in which nodes represent species and in 
which there is an arc from u to v if species u eats species v. The following figure shows 
a small food web. 

[Figure: food web on the species frog, spider, grackle, beetle, and cherry tree.] 

7. Operations Research: A sequence of books must be printed and bound, using one 
press and one binding machine. Suppose that book i requires time $p_i$ for printing and 
time $b_i$ for binding. It is desired to print the books in such an order that the binding 
machine is never idle: when it finishes one book, the next book should already be 
printed. The vertices of a digraph D can represent the books. There is an arc from 
book i to book j if $p_j \le b_i$. Then any path through all the vertices corresponds to a 
permissible ordering. 
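
For a handful of books the model can be explored by brute force. The sketch below takes the arc condition to be $p_j \le b_i$, that is, book j can be printed within the time book i spends in binding (an assumption about the intended inequality), and tries every ordering. The times are hypothetical.

```python
from itertools import permutations

def permissible_orders(print_time, bind_time):
    """Yield orderings that follow an arc of the digraph at every consecutive pair."""
    books = list(print_time)
    # Arc i -> j when book j can be printed within the time book i spends in binding.
    arc = {(i, j) for i in books for j in books
           if i != j and print_time[j] <= bind_time[i]}
    for order in permutations(books):
        if all((a, b) in arc for a, b in zip(order, order[1:])):
            yield order

# Hypothetical printing and binding times for three books.
p = {"A": 2, "B": 4, "C": 3}
b = {"A": 3, "B": 2, "C": 5}
orders = list(permissible_orders(p, b))
```

Trying all orderings takes factorial time, so this is only practical for very small instances; it amounts to searching for a directed path through all the vertices.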

8. Genealogy: A “family tree” is a digraph where the orientation is traditionally given 
not by arrows but by the direction down for later generations. Despite the name, a family 
tree is usually not a tree, since people commonly marry distant cousins, knowingly or 
unknowingly. 

9. Binary relations: To any binary relation R on a set V (see §12.1) a digraph D(V, R) 
can be associated: the vertices are the elements of V, and there is an arc from u to v if 
$(u, v) \in R$. Conversely, every digraph without multiple arcs defines a binary relation on 
its vertices. The relation R is transitive (see §12.1.2) if and only if the digraph D(V, R) 
is transitive. 


8.3.2 PATHS, CYCLES, AND CONNECTEDNESS 
Definitions: 

A directed walk is a sequence of arcs such that the head of one arc is the tail of the 
next arc. 

The length of a directed walk is the number of arcs in the sequence. 

A closed directed walk is a directed walk that begins and ends at the same vertex. 
A directed trail is a directed walk in which no arc is repeated. 

A directed path is a directed trail in which no vertex is repeated. 

A directed cycle is a closed directed trail in which no vertices are repeated, except 
the starting and stopping vertex. 

Vertex v is reachable from vertex u if there is a directed path from u to v. 

A basis for a digraph is a set of vertices V' such that every vertex not in V' is 
reachable from V' and such that no proper subset of V' has this property. 

The distance from a vertex u to a vertex v in a digraph D is the length of the shortest 
directed path from u to v . 

A digraph is strongly connected (or diconnected, or strong) if every vertex is 
reachable from every other vertex. 

A digraph is unilaterally connected (or unilateral) if for every pair of vertices u 
and v, there is either a uv-path or a vu-path. 

A digraph D is weakly connected (or weak) if the underlying graph is connected. 
The digraph D is disconnected if the underlying graph is disconnected. 

A strong component of a digraph is a maximal subgraph that is strongly connected. 

A digraph D(V, E) is reducible if the vertex set V may be partitioned into a disjoint 
union $V_1 \cup V_2$ so that all arcs joining $V_1$ and $V_2$ go from $V_1$ to $V_2$. 

The condensation D* of a digraph D is the strict digraph whose nodes are the strong 
components $\{V_1, V_2, \ldots, V_k\}$ of D, with an arc $V_i V_j \in E_{D^*}$ if and only if there is an 
arc vv' in D such that $v \in V_i$ and $v' \in V_j$. 

The converse of a digraph D is obtained by reversing the directions of all the arcs 
of D. 

The directional dual of a theorem about digraphs is the statement obtained by re- 
placing each property in the theorem statement by its converse. 


Facts: 

1. Using a pencil on a drawing of a digraph, a directed walk can be traversed by 
following the arrows without lifting the pencil from the graph. 

2. Distance in digraphs need not be symmetric. That is, the distance from u to v might 
be different from the distance from v to u. 

3. If A is the adjacency matrix of D, then the ij entry of $A^n$ is the number of n-arc 
walks from $v_i$ to $v_j$. 

4. Let $\delta^+$ be the smallest out-degree of a strict digraph D. If $\delta^+ > 0$, then D has a 
cycle of length at least $\delta^+ + 1$. 

5. Let $\delta^-$ be the smallest in-degree of a strict digraph D. If $\delta^- > 0$, then D has a 
cycle of length at least $\delta^- + 1$. 

6. The directional dual of a theorem about digraphs is a theorem about digraphs. 

7. Fact 5 is the directional dual of Fact 4. 

8. A digraph D is Eulerian (§8.4.3) if and only if the underlying graph is connected 
and in-degree equals out-degree at every vertex. 

9. A digraph D has an Euler uv-trail (where u ≠ v) if the following conditions hold: 

• the out-degree of vertex u exceeds the in-degree by one; 

• the in-degree of v exceeds the out-degree by one; 

• at every other vertex, the in-degree equals the out-degree. 

That is, 

• $d^+(u) = d^-(u) + 1$; 

• $d^-(v) = d^+(v) + 1$; 

• $(\forall w \ne u, v)\ [d^-(w) = d^+(w)]$. 

Other Euler-type results for graphs generalize to digraphs as well. 

10. Let δ be the minimum of all in- and out-degrees of D. If D is strict with n vertices 
and $\delta \ge n/2 > 1$, then D contains a Hamilton cycle (§8.4.4). 

11. Hamilton theory is much harder and less complete than Euler theory, for digraphs 
as for graphs. 
12. The strong components of a digraph $D$ partition the vertices of $D$, but not the arcs,
since some arcs go from one component to another. However, the maximal unilateral
subgraphs do partition the arcs. If $V_1$, $V_2$ are the vertex sets of two strong components
of $D$, then all arcs between $V_1$ and $V_2$ face the same way: either all are from $V_1$ or all
are to $V_1$. See the following figure.



13. The condensation of any digraph is an acyclic digraph (§8.3.4). See the figure for
Fact 12.

14. A digraph is reducible if and only if its condensation has at least two vertices.

15. A digraph is unilateral if and only if its condensation is a path.

16. A set $V'$ is a basis of a digraph $D$ if and only if $V'$ consists of one vertex from each
strong component of $D$ that has in-degree 0 in $D^*$. Thus, every basis of a digraph has
the same number of vertices.

17. The eigenvalues of a digraph $D$ are the union (counting multiplicities) of the eigenvalues
of its strong components. (See §8.9.3.)
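Facts 12, 13, and 16 can be illustrated computationally. The following is a minimal sketch, not taken from the Handbook: it finds strong components with a standard two-pass depth-first search (often attributed to Kosaraju) and then builds the condensation. The function names and the small sample digraph are assumptions made for illustration.

```python
from collections import defaultdict

def strong_components(vertices, arcs):
    """Two-pass DFS; returns a dict mapping each vertex to a component index."""
    out, rev = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        out[u].append(v)
        rev[v].append(u)

    def dfs(start, adj, seen, on_finish):
        # Iterative DFS that calls on_finish(v) when v is completely explored.
        seen.add(start)
        stack = [(start, iter(adj[start]))]
        while stack:
            v, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(adj[w])))
                    break
            else:
                stack.pop()
                on_finish(v)

    order, seen = [], set()
    for v in vertices:                 # pass 1: record finishing order in D
        if v not in seen:
            dfs(v, out, seen, order.append)

    comp, seen, index = {}, set(), 0
    for v in reversed(order):          # pass 2: explore the converse of D
        if v not in seen:
            members = []
            dfs(v, rev, seen, members.append)
            for m in members:
                comp[m] = index
            index += 1
    return comp

def condensation(vertices, arcs):
    """Return (component map, number of components, arc set of D*)."""
    comp = strong_components(vertices, arcs)
    k = len(set(comp.values()))
    cond_arcs = {(comp[u], comp[v]) for u, v in arcs if comp[u] != comp[v]}
    return comp, k, cond_arcs

# Hypothetical sample digraph with strong components {1,2}, {3,4}, {5}.
sample_vertices = [1, 2, 3, 4, 5]
sample_arcs = [(1, 2), (2, 1), (2, 3), (3, 4), (4, 3), (5, 4)]
comp, k, cond_arcs = condensation(sample_vertices, sample_arcs)
# Components of in-degree 0 in D*; a basis takes one vertex from each (Fact 16).
source_components = set(range(k)) - {b for _, b in cond_arcs}
```

In this sample the condensation has three nodes and two arcs, and every basis has two vertices: vertex 5 together with one of vertices 1 or 2.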


Examples: 

1. Let $u, v$ be vertices of an $n$-vertex digraph $D$ with adjacency matrix $A$. If $v$ is
reachable from $u$, then some $uv$-path has length $\le n - 1$. Thus, $D$ is strong if and only
if every entry of $\sum_{k=0}^{n-1} A^k$ is positive. There are more computationally efficient tests for
diconnectivity: Warshall’s algorithm (§14.2) and directed depth first search (§13.3.2).

2. Let $M$ be an arbitrary square matrix. Computation of the eigenvalues of $M$ can
sometimes be sped up as follows. Create matrix $A$ by replacing each nonzero entry
of $M$ by a ‘1’, and then let $D$ be the digraph with adjacency matrix $A$. The eigenvalues
of $M$ are the union of the eigenvalues of the minors of $M$ indexed by the strong
components of $D$. (If one component has vertices $v_1, v_3, v_7$, then one minor has rows and
columns 1, 3, 7 of $M$.) If $M$ is sparse (few nonzeros), then digraph $D$ will usually have
many small components and this approach will be efficient.

3. Markov models: Let V represent a set of states and E the possible transitions of
a Markov process (§7.7). Then walks through D represent “histories” that the process 
can follow. 
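The walk-counting fact (Fact 3) and the matrix test for strongness in Example 1 can be sketched as follows. This is an illustrative sketch using plain nested lists; the helper names and the two sample matrices are assumptions, and, as Example 1 notes, faster tests exist.

```python
def mat_mul(X, Y):
    """Product of two square integer matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def walk_counts(A, length):
    """Return A**length; entry [i][j] counts walks with `length` arcs from i to j."""
    n = len(A)
    P = [[int(i == j) for j in range(n)] for i in range(n)]   # A**0 = identity
    for _ in range(length):
        P = mat_mul(P, A)
    return P

def is_strong(A):
    """Check that every entry of A**0 + A**1 + ... + A**(n-1) is positive."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    P = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(n):
        total = [[total[i][j] + P[i][j] for j in range(n)] for i in range(n)]
        P = mat_mul(P, A)
    return all(x > 0 for row in total for x in row)

cycle3 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # directed 3-cycle 0 -> 1 -> 2 -> 0
path3 = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]    # directed path 0 -> 1 -> 2
```

Here `is_strong(cycle3)` holds while `is_strong(path3)` does not, since no walk leads back from vertex 2 in the path.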
8.3.3 ORIENTATION


There are many natural questions concerning when the edges of an undirected graph 
could be assigned directions so as to obtain a certain sort of digraph. For instance, 
when can a graph be oriented to obtain a strong digraph? An application of this last 
question is to determine when a set of roads could all be made one-way, while keeping 
all points reachable from all others. 

Definitions: 

An orientation of a graph is an assignment of directions to its edges, thereby making 
it a digraph. 

An orientation of a graph is strong if, for each pair of vertices u, v, there is a directed 
path from u to v and a directed path from v to u. 

An orientation of a graph is transitive if, whenever there is an arc from u to v and an 
arc from v to w, there is also an arc from u to w. 

A graph that admits a transitive orientation is called a comparability graph. 

A cut-edge (or bridge) of a graph is an edge whose removal would increase the number 
of components (§8.4.1). 

A 2-edge-connected graph G is connected and has no cut-edge. 

A generalized circuit in a graph is a closed walk (§8.4.1) that uses each edge at most 
once in each direction. 

A triangular chord for a closed walk (§8.4.1) $u_1, u_2, \ldots, u_k, u_1$ is a proper edge that
joins two vertices exactly two apart on the walk.


Facts: 

1. Let $\chi(G)$ be the chromatic number (§8.6.1) of graph $G$. Then every orientation of
$G$ has a path of length at least $\chi(G) - 1$.

2. A graph G has a strong orientation if and only if G is 2-edge-connected. (H. Robbins, 
1939) 

3. A graph G is a comparability graph if and only if every generalized circuit of G of
odd length > 3 has a triangular chord.

4. Algorithms 1 and 2 give ways of creating a strong orientation in a 2-edge-connected
graph.

Examples: 

1. In the figure below the digraph D is a weak transitive orientation of the graph G 
and D' is a strong nontransitive orientation. 
Algorithm 1: Naive algorithm for creating a strong orientation.

{This algorithm is good to use by hand for small graphs.}

input: a 2-edge-connected graph G
output: a strong orientation of G

H := any cycle in G
direct H
while some vertex of G is not in directed subgraph H
    v := a vertex not in H
    find two edge-disjoint paths from v to H
    {Two such paths exist because G is 2-edge-connected}
    direct one path from v to H and the other from H to v
    H := H with these two paths added
orient any remaining edges arbitrarily


Algorithm 2: Better algorithm for creating a strong orientation. 

{A good algorithm for large graphs or for computer implementation.} 

input: a 2-edge-connected graph 
output: a strong orientation 

select an arbitrary vertex as root 

construct the Depth-First-Search spanning tree from that root {See §9.2.2.} 
orient the tree edges downward from the root 
orient all back edges upward toward the root 
orient all cross edges arbitrarily 
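Algorithm 2 can be sketched in Python as follows. This is an illustrative sketch, not the Handbook's code: it assumes a simple 2-edge-connected graph given as an adjacency dictionary, and the function name and sample graph are hypothetical. In a depth-first search of an undirected graph every non-tree edge joins a vertex to one of its ancestors, so the sketch only needs the tree-edge and back-edge cases.

```python
def strong_orientation(adj, root):
    """adj: dict vertex -> list of neighbours (simple, 2-edge-connected graph).
    Returns the oriented edges as a list of arcs (tail, head)."""
    depth = {root: 0}                    # DFS depth of each discovered vertex
    arcs = []
    stack = [(root, None, iter(adj[root]))]
    while stack:
        v, parent, it = stack[-1]
        for w in it:
            if w not in depth:
                depth[w] = depth[v] + 1
                arcs.append((v, w))      # tree edge: oriented away from the root
                stack.append((w, v, iter(adj[w])))
                break
            if w != parent and depth[w] < depth[v]:
                arcs.append((v, w))      # back edge: oriented toward the ancestor
        else:
            stack.pop()
    return arcs

# Hypothetical sample: the 4-cycle 0-1-2-3-0 with the chord 0-2.
sample_adj = {0: [1, 2, 3], 1: [0, 2], 2: [1, 0, 3], 3: [2, 0]}
sample_arcs = strong_orientation(sample_adj, 0)
```

Each edge is oriented exactly once: a tree edge when its lower endpoint is discovered, and a back edge when it is scanned from its deeper endpoint.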


2. The following graph is not transitively orientable, and $x, u, v, y, v, w, z, w, u, x$ are
the vertices of a generalized circuit without a triangular chord.



3. Traffic control: The flow of traffic on crowded city streets can sometimes be im- 
proved by making streets one-way. When this is done, it is necessary that a car can 
travel legally between any two locations. Assigning directions to the edges of the graph 
representing the street grid is an orientation of this graph, and cars can travel legally 
between any two points if and only if this orientation is strong. Consequently,
by Robbins’ theorem (Fact 2), to make all the streets one-way without losing mutual 
accessibility of locations, it is necessary and sufficient that the grid of streets be 2-edge- 
connected. 
Algorithm 3: Naive topological sort.

{Construct a linear extension ordering for a DAG.}
input: a DAG D
output: a numbering of the vertices in a topsort order

H := D; k := 1
while V_H ≠ ∅
    v_k := a vertex of H of in-degree 0 {This exists. See Fact 1.}
    H := H − v_k {Remaining graph is still a DAG.}
    k := k + 1


8.3.4 DIRECTED ACYCLIC GRAPHS 
Definitions: 

A digraph is acyclic if it has no directed cycles. A directed acyclic graph is sometimes 
called a DAG. 

A source of a digraph is a vertex of in-degree zero. 

A sink of a digraph is a vertex of out-degree zero. 

A linear extension ordering of the $n$ vertices of a digraph is a consecutive labeling
$v_1, v_2, \ldots, v_n$ so that, if there is an arc from $v_i$ to $v_j$, then $i < j$. (See also §11.2.5.)

A topological sort, or topsort, is an algorithm that assigns a linear extension ordering 
to a DAG. This traditional name belies the facts that it is not quite a sort, in the usual 
sense of sorting, and that its relation to topology (in the sense understood by topologists) 
is obscure. 

Facts: 

1. Every DAG has at least one source, and by duality, at least one sink. 

2. Every DAG has a unique basis (§8.3.2), namely, the set of all its sources. 

3. Topsort yields a linear ordering for the vertices that makes the adjacency matrix of 
a DAG upper-triangular. 

4. Doing a preliminary topsort permits many optimization problems about paths to 
be solved subsequently by a single algorithmic pass through the vertices in the topsort 
order; see §15.2.2 (dynamic programming) and §15.5 (critical paths). See Algorithm 3. 
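The source-removal idea of Algorithm 3 and the single-pass use described in Fact 4 can be sketched as follows. This is a minimal sketch under assumptions: rather than deleting vertices, it maintains in-degree counters (a common way to implement source removal), and the function names and the four-vertex sample DAG are hypothetical.

```python
from collections import deque

def topsort(vertices, arcs):
    """Return the vertices of a DAG in a linear extension ordering."""
    indeg = {v: 0 for v in vertices}
    out = {v: [] for v in vertices}
    for u, v in arcs:
        out[u].append(v)
        indeg[v] += 1
    sources = deque(v for v in vertices if indeg[v] == 0)
    order = []
    while sources:
        v = sources.popleft()
        order.append(v)
        for w in out[v]:          # "deleting" v lowers the in-degree of its successors
            indeg[w] -= 1
            if indeg[w] == 0:
                sources.append(w)
    if len(order) != len(indeg):
        raise ValueError("digraph has a directed cycle")
    return order

def longest_path_length(vertices, arcs):
    """Number of arcs on a longest directed path, found in one pass in topsort order."""
    order = topsort(vertices, arcs)
    out = {v: [] for v in order}
    for u, v in arcs:
        out[u].append(v)
    best = {v: 0 for v in order}  # best[v] = longest path ending at v
    for v in order:
        for w in out[v]:
            best[w] = max(best[w], best[v] + 1)
    return max(best.values())

# Hypothetical sample DAG: w -> x, w -> y, x -> z, y -> z.
sample_vertices = ["w", "x", "y", "z"]
sample_arcs = [("w", "x"), ("w", "y"), ("x", "z"), ("y", "z")]
sample_order = topsort(sample_vertices, sample_arcs)
```

With the order returned by `topsort`, every arc goes from an earlier vertex to a later one, which is what makes the single forward pass in `longest_path_length` valid.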

Examples: 

1. In the following digraph vertex w is a source and vertex z is a sink. It is a DAG,
even though the underlying graph has cycles. Labeling the vertices either in the order
w, x, y, z or w, y, x, z is a linear extension ordering.
2. Consider any digraph whose vertices represent discrete events, and whose arcs go
from earlier events to later events. Any such digraph is acyclic. Conversely, any digraph 
whose vertices represent procedural steps and whose arcs represent required precedence 
can be scheduled (using a topological sort) so that arcs do in fact go forward in time. 

3. The Hasse diagram of a poset (§12.3.5) is a DAG, as is the entire graph of a poset
(arc from u to v if and only if u > v). 


8.3.5 TOURNAMENTS 
Definitions: 

A tournament is a digraph with exactly one arc between each pair of distinct vertices. 
An n-tournament has n vertices. 

The score vector of a tournament is the sequence of out-degrees of the vertices (number 
of arcs leaving each vertex), usually in ascending order. 

A tournament T is regular if every vertex has the same outdegree. 

A tournament T is strong if there is a directed path between each pair of vertices in 
both directions. 

A tournament T is transitive if, whenever there is an arc from u to v and from v to w, 
there is also an arc from u to w. 

A tournament T is irreducible if there is no bipartition $V_1, V_2$ of the vertices such that
all arcs between $V_1$ and $V_2$ go from $V_1$ to $V_2$.

Vertex u of a tournament dominates vertex v if there is an arc from u to v. 

A transmitter in a digraph is a vertex that has an arc to every other vertex. 

A king in a digraph is a vertex from which there is a path of length 1 or 2 to all other 
vertices. 

A single-elimination competition is a contest from which a competitor is eliminated 
after the first loss. 

Facts: 

1. Every tournament has a Hamilton path (§8.4.4), in fact an odd number of them. 

2. The following statements are equivalent for any $n$-tournament $T$:

• T is strong;

• T is irreducible;

• T has a Hamilton cycle (§8.4.4);

• T has cycles of all lengths $3, 4, \ldots, n$;

• every vertex of T is on cycles of all lengths $3, 4, \ldots, n$.

3. Almost all tournaments are strong, in the sense that, as $n \to \infty$, the fraction of
labeled n-tournaments that are strong approaches 1.

4. The following are equivalent for a tournament:

• the tournament is transitive;

• the tournament contains no cycles;

• the tournament contains no 3-cycles;

• the tournament is a total (i.e. linear) order;

• the tournament has a unique Hamilton path.

5. Every tournament has a king.

6. The king of a tournament is unique if and only if it is a transmitter. Otherwise,
there are at least three kings.

7. In a large tournament, almost every vertex is a king, for as $n \to \infty$, the fraction of
n-tournaments in which every vertex is a king approaches 1.

8. Score vector characterizations: A nondecreasing sequence $S$ of nonnegative integers
$s_1, s_2, \ldots, s_n$ is the score vector of an $n$-tournament if and only if

$$\sum_{i=1}^{k} s_i \ge \binom{k}{2} \ \text{ for } k = 1, 2, \ldots, n-1, \quad \text{and} \quad \sum_{i=1}^{n} s_i = \binom{n}{2},$$

or equivalently, if and only if

the sequence $S'$ obtained by deleting any one $s_i$ and reducing the largest
remaining $n - s_i - 1$ terms by 1 is a score vector of an $(n-1)$-tournament.


9. The second characterization of Fact 8 leads to a recursive algorithm to construct a 
tournament having a specified score vector. See Example 4. 

10. A nonnegative integer sequence $s_1 \le s_2 \le \cdots \le s_n$ is the score vector of a strong
$n$-tournament if and only if

$$\sum_{i=1}^{k} s_i > \binom{k}{2} \ \text{ for } k = 1, 2, \ldots, n-1, \quad \text{and} \quad \sum_{i=1}^{n} s_i = \binom{n}{2}.$$
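The partial-sum conditions of Facts 8 and 10 translate directly into a short test. The following is a sketch, with a hypothetical function name; it sorts the input first so the scores need not be supplied in nondecreasing order.

```python
from math import comb

def is_score_vector(scores, strong=False):
    """Test the partial-sum conditions of Fact 8 (or Fact 10 when strong=True)."""
    s = sorted(scores)
    n = len(s)
    if sum(s) != comb(n, 2):
        return False
    partial = 0
    for k in range(1, n):
        partial += s[k - 1]
        if partial < comb(k, 2):
            return False                      # violates Fact 8
        if strong and partial == comb(k, 2):
            return False                      # Fact 10 needs strict inequality
    return True
```

For instance, the vector (1, 2, 2, 2, 3) of Example 4 passes both tests, while (0, 1, 2, 3), the score vector of the transitive 4-tournament, passes the first test but not the strong one.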


11. There are $2^{\binom{n}{2}}$ distinct labeled $n$-tournaments, because for each pair of vertices $\{u, v\}$,
there are two choices which way to direct the edge. If $c_n$ is the number of distinct
unlabeled $n$-tournaments, then

$$c_n \ge \frac{2^{\binom{n}{2}}}{n!} \quad \text{and} \quad \lim_{n \to \infty} \frac{c_n}{2^{\binom{n}{2}}/n!} = 1.$$

The distinction between labeled and unlabeled tournaments is the same as between 
labeled and unlabeled graphs; see §8.9.1. The two tournaments in the following figure 
are isomorphic as unlabeled tournaments, but distinct as labeled tournaments. 


[Figure: labeled tournaments on vertices x, y, z; panels (a), (b), (c).]

12. Ranking real tournaments: When a tournament models a competition, there is
an obvious desire to rank the teams, or at least to pick a clear winner. Many ranking 
methods have been proposed, and continue to be proposed. 

13. When a tournament is acyclic, it corresponds to a unique total ordering (Fact 4), 
so the ranking is unequivocal. However, almost all tournaments are strong (Fact 3). 
Moreover, in a large tournament, almost every vertex is a king (Fact 7). These are 
reasons why it is considered difficult to give a satisfactory general method to rank 
tournaments. 

14. Scheduling tournaments: To speed up the play of an $n$-tournament, games can be
scheduled in parallel. If $n$ is even, then at most $n/2$ of the $n(n-1)/2$ games may be played
at once, so at least $n - 1$ rounds are needed. However, if $n$ is odd, then only $(n-1)/2$ games
can be played at once, so at least $n$ rounds are needed. In fact, this minimum number
of rounds can be attained, and several methods of scheduling tournaments, subject to
various additional conditions, have been devised. See [Mo68].
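One way to attain the minimum number of rounds in Fact 14 is the classical "circle method", sketched below. This particular method is an assumption chosen for illustration; the Handbook does not name a specific scheduling method here. The function name and the use of `None` as a dummy opponent are likewise illustrative choices.

```python
def round_robin(n):
    """Schedule all n(n-1)/2 games; returns a list of rounds of (i, j) pairings."""
    players = list(range(n))
    if n % 2 == 1:
        players.append(None)          # dummy opponent: whoever draws it sits out
    m = len(players)
    rounds = []
    for _ in range(m - 1):
        pairs = [(players[i], players[m - 1 - i]) for i in range(m // 2)]
        rounds.append([p for p in pairs if None not in p])
        # Keep players[0] fixed and rotate the others by one position.
        players = [players[0], players[-1]] + players[1:-1]
    return rounds
```

For even $n$ this produces $n - 1$ rounds of $n/2$ games, and for odd $n$ it produces $n$ rounds of $(n-1)/2$ games, matching the bounds stated above. The schedule only fixes who meets whom; the direction of each arc is decided by the outcome of the game.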
Examples:

1. A round-robin sports tournament in which there are no ties is a tournament in 
the mathematical sense defined above. However, a single-elimination competition (e.g., 
most tennis tournaments) is not a tournament as defined above. 

2. It has been observed that in every small flock of hens, every pair of hens establishes
a dominance relation: the weaker of the two allows the stronger to peck her. Thus,
this pecking order is a tournament.

3. In a “paired comparison experiment”, a subject is asked to state a preference in
each pair chosen from n items. This amounts to a tournament, where there is an arc ij 
if item i is preferred to item j. 

4. Is there a tournament on vertices (a, b, c, d, e) with respective scores (1, 2, 2, 2, 3)?
Deleting e according to the second part of Fact 8 leaves vertices (a, b, c, d) with scores
(1, 2, 2, 1). Next deleting d leaves (a, b, c) with scores (1, 1, 1). The obvious tournament
with such a score vector is a 3-cycle. Next reinsert vertex d, making it dominate vertex a
only. Then reinsert vertex e, making it dominate a, b, c. This 5-tournament has the
specified score vector (1, 2, 2, 2, 3).

5. Ranking real tournaments: Ranking teams by their order along a Hamilton path (see
Fact 1) is rarely satisfactory, because that order is unique only for transitive tournaments 
(Fact 4); in most cases, there are a great many Hamilton paths. Ranking by score vector 
usually creates ties, and a team with few wins may deserve a better rank if those teams 
it beats have many wins. So one may consider the second-order score vector, where 
each team’s score is the sum of the out-degrees of the teams it beats. This can be 
continued to nth-order score vectors. There is a limit ranking obtained this way (often 
quite satisfactory), related to the eigenvalues of the digraph. See [Mo68] for more detail 
and references. 
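The second-order score vector described in Example 5 can be computed as follows. This sketch applies it to the 5-tournament constructed in Example 4; the orientation a → b → c → a of the 3-cycle is an assumption (the text only says "a 3-cycle"), and the function names are hypothetical.

```python
def score_vector(beats):
    """First-order scores: the out-degree of each vertex."""
    return {u: len(ws) for u, ws in beats.items()}

def next_order(beats, prev):
    """From the order-k scores `prev`, compute the order-(k+1) scores:
    each team's new score is the sum of the previous scores of the teams it beats."""
    return {u: sum(prev[w] for w in ws) for u, ws in beats.items()}

# Tournament of Example 4: d dominates only a among a, b, c; e dominates a, b, c;
# hence d dominates e. The 3-cycle is oriented a -> b -> c -> a (assumed).
beats = {
    "a": ["b"],
    "b": ["c", "d"],
    "c": ["a", "d"],
    "d": ["a", "e"],
    "e": ["a", "b", "c"],
}
first = score_vector(beats)
second = next_order(beats, first)
```

In this example b, c, and d are tied in the first-order scores, and the second-order scores separate c from b and d. Repeating `next_order` gives the higher-order score vectors mentioned in the text.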


8.4 DISTANCE, CONNECTIVITY, TRAVERSABILITY 

Movement from one node to another in the network corresponds to the graph-theoretic 
notion of a walk. Graphs often serve as models for transportation and communication 
network problems. The capability for any two nodes in a network to communicate 
corresponds to connectedness. The connectivity of a graph is a measure of resistance to 
a communications cutoff. 


8.4.1 WALKS, DISTANCE, AND CYCLE RANK 
Definitions: 

A walk in a graph is an alternating sequence $v_0, e_1, v_1, \ldots, e_r, v_r$ of vertices and edges
in which each edge $e_i$ joins vertices $v_{i-1}$ and $v_i$. Such a walk is also called a $v_0,v_r$-walk.

The length of a walk is the number of occurrences of edges in it. An edge that occurs 
more than once is counted each time it occurs. 

A trail is a walk in which all of the edges are different. 

A path is a trail in which all the vertices are different, except that the initial and final
vertices may be the same. A path from $v_0$ to $v_r$ is called a $v_0,v_r$-path.

A walk, trail, or path is open if its final vertex is different from its initial vertex. 
A walk, trail, or path is closed if its final vertex is the same as its initial vertex.

A graph is connected if each pair of vertices are joined by a path. 

A component of a graph is a maximal connected subgraph of the graph. 

The vertex v is reachable from vertex u in a graph if there is a u,v-path in the graph.

An isolated vertex of a graph is a vertex with no incident edges.

The distance $d(v, w)$ between two vertices $v$ and $w$ of a graph is the length of a shortest
path between them, with $d(v, v) = 0$ and $d(v, w) = \infty$ if there is no path between $v$
and $w$.

The diameter of a connected graph is the maximum distance between two of its ver- 
tices. 

The eccentricity of a vertex v of a connected graph is the greatest distance from v to 
another vertex. 

The radius of a connected graph is the minimum eccentricity among all the vertices of 
the graph. 

The center of a connected graph is the set of vertices of minimum eccentricity. 

A cycle is a closed path of positive length. (The word “cycle” also refers to a type of 
graph; see §8.1.3.) 

The cycle rank (or first Betti number), denoted by $\beta_1(G)$, of a connected graph
$G = (V, E)$ is $|E| - |V| + 1$.
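The distance-based definitions above can be computed with breadth-first search. The following is a minimal sketch for a connected simple graph stored as an adjacency dictionary; the function names and the sample graph (a 4-cycle with one pendant vertex) are assumptions for illustration.

```python
from collections import deque

def distances_from(adj, source):
    """Breadth-first search: distance from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def eccentricities(adj):
    """Eccentricity of every vertex; assumes the graph is connected."""
    return {v: max(distances_from(adj, v).values()) for v in adj}

def cycle_rank(adj):
    """|E| - |V| + 1 for a connected simple graph."""
    edges = sum(len(ns) for ns in adj.values()) // 2
    return edges - len(adj) + 1

# Hypothetical sample: the cycle 0-1-2-3-0 with a pendant vertex 4 attached to 3.
sample = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2, 4], 4: [3]}
ecc = eccentricities(sample)
diameter = max(ecc.values())
radius = min(ecc.values())
center = sorted(v for v in sample if ecc[v] == radius)
```

For this sample the diameter is 3, the radius is 2, the center is {0, 2, 3}, and the cycle rank is 1.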

Facts: 

1. Alternative terminology: Sometimes “path” is used to mean what is here called a 
trail, in which case “simple path” is used to mean a path. 

2. In a simple graph, a walk may be represented as a string of vertices $v_0 v_1 \ldots v_r$,
without mentioning the edges.

3. The distance function on the vertex set of any connected graph G is a metric; i.e., 
the following rules hold for all vertices u, v, and w in G: 

• $d(v, w) \ge 0$, with equality if and only if $v = w$;

• $d(w, v) = d(v, w)$;

• $d(u, w) \le d(u, v) + d(v, w)$, with equality if and only if $v$ is on a shortest path
from $u$ to $w$.

4. There are polynomial-time algorithms for finding a shortest path between vertices. 
(See §10.2.) 

5. A graph is connected if and only if it has a spanning tree. 

6. The graph G is nonconnected if and only if there is a partition of its vertex set into 
nonempty sets A and B so that no edge has one end in A and the other in B. 

7. The relation “is reachable from” is an equivalence relation on the vertex set. The 
equivalence classes of this relation induce the components. 

8. The graph G is connected if every vertex is reachable from every other vertex. 

9. In a simple graph, the minimum length of a cycle is at least 3. In a general graph, 
a self-loop is a 1-cycle, and a 2-cycle is formed by a pair of vertices joined by a pair of 
parallel edges (§8.1.1). 

10. The cycle rank $\beta_1(G)$ of a connected graph $G$ is best conceptualized as the number
of edges remaining in the complement of a spanning tree for $G$, and not as an abstract
formula.
11. The cycle rank $\beta_1(G)$ of a connected graph $G$ is equal to the rank of a vector space
over $\mathbb{Z}_2$ whose domain is the set of cycles of $G$.

12. The cycle rank of a connected planar graph G equals the number of regions in a 
plane drawing of G , minus the exterior region. 

13. The following table gives the cycle rank of some infinite families of graphs. 


graph                               cycle rank
bouquet $B_n$                       $n$
dipole $D_n$                        $n - 1$
complete graph $K_n$                $(n-2)(n-1)/2$
complete bipartite graph $K_{m,n}$  $(m-1)(n-1)$
cycle graph $C_n$                   $1$
wheel $W_n$                         $n$
hypercube $Q_n$                     $(n-2)2^{n-1} + 1$
any tree                            $0$


Examples:

1. The following connected graph has diameter 3 and radius 2. The vertices in its 
center are indicated by solid dots. 



2. The cycle rank of the following connected graph is three. Observe that there are 
three edges in the complement of the indicated spanning tree. 



3. The following nonconnected graph has three components, one of which is an isolated 
vertex. 






8.4.2 CONNECTIVITY 
Definitions: 

A cut-vertex of a graph G (or cut-point or articulation point) is a vertex v such
that G − v has more components than G. (In topological analysis of nonsimple graphs,
sometimes a vertex attached to a self-loop is also considered to be a cut-vertex.)

A nonseparable graph is a connected graph with no cut-vertices. 

A block of a graph is a maximal nonseparable subgraph. 

A cut-edge of a graph G is an edge e such that G − e has more components than G
(in which case there is just one more).

A disconnecting set of vertices in a connected graph is a set of vertices whose 
removal yields a nonconnected graph. 

A disconnecting set of edges in a connected graph is a set of edges whose removal 
yields a nonconnected graph. 

The zeroth Betti number $\beta_0(G)$ of a graph G is the number of components in G.
Elsewhere this is sometimes denoted by $c(G)$ or $\omega(G)$.

The (vertex) connectivity $\kappa(G)$ is the number of vertices in the smallest disconnecting
set of vertices. By convention, $\kappa(K_n) = n - 1$.

The edge connectivity $\kappa'(G)$ is the number of edges in the smallest disconnecting set
of edges.

A graph is k-connected if $\kappa(G) \ge k$.

A graph is k-edge-connected if $\kappa'(G) \ge k$.
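The definitions of cut-vertex and cut-edge can be checked directly by deleting each vertex or edge and recounting components. The sketch below does exactly that; it is a brute-force illustration of the definitions, not an efficient algorithm, and the function names and sample graph are assumptions.

```python
def count_components(vertices, edges):
    """Number of components (beta_0), computed with a simple union-find."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, w in edges:
        parent[find(u)] = find(w)
    return len({find(v) for v in vertices})

def cut_vertices(vertices, edges):
    """Vertices v for which G - v has more components than G."""
    base = count_components(vertices, edges)
    result = []
    for v in vertices:
        rest = [x for x in vertices if x != v]
        kept = [(a, b) for a, b in edges if v not in (a, b)]
        if count_components(rest, kept) > base:
            result.append(v)
    return result

def cut_edges(vertices, edges):
    """Edges e for which G - e has more components than G (edges given as a list)."""
    base = count_components(vertices, edges)
    return [e for i, e in enumerate(edges)
            if count_components(vertices, edges[:i] + edges[i + 1:]) > base]

# Hypothetical sample: triangles 0-1-2 and 2-3-4 sharing vertex 2, plus pendant edge 4-5.
sample_vertices = [0, 1, 2, 3, 4, 5]
sample_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4), (4, 5)]
```

In the sample, vertices 2 and 4 are the cut-vertices and the pendant edge (4, 5) is the only cut-edge, so this graph has vertex connectivity 1 and edge connectivity 1.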

Facts: 

1. A vertex is a cut-vertex if and only if it lies on all paths between two other vertices.

2. Every nontrivial graph has at least two vertices that are not cut-vertices.

3. An edge is a cut-edge if and only if it is not contained in any cycle.

4. For any edge $e$ of a graph $G$, $\beta_0(G) + 1 \ge \beta_0(G - e) \ge \beta_0(G)$.

5. For any non-isolated vertex $v$ of a graph $G$, $\beta_0(G - v) \ge \beta_0(G)$; however, $\beta_0(G - v)$ may be
arbitrarily greater than $\beta_0(G)$.

6. Let G be a 2-connected graph. Then for any two vertices, there is a cycle containing
those vertices.

7. Let G be a 2-connected graph. Then for any two edges, there is a cycle containing
those edges.

8. The following statements are equivalent for a connected graph G with at least three
vertices: 

• G is nonseparable; 

• every pair of vertices lie on a cycle; 

• every pair of edges lie on a cycle; 

• given any three vertices u , v, and w, there is a path from u to w containing v; 

• given any three vertices u, v, and w, there is a path from u to w not containing v. 

9. Menger’s theorem (for vertex connectivity): A graph with at least $k + 1$ vertices is
$k$-connected if and only if every pair of vertices is joined by $k$ paths which are internally
disjoint (i.e., disjoint except for their origin and terminus). (Menger, 1927)

10. Menger’s theorem (for edge connectivity): A graph is $k$-edge-connected if and
only if every pair of vertices is joined by $k$ edge-disjoint paths. (Ford and Fulkerson,
1956; also Elias, Feinstein, and Shannon, 1956)

11. For any graph $G$, the vertex connectivity is no more than the edge connectivity, and
the edge connectivity is no more than the minimum degree. That is, $\kappa(G) \le \kappa'(G) \le
\delta_{\min}(G)$, where $\delta_{\min}(G)$ denotes the minimum degree.

12. Furthermore, for any positive integers $a \ge b \ge c$, there exists a simple graph $G$ for
which $\delta_{\min}(G) = a$, $\kappa'(G) = b$, and $\kappa(G) = c$. (Chartrand and Harary, 1968)
13. The following table gives the vertex connectivity and edge connectivity of some
infinite families of graphs.


graph                               $\kappa$        $\kappa'$
complete graph $K_n$                $n - 1$         $n - 1$
complete bipartite graph $K_{m,n}$  $\min(m, n)$    $\min(m, n)$
cycle graph $C_n$                   $2$             $2$
wheel $W_n$                         $3$             $3$
hypercube $Q_n$                     $n$             $n$
any nontrivial tree                 $1$             $1$


Examples: 

1. The following graph G has cut-vertices u and v. The blocks are illustrated at the
right.



2. In the following graph, vertices u and v form a disconnecting set.



3. Communication networks: A communication network can be modeled as a graph
with vertices representing the nodes and with undirected edges representing direct two-way
communications links between nodes. In order that all pairs of nodes be in communication,
the graph must be connected. Vertex connectivity and edge connectivity
are measures of network reliability.

4. Transportation networks: Low connectivity in transportation networks results in
“bottlenecks”, in which many different shipments must all pass through a small number
of vertices. High connectivity implies (by Menger’s theorem) several alternative routes
between nodes.

5. Menger’s theorem implies that a 2-connected graph has two disjoint paths between
each pair of vertices. It does not imply that for any path between two vertices, there 
must be a second such path disjoint from the first, as indicated in the following graph. 
There are two disjoint paths from the leftmost vertex to the rightmost, but there is no 
such path disjoint from the one indicated by thick edges. 
6. The following shows a graph $G$ with $\kappa(G) = 2$ and $\kappa'(G) = 3$. On the left there
are the two internally-disjoint paths between the upper-left vertex and the lower-right
vertex, and on the right there are three edge-disjoint paths.



7. The following graph illustrates Fact 12, with $\kappa = 2$, $\kappa' = 3$, $\delta_{\min} = 4$.



8.4.3 EULER TRAILS AND TOURS 
Definitions: 

An Euler trail in a graph [digraph] is a trail that contains all the edges [arcs] of the 
graph. 

An Euler tour or Euler circuit in a graph or digraph is a closed Euler trail. 

A graph or digraph is Eulerian if it has an Euler tour. 

Facts: 

1. Seven bridges of Königsberg problem: In Kaliningrad, Russia, two branches of the
River Pregel meet and flow past an island into the Baltic Sea. In 1736, when this was
the town of Königsberg in East Prussia, there were seven bridges joining the banks of
the river, the headland, and the island, as illustrated below at the left. The celebrated
Swiss mathematician Leonhard Euler (1707–1783) was invited by Emperor Frederick
the Great to decide whether it was possible to cross all seven bridges without recrossing
any of them. In the earliest known paper on graph theory, Euler proved it is impossible,
because the graph at the right has no Euler trail.



2. Euler’s work on the seven bridges of Königsberg problem is commonly described
as the founding of graph theory and also as the founding of topology.
3. A connected graph is Eulerian if and only if every vertex has even degree. The tour
may begin/end at any vertex.

4. A connected digraph has a directed Euler tour if and only if the in-degree of every
vertex v equals its out-degree.

5. A connected graph has an Euler trail between distinct vertices u and v if and only
if u and v are the only vertices of odd degree.

6. A connected graph (digraph) is Eulerian if and only if there exists a collection of
cycles (directed cycles) whose edges partition the edge set of the graph.

7. A connected planar (§8.7.1) graph is Eulerian if and only if its dual (§8.8.2) is
bipartite.

8. A graph G can be oriented to have a directed Euler tour if and only if it is an
Eulerian graph. (Traversing an Euler tour provides an orientation.)

9. The following table tells which members of several infinite families of graphs are
Eulerian.


graph                               Eulerian?
bouquet $B_n$                       for all n
dipole $D_n$                        for even n
complete graph $K_n$                for odd n
complete bipartite graph $K_{m,n}$  for m and n both even
cycle graph $C_n$                   for all n
wheel $W_n$                         never
hypercube $Q_n$                     for even n
tree                                only if trivial


10. Algorithm 1 gives a recursive method for finding an Eulerian tour on an Eulerian
graph.

11. Fleury’s algorithm for finding an Euler tour or trail is given in Algorithm 2.

Examples: 

1. The following is an Eulerian graph and one of its Euler tours. 



2. Chinese postman problem (due to Guan Meigu, 1962): A letter carrier begins at 
the post office, traverses every street in his territory at least once, and then returns 
to the post office. His objective is to walk as little as possible. Each edge of a graph 
representing the street configuration is labeled with the length of the corresponding 
block. If the graph is Eulerian, then an Euler tour gives an optimal solution. Otherwise, 
some edges must be retraced. Polynomial-time algorithms to solve this problem are 
known. See §10.2.3. 
Algorithm 1: Recursive algorithm for finding an Eulerian tour.

input: a connected graph G, all of whose vertices have even degree
output: an Euler tour of G

C := a cycle in the graph G; place C on the cycle-queue Q
partition the edge-complement G − E(C) into components H_1, H_2, ..., H_k
recursively run this algorithm on each component H_i
{So far, E_G has been completely partitioned into the cycles on Q}
merge the elements of Q into an Euler tour for G, by traversing the cycle C
and splicing in the tours found for the components H_i whenever possible


Algorithm 2: Fleury’s algorithm for finding an Euler tour/trail.

input: a connected graph G, an initial vertex v, and a final vertex w; if v ≠ w,
then every vertex except v and w must have even degree (if v = w, then all
degrees must be even)
output: an Euler trail whose origin is v and whose terminus is w

{ find trail edge with origin v }
if deg(v) > 1 then e := any edge incident at v which is not a cut-edge
else { deg(v) = 1 }
    e := the unique edge incident at v
u := the other endpoint of e
recursively find an Euler trail from u to w in G − e
prepend the edge e to the trail found in the recursive step
{ This yields the required Euler trail of G. }
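The cycle-splicing idea behind Algorithm 1 is often implemented iteratively with a stack (this variant is usually called Hierholzer's algorithm). The following sketch is that iterative variant, not a transcription of either algorithm above; the function name and the sample graph are assumptions, and the input is assumed to be connected with all degrees even.

```python
def euler_tour(edges, start):
    """edges: list of (u, v) pairs of a connected multigraph with all degrees even.
    Returns a closed Euler tour as a list of vertices beginning and ending at start."""
    adj = {}
    for i, (u, v) in enumerate(edges):
        adj.setdefault(u, []).append((v, i))
        adj.setdefault(v, []).append((u, i))
    used = [False] * len(edges)
    stack, tour = [start], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()                 # discard edges already traversed
        if adj[v]:
            w, i = adj[v].pop()
            used[i] = True
            stack.append(w)              # extend the current trail along edge i
        else:
            tour.append(stack.pop())     # no unused edge at v: splice v into the tour
    return tour[::-1]

# Hypothetical sample: two triangles sharing vertex 2.
sample_edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
sample_tour = euler_tour(sample_edges, 0)
```

Each edge is pushed and popped once, so the running time is linear in the number of edges, whereas Fleury's algorithm must test for cut-edges at every step.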


3. For every letter in an arbitrary $n$-letter alphabet $A$, there is a string starting and
ending with that letter, in which every possible substring of two letters appears
consecutively exactly once. To see this, consider the digraph $D$ with the letters of $A$ as
vertices and one arc for each ordered pair. The digraph $D$ is connected and, at each
of the $n$ vertices, in-degree = out-degree = $n$, which implies that it is Eulerian. Thus,
the sequence of vertices encountered on a closed Euler tour from any vertex to itself
yields the specified string. See the following figure, where $e_1, e_2, \ldots, e_9$ are the arcs of
an Euler tour and the associated string is the sequence of vertices, aabbccacba. This
result generalizes to substrings of any fixed length, also using Euler tours.
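The construction in this example can be sketched in a few lines. The code below builds the digraph with one arc for each ordered pair of letters (loops included) and extracts a directed Euler tour with the same stack-based technique commonly used for undirected graphs; the function name is hypothetical, and the particular string produced depends on the order in which arcs are followed, so it need not equal the string shown in the figure.

```python
def two_letter_string(alphabet, start):
    """Return a string over `alphabet`, starting and ending with `start`, in which
    every ordered pair of letters occurs exactly once as a consecutive substring."""
    # out[x] lists the heads of the unused arcs leaving x: one arc per ordered pair.
    out = {x: list(alphabet) for x in alphabet}
    stack, tour = [start], []
    while stack:
        v = stack[-1]
        if out[v]:
            stack.append(out[v].pop())   # follow an unused arc leaving v
        else:
            tour.append(stack.pop())     # all arcs at v used: record v
    return "".join(reversed(tour))

sample_string = two_letter_string("abc", "a")
```

For a three-letter alphabet the result has 10 letters and 9 consecutive pairs, one for each arc of the digraph.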
8.4.4 HAMILTON CYCLES AND PATHS


Definitions: 

A Hamilton cycle in a graph [digraph] is a cycle [directed cycle] that includes all 
vertices of the graph. 

A graph or digraph is Hamiltonian if it contains a Hamilton cycle. 

A Hamilton path in a graph [digraph] is a path [directed path] that includes all vertices 
of the graph. 

A theta graph is a subdivision of the complete bipartite graph $K_{2,3}$. Thus, the graph
comprises three internally disjoint paths joining the two 3-valent vertices.

A Gray code is a cyclic ordering of all $2^k$ length-$k$ bitstrings, such that each bitstring
differs from the next in exactly one bit entry.

A tough graph is a connected graph $G$ such that no matter what nonempty, proper
vertex subset $S$ is removed, the resulting number of components of $G - S$ is no more
than $|S|$.
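One standard Gray code is the reflected binary code, in which the $i$-th bitstring is the binary form of $i \oplus \lfloor i/2 \rfloor$. The sketch below generates it; the choice of this particular code and the function name are assumptions for illustration, and $k \ge 1$ is assumed.

```python
def gray_code(k):
    """Reflected binary Gray code on k >= 1 bits, as a list of bitstrings."""
    return [format(i ^ (i >> 1), "0{}b".format(k)) for i in range(2 ** k)]

def differ_in_one_bit(a, b):
    """True if equal-length bitstrings a and b differ in exactly one position."""
    return sum(x != y for x, y in zip(a, b)) == 1
```

Reading the list cyclically, consecutive bitstrings differ in exactly one bit, so for $k \ge 2$ the ordering traces a Hamilton cycle in the hypercube $Q_k$ whose vertices are the length-$k$ bitstrings.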

Facts: 

1. The concept of a Hamilton cycle first arose in a puzzle within the Icosian Game, 
invented by Sir William Rowan Hamilton (1805-1865), an Irish mathematician. This 
puzzle involved a dodecahedron whose 20 vertices were labeled with world capitals. It 
required finding a complete tour of these 20 capitals. 

2. The recognition problem for Hamiltonian graphs is NP-complete. Thus, unlike the
case with Eulerian graphs, there is no easy test to decide whether a graph is Hamiltonian 
(unless P = NP). However, many of the following facts provide criteria that are often 
helpful in deciding. 

3 . A Hamiltonian graph has no cutpoint. (Thus, it is 2-connected.) 

4 . The previous fact has this generalization: Let G be a Hamiltonian graph and let 
S C Vg- Then the graph G — S has at most [S'] components. 

5. Bipartite Hamiltonian graphs have an equal number of vertices in the two parts of
the bipartition.

6. If a simple graph has n ≥ 3 vertices and minimum degree at least n/2, then it is
Hamiltonian. (Dirac, 1952)

7. If a simple graph has n ≥ 3 vertices, and if every pair of nonadjacent vertices u
and v satisfies the inequality deg(u) + deg(v) ≥ n, then it is Hamiltonian. (Ore, 1960)

8. Suppose that a simple graph with n ≥ 3 vertices has degree sequence d_1 ≤ d_2 ≤ ... ≤ d_n,
and that for every i with 1 ≤ i < n/2 either d_i > i or d_{n-i} ≥ n - i. Then the
graph is Hamiltonian.

9. Every simple graph with n ≥ 3 vertices and at least (n^2 - 3n + 6)/2 edges is
Hamiltonian.

10. Every graph with at least three vertices whose connectivity (κ) is at least as large
as its independence number (α) is a Hamiltonian graph.

11. Every 4-connected planar graph is Hamiltonian.

12. If the edges of the complete graph K_n are assigned directions, then the resulting
digraph always has a Hamilton directed path.

13. The edges of the complete graph K_{2n+1} can be partitioned into n Hamilton cycles.

14. A theta graph looks like a subdivided copy of the Greek letter theta (θ).

15. Theta graphs are non-Hamiltonian.

16. Every non-Hamiltonian 2-connected graph contains a theta subgraph.

17. A graph G is non-Hamiltonian if there is a subset of mutually nonadjacent vertices
containing more than half the vertices of G.

18. Hamiltonian graphs are tough.

19. “Almost all” graphs are Hamiltonian. That is, of the exactly 2^{n(n-1)/2} simple
graphs on n (labeled) vertices, the proportion that are Hamiltonian tends to 1 as n → ∞.

20. Suppose that a simple graph is constructed by the following process: start with n
vertices and no edges; until the minimum degree is 2, a possible edge is chosen uniformly
at random from among the edges not already in the graph, and added to the graph.
With probability tending to 1 as n → ∞, the resulting graph is Hamiltonian.

21. The following table tells which members of several infinite families of graphs are
Hamiltonian.

graph                               Hamiltonian?
bouquet B_n                         for all n ≥ 1
dipole D_n                          for all n ≥ 2
complete graph K_n                  for all n ≥ 3
complete bipartite graph K_{m,n}    when m = n
cycle graph C_n                     for all n ≥ 1
wheel W_n                           for all n ≥ 2
hypercube Q_n                       for all n ≥ 2
any tree                            if |V| = 1
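
The degree conditions of Dirac and Ore (Facts 6 and 7) are easy to test mechanically, whereas deciding Hamiltonicity itself may require exhaustive search. The following is a minimal Python sketch, not an efficient method. It assumes a simple graph with at least three vertices, stored as a dictionary mapping each vertex to the set of its neighbors; the function names are illustrative only.

```python
from itertools import combinations, permutations

def satisfies_dirac(adj):
    """Fact 6: n >= 3 and every vertex has degree at least n/2."""
    n = len(adj)
    return n >= 3 and all(2 * len(nbrs) >= n for nbrs in adj.values())

def satisfies_ore(adj):
    """Fact 7: n >= 3 and deg(u) + deg(v) >= n for all nonadjacent u, v."""
    n = len(adj)
    return n >= 3 and all(
        len(adj[u]) + len(adj[v]) >= n
        for u, v in combinations(adj, 2)
        if v not in adj[u]
    )

def hamilton_cycle(adj):
    """Exhaustive search over vertex orderings; exponential time."""
    vertices = list(adj)
    if len(vertices) < 3:
        return None
    first, rest = vertices[0], vertices[1:]
    for order in permutations(rest):
        cycle = [first, *order, first]
        # Every consecutive pair, including last-to-first, must be an edge.
        if all(b in adj[a] for a, b in zip(cycle, cycle[1:])):
            return cycle
    return None

def cycle_graph(n):
    return {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

def complete_graph(n):
    return {i: set(range(n)) - {i} for i in range(n)}
```

For instance, cycle_graph(10) fails both degree tests yet hamilton_cycle still finds a cycle, which shows that the conditions are sufficient but not necessary.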


Examples: 

1. Finding a Hamilton cycle in the dodecahedral graph (see §8.2.3), as illustrated below, 
is equivalent to solving Hamilton’s Icosian Game puzzle. An example of a Hamilton cycle 
in this graph is: RSTVWXHJKLMNPCDFGBZQR. 

[Figure: the dodecahedral graph, with its 20 vertices labeled by letters.]


2. The following graph has the Hamilton cycle acefdba.

[Figure: a graph on the vertices a, b, c, d, e, f.]

3. The following graph is non-Hamiltonian, by Fact 17, since the vertices u, v, w, and x
are mutually nonadjacent.

[Figure: a graph whose vertices include r, s, t, u, v, w, and x.]



4. The 10-cycle C_10 (§8.1.3 Figure 1) is an example of a graph that satisfies none of the
sufficient conditions in Facts 7-9 above for Hamiltonicity, but is nonetheless Hamiltonian. 

5. The traveling salesman problem (§10.7.1) is to find a minimum-cost Hamilton cycle 
in a complete graph whose edges are labeled with costs. 

6. Information theory (Gray codes): In information theory, a cyclic ordering of the 2^n
length-n bitstrings such that each bitstring differs from its predecessor in exactly one
bit is called a Gray code. This corresponds to a Hamilton cycle in the n-dimensional
hypercube.

The following figure shows a Hamilton cycle in the 3-cube giving the Gray code
000 → 001 → 011 → 111 → 101 → 100 → 110 → 010 → 000.
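
One standard way to produce such an ordering for any n is the reflected binary code, in which the i-th string is i XOR (i shifted right by one bit). The sketch below generates that code and checks the Hamilton-cycle property directly. It is an illustration only: the function names are hypothetical, and the generated code is a different Gray code from the one listed above.

```python
def gray_code(n):
    """Reflected binary Gray code on n >= 1 bits, as a list of bitstrings."""
    return [format(i ^ (i >> 1), f"0{n}b") for i in range(2 ** n)]

def is_hamilton_cycle_in_hypercube(code, n):
    """True if `code` lists every n-bit string exactly once and cyclically
    consecutive strings differ in exactly one position."""
    if len(code) != 2 ** n or len(set(code)) != 2 ** n:
        return False
    if any(len(word) != n or set(word) - {"0", "1"} for word in code):
        return False
    return all(
        sum(a != b for a, b in zip(code[i], code[(i + 1) % len(code)])) == 1
        for i in range(len(code))
    )

# The cyclic ordering given in the text for the 3-cube.
book_code = ["000", "001", "011", "111", "101", "100", "110", "010"]
```

Both gray_code(3) and book_code pass the check, while the ordinary counting order 000, 001, 010, ... does not, since 001 and 010 differ in two bits.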



7 . The Petersen graph (§8.1.3 Figure 2) is tough but not Hamiltonian. 

8. The following graph is tough but not Hamiltonian. 



8.5 GRAPH INVARIANTS AND ISOMORPHISM


Deciding whether two graph descriptions actually specify structurally identical graphs 
is called isomorphism testing. Polynomial-time algorithms for isomorphism testing are 
known only for certain special classes of graphs. However, there are heuristic algorithms 
to test isomorphism of reasonable-sized graphs. The related problem of reconstructing 
a graph from its vertex-deleted subgraphs is also still unsettled. 

8.5.1 ISOMORPHISM INVARIANTS


Definitions: 

For simple graphs only, a graph isomorphism between two graphs G and H can be
defined as a bijection f: V_G → V_H such that a pair of vertices u, v is adjacent in G
if and only if the image pair f(u), f(v) is adjacent in H.

In full generality, a graph isomorphism f: G → H is a pair of bijections f_V: V_G → V_H
and f_E: E_G → E_H such that for every edge e ∈ E_G, the endpoints of e are mapped onto
the endpoints of f_E(e).

Note: Except when confusion will result, the same notation f can be used for both the
vertex function f_V and the edge function f_E.

A digraph isomorphism is an isomorphism of the underlying graphs such that the 
edge correspondence preserves all edge directions. 

Two graphs are isomorphic if there is an isomorphism from one to the other, or 
informally, if their mathematical structures are identical. 

The isomorphism type of a graph [digraph] G is the class of all graphs [digraphs] 
isomorphic to G. 

A graph invariant is a property of graphs such that every two isomorphic graphs have 
the same value with regard to this property. 

A complete set of invariants is a set of graph invariants that distinguishes any graph 
from any different graph, in the sense that no two nonisomorphic graphs have the same 
set of invariant values. 

A vertex invariant is a property of a vertex which is preserved by isomorphism, in 
the following sense: if v is any vertex and f is any isomorphism, then the vertex f(v)
has the same value as v with regard to the property. 

An automorphism is an isomorphism from a graph to itself. 

The automorphism group Aut(G) of a graph G is the collection of all automorphisms 
of G, with functional composition as the group operation. 


Facts: 

1. Two graphs G and H are isomorphic if there is a bijection f: V_G → V_H such that
for every vertex pair u, v ∈ V_G the number of edges joining u and v equals the number
joining their images f(u), f(v) ∈ V_H.

2. Graph invariants are used to distinguish between nonisomorphic graphs. 

3. Most graph invariants are either too tedious to compute or not strong enough at 
distinguishing similar but nonisomorphic graphs. 

4. No good complete set of invariants is known, in the sense that each invariant value 
is easily computed and easily compared. 

5. Vertex invariants are often used to organize the vertices of a graph into equivalence 
classes under graph automorphism, in order to discover the automorphism group of the 
graph. 

6. Graphs have many different kinds of isomorphism invariants, including the following
invariants: 

• elementary (ascertainable by counting):

number of vertices,
number of edges,
sequence of vertex degrees;

• structural invariants (concerning connectivity or cycles):

cycle rank (§8.4.1),
girth (§8.4.1),
connectivity (§8.4.2),
edge connectivity (§8.4.2);

• topological invariants (concerning placement on surfaces):

genus (§8.8.4),
crosscap number (§8.8.4),
crossing number (§8.7.4),
thickness (§8.7.4);

• chromatic invariants (concerning colorings):

chromatic number (§8.6.1),
edge-chromatic number (§8.6.2),
chromatic polynomial (§8.6.1);

• algebraic invariants (concerning groups or vector spaces):

eigenvalues (§8.10.1),
automorphism group (§8.10.2).


Examples: 

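1. The following figure illustrates an isomorphism f of simple graphs.

[Figure: two simple graphs and an isomorphism f between them.]

2. The following figure illustrates an isomorphism f of nonsimple graphs.

[Figure: two nonsimple graphs; the vertices and edges of the second are labeled with their preimages under f, such as f(1), f(2), f(a), f(c), f(d).]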


3. The following two digraphs are not isomorphic. Even though there are six different 
isomorphisms of their underlying graphs, none of them preserves the direction of all the 
edges. 

4. The following three graph drawings all look different. The table below shows some
of their isomorphism invariants, from which it may be concluded that graph B cannot
be isomorphic to either graph A or graph C, but that A and C might be isomorphic.

                A                   B                   C
#vertices       6                   6                   6
#edges          9                   9                   9
degree seq.     3, 3, 3, 3, 3, 3    3, 3, 3, 3, 3, 3    3, 3, 3, 3, 3, 3
connectivity    3                   3                   3
girth           4                   3                   4
genus           1                   0                   1
chromatic #     2                   3                   2


To construct an isomorphism between graphs A and C, assign labels 0, 1, 2, 3, 4, 5 
cyclically to the vertices of A. Then assign labels 0, 2, and 4 to the top three vertices 
of C, and 1, 3, and 5 to the bottom three. 
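
The comparison in this example can be sketched in Python. Because the drawings are not reproduced here, the three graphs below are assumptions chosen to match the table: A is a hexagon with its three long diagonals, C is the complete bipartite graph with parts {0, 2, 4} and {1, 3, 5}, and B is taken to be the triangular prism. Only the cheaper invariants (counts, degree sequence, girth) are computed.

```python
from collections import deque

def from_edges(n, edges):
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def girth(adj):
    """Length of a shortest cycle of a simple graph, or None if acyclic.
    Runs a breadth-first search from every vertex."""
    best = None
    for root in adj:
        dist, parent = {root: 0}, {root: None}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w], parent[w] = dist[u] + 1, u
                    queue.append(w)
                elif parent[u] != w:
                    # A non-tree edge closes a cycle through the search tree.
                    length = dist[u] + dist[w] + 1
                    if best is None or length < best:
                        best = length
    return best

def invariants(adj):
    return {
        "vertices": len(adj),
        "edges": sum(len(nbrs) for nbrs in adj.values()) // 2,
        "degrees": sorted(len(nbrs) for nbrs in adj.values()),
        "girth": girth(adj),
    }

# Assumed stand-ins for the three drawings (see the lead-in).
A = from_edges(6, [(i, (i + 1) % 6) for i in range(6)] + [(i, i + 3) for i in range(3)])
B = from_edges(6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (0, 3), (1, 4), (2, 5)])
C = from_edges(6, [(u, v) for u in (0, 2, 4) for v in (1, 3, 5)])
```

With the labeling described above, A and C have literally the same adjacency sets, so the identity map on labels is an isomorphism, while the girth alone separates B from both.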


8.5.2 ISOMORPHISM TESTING 

Various concepts for isomorphism testing have been proposed. Some exploit or refine 
algebraic invariants, such as the automorphism group or the set of eigenvalues. Others 
exploit a decomposition into planar subgraphs or a refinement of a topological invari- 
ant, such as the average genus. Others are combinatorial and employ enumerative 
techniques, partitioning and the like. Some are a mixture of algebraic, topological, and 
combinatorial approaches. 

Definitions: 

An isomorphism test for graphs is an algorithm that accepts two graphs as input 
and outputs “yes” to indicate the decision that they are isomorphic or “no” to indicate 
that they are nonisomorphic. Unless the context explicitly mentions the possibility of 
error, it is implicitly understood that the decision is correct. 

The eigenvalues of a graph are the eigenvalues of its adjacency matrix. (See §11.9 
and §6.5.) 

The average genus of a graph is the average genus of the imbedding surface, taken 
over all cellular imbeddings of that graph. (See §8.8.3.) 

An equitable partition for a graph G is a partition V_1, ..., V_n of its vertex set and
a set of numbers {d_ij | 1 ≤ i, j ≤ n} such that every vertex in V_i is adjacent to
exactly d_ij vertices in V_j.

A devil’s pair for an isomorphism-testing approach is a pair of nonisomorphic graphs 
that the approach fails to distinguish. 

A probabilistic isomorphism test is an isomorphism test such that no matter what 
pair of graphs is supplied as input, there is probability 1.0 of a correct decision. 

Facts:

1. No polynomial-time isomorphism testing algorithm is known. Moreover, it is not 
known whether isomorphism testing is an NP-complete problem. 

2. On an n-vertex simple graph, the naive graph isomorphism-testing algorithm below
has worst-case time O(n^2 · n!).

3. It is easy to design an algorithm that decides correctly with probability 1.0 whether
two randomly selected graphs are isomorphic: they aren’t. With probability 1.0, two
randomly selected graphs will not have the same number of vertices or edges. This
observation explains why the concept of probabilistic isomorphism testing is defined so
that it must be able to decide correctly with probability 1.0 for all pairs, not just for
randomly selected pairs.

4. If it were possible to quickly calculate the size of the automorphism group of a graph,
such an algorithm could be a subprogram of a quick test for isomorphism of graph pairs,
as follows:

if |Aut(G)| ≠ |Aut(H)| then G and H are not isomorphic
else
    if |Aut(G ∪ H)| = 2|Aut(G)|^2 then G and H are isomorphic
    else G and H are not isomorphic

5 . Another algebraic approach to isomorphism testing is based on eigenvalues. A devil’s 
pair for simply comparing eigenvalues appears in Example 4. 

6. One topological approach to isomorphism testing dissects each graph into planar 
components (§8.7) and combines known efficient tests for isomorphism of planar graphs 
with careful study of possible interconnections. 

7 . Another topological approach to isomorphism testing is based on the genus dis- 
tribution (§8.8.3), taken over all cellular imbeddings. Although calculating the genus 
distribution by brute force would be tedious, one can estimate it by random sampling. 
Any pair of trees is a trivial devil’s pair, but trees are easily tested by another isomor- 
phism algorithm. 

8. The best known practical isomorphism algorithm is “NAUTY” (an acronym for No 
AUTomorphisms, Yes?) by B.D. McKay. This backtrack algorithm repeatedly refines 
an initial vertex partition. At each stage of the refinement, a part of size greater than 1 
is broken into two parts, one a single vertex, and the coarsest equitable partition is 
found. The discrete partitions generated in this way correspond to labelings of the 
graph, organized so as to determine the automorphism group. A complicated scheme is 
used to pick one of these labelings as the “canonical” one used for isomorphism testing. 
This algorithm can very quickly check isomorphism for most graphs, although it has no 
good theoretical bound. 

9 . To certify that two graphs are isomorphic, one can give a vertex bijection that 
realizes the isomorphism. Deciding whether a bijection between the vertex sets of two 
graphs is the vertex function of a graph isomorphism can be achieved in polynomial 
time. 

10 . Graph isomorphism is in NP, by the previous fact, but is not known to be in P. 
The computational complexity of the problem of deciding whether or not two graphs 
are isomorphic is unknown. 

11 . If graph isomorphism is NP-complete, then the complexity hierarchy between P 
and NP collapses, which is considered unlikely. 

Algorithm 1: Naive graph isomorphism-testing algorithm.

input: simple graphs G, H

if |V_G| ≠ |V_H| then print “NO” and stop
if |E_G| ≠ |E_H| then print “NO” and stop
for each bijection f: V_G → V_H
    ok := true
    for each pair u, v ∈ V_G
        if u, v adjacent and f(u), f(v) not adjacent then ok := false
        if u, v not adjacent and f(u), f(v) adjacent then ok := false
    if ok then print “YES” and stop
print “NO” and stop


12. The following table shows the best known time bounds for checking isomorphism
in various classes of graphs. Almost all of these bounds have been achieved using an
algebraic approach.

class of graphs (on n vertices)    time bound
graphs                             exp(√(c n log n))
trees                              O(n)
planar graphs                      O(n)
graphs of genus g                  n^{O(g)}
cubic graphs                       O(n^3 log n)
graphs with max degree ≤ d         n^{O(d)}
tournaments                        n^{O(log n)}



13. Algorithm 1 gives a naive method for testing whether or not two graphs are iso- 
morphic. 
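
A minimal Python sketch of the naive test follows. It tries each bijection in turn and answers yes as soon as one preserves adjacency and nonadjacency for every vertex pair; the per-bijection check is the polynomial-time certificate verification of Fact 9. The last two helpers illustrate the automorphism-count test of Fact 4, under the added assumption that both graphs are connected. Graphs are dictionaries mapping each vertex to its neighbor set, and all names are illustrative.

```python
from itertools import permutations

def edge_count(adj):
    return sum(len(nbrs) for nbrs in adj.values()) // 2

def is_isomorphism(f, g_adj, h_adj):
    """Check that the vertex bijection f preserves adjacency and nonadjacency."""
    vs = list(g_adj)
    return all(
        (v in g_adj[u]) == (f[v] in h_adj[f[u]])
        for i, u in enumerate(vs)
        for v in vs[i + 1:]
    )

def find_isomorphism(g_adj, h_adj):
    """Return some isomorphism as a dict, or None. Worst case O(n^2 * n!)."""
    if len(g_adj) != len(h_adj) or edge_count(g_adj) != edge_count(h_adj):
        return None
    g_vertices = list(g_adj)
    for image in permutations(h_adj):
        f = dict(zip(g_vertices, image))
        if is_isomorphism(f, g_adj, h_adj):
            return f
    return None  # every bijection failed

def automorphism_count(adj):
    """|Aut(G)| by brute force over all vertex permutations."""
    vs = list(adj)
    return sum(is_isomorphism(dict(zip(vs, p)), adj, adj) for p in permutations(vs))

def disjoint_union(g_adj, h_adj):
    """Disjoint union; vertices are tagged so that the two copies cannot collide."""
    union = {("G", v): {("G", w) for w in nbrs} for v, nbrs in g_adj.items()}
    union.update({("H", v): {("H", w) for w in nbrs} for v, nbrs in h_adj.items()})
    return union
```

For two 3-vertex paths, automorphism_count gives 2 for each path and 8 = 2 · 2^2 for their disjoint union, in agreement with Fact 4.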


Examples: 


1. The labeling of these two isomorphic graphs indicates the correspondence between 
vertices. This is the famous Petersen graph. 

3. Another devil’s pair for isomorphism testing by degree sequence is



4. The following graphs are a devil’s pair for isomorphism testing by simple comparison
of eigenvalues. They both have characteristic polynomial λ^6 - 7λ^4 - 4λ^3 + 7λ^2 + 4λ - 1.
Yet they cannot be isomorphic because their degree sequences are different.




5. A 3-connected devil’s pair for simply comparing average genus is shown below.
Both graphs shown have the genus distribution 8, 536, 3416, 1224, that is, 8 imbeddings 
of genus 0, 536 of genus 1, 3416 of genus 2 and 1224 of genus 3. 



8.5.3 GRAPH RECONSTRUCTION 

The question of whether a graph is reconstructible from its subgraphs is one of the most 
beguiling unsolved problems in graph theory. 

Definitions: 

A vertex-deleted subgraph of a graph G is a subgraph G — v obtained by removing 
a single vertex v and all of its incident edges. 

An edge-deleted subgraph of a graph G is a subgraph G — e obtained by removing 
a single edge e. 

The vertex-deleted subgraph collection of a graph G is the multi-set of all vertex-
deleted subgraphs G — v. The number of times a graph appears in the collection equals 
the number of different vertices whose removal yields that graph. Thus, the cardinality 
of the collection equals the number of vertices of G. 

The edge-deleted subgraph collection of a graph G is the multi-set of all edge- 
deleted subgraphs G — e. The number of times a graph appears in the collection equals 
the number of different edges whose removal yields that graph. Thus, the cardinality of 
the collection equals the number of edges of G. 

A reconstructible graph is a graph G such that no other graph has the same vertex- 
deleted subgraph collection as G. 

An edge-reconstructible graph is a graph G such that no other graph has the same 
edge-deleted subgraph collection as G. 

A reconstructible invariant is a graph invariant such that all graphs with the same 
vertex-deleted subgraph collection have the same value with respect to this invariant. 

An edge-reconstructible invariant is a graph invariant such that all graphs with 
the same edge-deleted subgraph collection have the same value with respect to this 
invariant. 

Conjectures:

1. The graph reconstruction conjecture (P. Kelly and S. Ulam, 1941): Every graph
with more than two vertices is reconstructible.

2. The edge reconstruction conjecture: Every graph with at least four edges is
edge-reconstructible.

3. Halin’s conjecture: If two (possibly infinite) graphs with more than two vertices 
have the same vertex-deleted subgraph collection, then each graph is a subgraph of the 
other. 


Facts: 

1. The graph reconstruction conjecture implies the edge reconstruction conjecture, and 
both are implied by Halin’s conjecture. 

2. The graph reconstruction conjecture does not hold for graphs on two vertices, because
K_2 and its edge-complement (two isolated vertices) have identical collections of
vertex-deleted subgraphs.

3. The edge reconstruction conjecture does not hold for graphs with three edges, because
K_3 + K_1 (disjoint union) and K_{1,3} have identical collections of edge-deleted subgraphs.

4. Computer search has verified the reconstruction conjecture for graphs with nine or 
fewer vertices. 

5. The following table lists some invariants and types of graphs which are known to be
reconstructible.

both edge-reconstructible and reconstructible       other edge-reconstructible
invariants                    graphs                graphs
number of vertices            regular               more edges than non-edges
number of edges               disconnected          only two vertex degrees
degree sequence               trees                 no induced K_{1,3} subgraph
connectivity                  outerplanar           large with Hamilton path
characteristic polynomial     cacti                 2 log_2(2 max deg) ≤ avg deg



6 . If graph F has fewer vertices than graph G then the number of subgraphs of G 
isomorphic to F is reconstructible. 

7. The reconstruction conjecture is not true for directed graphs in general, because 
nonreconstructible tournaments of arbitrarily large size are known. 

8 . Infinite graphs are not reconstructible in general, but Halin’s conjecture holds for 
all known nonreconstructible infinite pairs. 

9. Almost every graph is uniquely determined by any three vertex-deleted subgraphs. 

Example: 

1. The following figure shows a graph (at the left) and its collection of vertex-deleted 
subgraphs. 
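
The vertex-deleted subgraph collection can be computed for small graphs as follows. This Python sketch makes several assumptions of its own: a graph is given as a vertex list and a list of edges (pairs of distinct vertices), and each subgraph is replaced by a brute-force canonical form, so that isomorphic subgraphs compare equal. The canonical form is exponential in the number of vertices and is meant only for illustration.

```python
from itertools import permutations

def canonical_form(vertices, edges):
    """Lexicographically least sorted edge list over all relabelings of the
    vertices by 0..n-1, paired with n. Isomorphic graphs get equal forms."""
    vertices = list(vertices)
    best = None
    for perm in permutations(range(len(vertices))):
        label = dict(zip(vertices, perm))
        relabeled = tuple(sorted(
            tuple(sorted((label[u], label[v]))) for u, v in edges
        ))
        if best is None or relabeled < best:
            best = relabeled
    return (len(vertices), best)

def deck(vertices, edges):
    """Vertex-deleted subgraph collection, as a sorted list (a multiset)
    of canonical forms, one per deleted vertex."""
    cards = []
    for x in vertices:
        rest = [v for v in vertices if v != x]
        kept = [(u, v) for u, v in edges if x not in (u, v)]
        cards.append(canonical_form(rest, kept))
    return sorted(cards)
```

Applied to K_2 and to two isolated vertices, deck returns the same collection (two copies of the one-vertex graph) although the graphs themselves are not isomorphic, which is the two-vertex exception noted in Fact 2.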

8.6 GRAPH AND MAP COLORING


The vertex set of a simple graph can be colored so that adjacent vertices are colored 
differently. Similarly, the edges of a graph without self-loops can be colored so that 
adjacent edges are colored differently. If a graph is imbedded in a surface so that there 
are no self-adjacent regions, then the regions can be colored so that adjacent regions 
receive different colors. These entertaining concepts have many important applications, 
including assignment and scheduling problems. 


8.6.1 VERTEX COLORINGS 
Definitions: 

A (proper) vertex k-coloring (or k-coloring) of a simple graph G is a function
f: V_G → {1, ..., k} such that adjacent vertices are assigned different numbers. Quite
often, the set {1, ..., k} is regarded as a set of colors.

A coloring of a graph is a k-coloring for some integer k.

An improper coloring of a graph permits two adjacent vertices to be colored the 
same. 

A graph is k-vertex colorable (or k-colorable) if it has a vertex k-coloring.

The vertex chromatic number (or chromatic number) χ(G) (or χ_V(G)) of a
graph G is the minimum number k such that G is k-vertex colorable; that is, χ(G)
is the smallest number of colors needed to color the vertices of G so that no adjacent
vertices have the same color.

A graph G is k-chromatic if χ(G) = k.

A graph G is chromatically k-critical if G is k-chromatic and if χ(G - e) = k - 1 for
each edge e of G.

An obstruction to (or for) k-coloring is a chromatically (k + 1)-critical graph, when
that graph is regarded as a subgraph of other graphs, and thereby prevents them from
having chromatic number k.

A (complete) obstruction set for k-coloring is a set of chromatically (k + 1)-critical
graphs such that every graph that is not k-colorable contains at least one of them as a
subgraph.

An elementary contraction of a simple graph G on the edge e, denoted G | e (or 
G • e), is obtained by replacing the edge e and its two endpoints by one vertex adjacent 
to all the other vertices to which the endpoints were adjacent. 

A graph G is (combinatorially) contractible to a subgraph H if H can be obtained
from G by a sequence of elementary contractions. (The modifier “combinatorially”
distinguishes this kind of contractibility used for graph colorings from topological
contractibility.)

The chromatic polynomial of the graph G is the function π_G(t) whose value at the
integer t is the number of different functions V_G → {1, ..., t} that are proper colorings
of G.

Algorithm 1: Greedy coloring algorithm.

input: a graph G with vertex list v_1, v_2, ..., v_n

c := 0 {Initialize color at “color 0”.}
while some vertex still has no color
    c := c + 1 {Get the next unused color.}
    for i := 1 to n {Assign the new color to as many vertices as possible.}
        if v_i is uncolored and no neighbor of v_i has color c then assign color c to v_i
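
The greedy coloring algorithm translates almost line for line into Python. The sketch below assumes a simple graph given as a dictionary from each vertex to its neighbor set, together with an explicit vertex ordering; the names are illustrative.

```python
def greedy_coloring(adj, order):
    """Fill one color class at a time, scanning the vertices in the given order."""
    color = {}
    c = 0
    while len(color) < len(order):
        c += 1  # next unused color
        for v in order:
            if v not in color and all(color.get(w) != c for w in adj[v]):
                color[v] = c
    return color

def is_proper(adj, color):
    return all(color[u] != color[v] for u in adj for v in adj[u])

# The path a - b - c - d, which has chromatic number 2.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

On this path the ordering a, b, c, d uses two colors, whereas the ordering a, d, b, c uses three: both ends receive color 1 first, which forces the two middle vertices into different new colors.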


Facts: 

1 . A direct way to calculate the chromatic number of a reasonably small graph is in two 
steps. First derive an upper bound for the number of colors needed, either by finding a 
coloring by trial and error or by using the greedy coloring algorithm. Then prove that 
one fewer colors would be insufficient. This could be achieved by an exponential-time 
exhaustion algorithm, or by finding an insightful proof for the particular graph. 

2. Alternative notation: In a topological context, where χ(S) means Euler characteristic
(§8.8), cr(G) can be used for the chromatic number of a graph.

3 . Unlike a topological contraction along an edge, this operation of “elementary con- 
traction” of two vertices of a simple graph always yields a simple graph. 

4. χ(G) = 1 if and only if the graph G is edgeless.

5. χ(G) = 2 if and only if the graph is bipartite and its edgeset is nonempty.

6. The four color theorem (Appel and Haken, 1976): If G is planar, then χ(G) ≤ 4.
That is, every planar graph has a proper coloring of its vertices with 4 or fewer colors.

7. {K_2} is a complete obstruction set for 1-coloring.

8. The set of odd cycles is a complete obstruction set to 2-coloring.

9. The odd wheels W_{2n+1}, n ≥ 1, are obstructions to 3-coloring.

10. Brooks’ theorem: If G is a connected graph which is neither an odd cycle nor a
complete graph, then χ(G) ≤ Δ_max(G), where Δ_max denotes maximum degree. (The
subscript “max” is often omitted.)

11. χ(G) ≤ 1 + Δ_max(G).

12. χ(G) ≤ 1 + max δ_min(G'), where δ_min denotes minimum degree, and where the
maximum is taken over all induced subgraphs G' of G.

13. χ(G) ≤ 1 + diam(G), where diam(G) here denotes the length (number of edges) of a longest path in G.

14. Hadwiger’s conjecture: If G is a connected graph with χ(G) = n, then G is
contractible to K_n; it is known to be true for n ≤ 5.

15. Nordhaus-Gaddum inequalities: If G is a graph with |V(G)| = p and Ḡ is its
edge-complement, then

• 2√p ≤ χ(G) + χ(Ḡ) ≤ p + 1;

• p ≤ χ(G) · χ(Ḡ) ≤ ((p + 1)/2)^2.

16 . The greedy coloring algorithm (Algorithm 1) produces a vertex coloring of a graph 
G, whose vertices are ordered. (It is called “greedy” because once a color is assigned, it 
is never changed.) The number of colors it assigns depends on the vertex ordering, and 
it is not necessarily the minimum possible. 

17. At least one ordering of the vertices of a graph G yields χ(G) under the greedy
algorithm.

18. The number of colors used by the greedy coloring algorithm depends on the ordering
in which the vertices of G are listed. At least one of the orderings of the vertices
of G yields χ(G).

19. The number of colors used by the greedy coloring algorithm can exceed χ(G) by
an arbitrarily large number.

20. There is no known polynomial-time algorithm for finding χ(G) exactly. Deciding
whether a graph has a particular chromatic number is NP-complete, if that number is 
at least 3. 

21. The following table gives the chromatic numbers and edge-chromatic numbers
(§8.6.2) of the graphs in some common families.

graph G                                      χ(G)    χ_1(G)
path graph P_n, n ≥ 3                        2       2
cycle graph C_n, n even, n ≥ 2               2       2
cycle graph C_n, n odd, n ≥ 3                3       3
wheel W_n, n even, n ≥ 4                     3       n
wheel W_n, n odd, n ≥ 3                      4       n
complete graph K_n, n even, n ≥ 2            n       n - 1
complete graph K_n, n odd, n ≥ 3             n       n
complete bipartite graph K_{m,n}, m, n ≥ 1   2       max{m, n}
bipartite G, at least one edge               2       Δ_max(G)
Petersen graph                               3       4
complete k-partite, m_i ≥ 1                  k       max{m_1, ..., m_k}



22. For every edge e of a simple graph G, π_G(t) = π_{G-e}(t) - π_{G·e}(t).

23. The chromatic polynomial π_G(t) of a graph with n ≥ 1 vertices and m edges is a
polynomial in t of degree n, whose leading term is t^n, whose next term is -m t^{n-1}, and
whose constant term is 0.

24. The following table gives the chromatic polynomials of some graphs.

graph                 π_G(t)
n-vertex tree         t(t - 1)^{n-1}
cycle graph C_n       (t - 1)^n + (-1)^n (t - 1)
wheel W_n             t(t - 2)^n + (-1)^n t(t - 2)
complete graph K_n    t^{\underline{n}} = t(t - 1)(t - 2) ... (t - n + 1)
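
The deletion-contraction recurrence of Fact 22 gives a direct, though exponential-time, way to evaluate a chromatic polynomial at an integer t. The following Python sketch assumes a simple graph given as a vertex collection and a list of edges (pairs of distinct vertices); storing edges as a set makes the parallel edges created by a contraction collapse, as an elementary contraction requires. The function name is illustrative.

```python
def chromatic_polynomial_value(vertices, edges, t):
    """Number of proper colorings with colors 1..t, by deletion-contraction."""
    vertices = frozenset(vertices)
    edges = frozenset(frozenset(e) for e in edges)
    if not edges:
        # An edgeless graph on n vertices has t^n colorings.
        return t ** len(vertices)
    e = next(iter(edges))
    u, v = tuple(e)
    deleted = edges - {e}
    # Elementary contraction on e: merge v into u.
    contracted = frozenset(
        frozenset(u if x == v else x for x in f) for f in deleted
    )
    return (chromatic_polynomial_value(vertices, deleted, t)
            - chromatic_polynomial_value(vertices - {v}, contracted, t))

def cycle_edges(n):
    return [(i, (i + 1) % n) for i in range(n)]
```

For example, the 5-cycle at t = 3 gives 30, which agrees with the closed form (t - 1)^n + (-1)^n (t - 1) = 32 - 2.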


Examples: 

1. Time scheduling: Let classes at a school be modeled by the vertices of a simple
graph G, with two vertices adjacent if and only if there is at least one student in both
of the corresponding classes. Then χ(G) gives the minimum number of time periods for
scheduling the classes so as to accommodate all the students.

2. Assignment of radio frequencies: If the vertices of a graph G represent radio stations,
with two stations adjacent precisely when their broadcast areas overlap, then χ(G)
determines the minimum number of transmission frequencies required to avoid broadcast
interference.

3 . Separating combustible chemical combinations: Let the vertices of graph G repre- 
sent different kinds of chemicals needed in some manufacturing process. An edge joins 
each pair of chemicals that might explode if they are combined. The chromatic number 
of this graph is the number of different storage areas required so that no two chemicals 
that mix explosively are stored together. 

4. Proceeding in the direct way, as described in Fact 1, to color the graph in the
following figure quickly yields its chromatic number. Applying the greedy coloring
algorithm, with the vertices considered in cyclic order around the 8-cycle, yields a
3-coloring. Since this graph contains an odd cycle (a 5-cycle), it cannot be 2-colored.
Thus, χ = 3.



5. In the following figure vertex colorings are indicated for the cycle graphs C_3, C_4,
and C_5; in each case three colors are used. Note that χ(C_3) = χ(C_5) = 3, whereas
χ(C_4) = 2 (since the vertex colored “3” could have been colored “1”).



6 . The following figure shows three chromatically 4-critical graphs. 



7 . A 3-coloring of graph A in the figure of Example 6 would necessarily give some color 
to three different vertices. Two of these vertices would have to be adjacent (because 
the edge-complement contains no 3-cycle). Thus, a 3-coloring could not be proper, and 
hence χ = 4.

8. A 3-coloring of graph B in the figure of Example 6 would need three different colors 
on the outer 5-cycle. These would force the use of three different colors on the points 
of the central 5-star. This would force the use of a fourth color on the central vertex. 
Thus, χ = 4.

8.6.2 EDGE COLORINGS


Definitions: 

An edge coloring of a graph is an assignment of colors to its edges such that adjacent 
edges receive different colors. 

A graph G is k-edge colorable if there is an edge coloring of G using at most k colors. 

The edge chromatic number χ_1(G) (or χ_E(G)) of a graph G is the minimum k such
that G is k-edge colorable. If χ_1(G) = k, then G is edge k-chromatic.

Chromatic index is a synonym for edge chromatic number.

A graph is edge-chromatically k-critical if it is edge k-chromatic and χ_1(G - e) =
χ_1(G) - 1 for every edge e of G.

For a graph G, the line graph L(G) has as vertices the edges of G, with two vertices
adjacent in L(G) if and only if the corresponding edges are adjacent in G.

Facts: 

1. Alternative notation: In a topological context where χ is used for Euler characteristic,
the notation for edge-chromatic number is often ecr(G).

2. Every edge coloring of a graph G can be interpreted as a vertex coloring of the
associated line graph L(G). Thus, χ_1(G) = χ(L(G)).

3. Δ_max(G) ≤ χ_1(G).

4. Vizing’s theorem: If G is a simple graph, then χ_1(G) ≤ Δ_max(G) + 1.

5. Vizing’s general theorem: If G is a general graph whose maximum edge multiplicity
is μ, then χ_1(G) ≤ Δ_max(G) + μ.

6. For a simple graph G, either χ_1(G) = Δ_max(G) (G is of class one) or
χ_1(G) = Δ_max(G) + 1 (G is of class two).

7. χ_1(K_{m,n}) = χ(L(K_{m,n})) = χ(K_m × K_n) = max{m, n}, if m, n ≥ 1.

8. If G is bipartite, then χ_1(G) = Δ_max(G).

9. χ_1(K_n) = n if n is odd (n ≠ 1); χ_1(K_n) = n - 1 if n is even.

10. If G is planar and Δ_max(G) ≥ 8, then χ_1(G) = Δ_max(G).

11. If G is 3-regular and Hamiltonian, then χ_1(G) = Δ_max(G).

12. If G is regular with |V_G| odd and |E_G| > 0, then χ_1(G) = Δ_max(G) + 1.

13. The greedy edge-coloring algorithm (Algorithm 2) produces an edge-coloring of a
graph G, whose edges are ordered. The number of colors it assigns depends on the
edge ordering, and it is not necessarily the minimum possible. (It is equivalent to
applying the greedy vertex-coloring algorithm to the line graph.)

Examples: 

1. The following three graphs are all edge 3-chromatic. None of them is edge-chromati- 
cally 3-critical. Since each graph has a vertex of degree three, no 2-edge-coloring is 
possible. 

Algorithm 2: Greedy edge-coloring algorithm.

input: a graph G with edge list e_1, e_2, ..., e_n

c := 0 {Initialize color at “color 0”.}
while some edge still has no color
    c := c + 1 {Get the next unused color.}
    for i := 1 to n {Assign the new color to as many edges as possible.}
        if e_i is uncolored and no neighbor of e_i has color c then assign color c to e_i
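
A Python sketch of the greedy edge-coloring algorithm is given below. It assumes the graph is supplied as an ordered list of edges, each a pair of distinct endpoints, and treats two edges as neighbors when they share an endpoint. Edges are tracked by list position, so repeated pairs (multiple edges) are handled as well; the names are illustrative.

```python
def greedy_edge_coloring(edges):
    """Return a list of colors, one per edge, in the order of the input list."""
    color = {}
    c = 0
    while len(color) < len(edges):
        c += 1  # next unused color
        for i, e in enumerate(edges):
            if i in color:
                continue
            # A clash is a different edge already colored c that shares an endpoint.
            clash = any(
                color.get(j) == c and set(e) & set(f)
                for j, f in enumerate(edges)
                if j != i
            )
            if not clash:
                color[i] = c
    return [color[i] for i in range(len(edges))]

def is_proper_edge_coloring(edges, colors):
    return all(
        colors[i] != colors[j] or not (set(edges[i]) & set(edges[j]))
        for i in range(len(edges))
        for j in range(i + 1, len(edges))
    )

# The path a - b - c - d - e - f, whose edge chromatic number is 2.
path = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "f")]
```

Listing the path's edges in their natural order yields two colors, while the order ab, de, bc, cd, ef yields three, which illustrates the dependence on the ordering noted in Fact 13.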


2. The following graph is 5-edge-chromatic. Since there are 14 edges, a 4-edge-coloring 
would have to give the same color to four of them. For this edge-coloring to be proper, 
these four edges would have to have no endpoints in common. That is impossible, 
because the graph has only seven vertices. 



3. The Petersen graph is edge-chromatically 4-critical. 

4. Exam scheduling: Suppose that each student at a university is to be examined 
orally by each of his or her professors at the end of the term. Then the minimum 
number of examination periods required is the edge chromatic number of the bipartite 
graph with vertices representing students and professors, and edges connecting students 
with their professors. 

5. Wiring electrical network boards: A number of relays, switches, and other electronic
devices $D_1, D_2, \ldots, D_n$ on a relay panel are to be connected into a network.
The connecting wires are twisted into a cable, with those connected to $D_1$ emerging at
one point, those connected to $D_2$ at another, and so forth. The wires emerging from
the same point must be colored differently, so that they can be distinguished. The
least number of colors required to color the wires is the edge chromatic number of the
associated network.

6. The following nonsimple graph illustrates Vizing’s general theorem. Its highest edge 
multiplicity is 3, its maximum degree is 6, and its edge chromatic number is 9. 

8.6.3 CLIQUES AND INDEPENDENCE


Definitions: 

A clique of a graph G is a complete subgraph of G which is contained in no larger 
complete subgraph of G. 

The clique number $\omega(G)$ of a graph G is the order (i.e., number of vertices) of a
largest clique of G.

A subset W of V(G) (or D of E(G)) is independent if no two elements of W (respectively
D) are adjacent.

The vertex independence number $\alpha(G)$ of G is the order of a largest independent
set of vertices in G.

The edge independence number $\alpha_1(G)$ of a graph G is the size of a largest
independent set of edges in G.

A graph G is perfect if, for every induced subgraph H of G, the chromatic number
equals the clique number, that is, $\chi(H) = \omega(H)$.

A graph G is weakly $\chi$-perfect if $\chi(G) = \omega(G)$.


Facts: 

1. The independence number of a graph is equal to the clique number of its
edge-complement, and vice versa. That is, $\alpha(G) = \omega(\overline{G})$ and $\omega(G) = \alpha(\overline{G})$.

2. The chromatic number of a graph is at least as large as the clique number:
$\chi(G) \ge \omega(G)$.

3. For each positive integer n, there is a graph G with chromatic number n and clique 
number equal to 2; that is, G contains no triangles. 

4. If no induced subgraph of a graph is isomorphic to P4, then its chromatic number 
equals its clique number and the greedy algorithm (§8.6.1 Algorithm 1) always produces 
a coloring with the minimum number of colors. 

5. Lovász’s perfect graph theorem: A graph G is perfect if and only if its
edge-complement $\overline{G}$ is perfect.

6. $\dfrac{|V(G)|}{\alpha(G)} \le \chi(G) \le |V(G)| + 1 - \alpha(G)$.

7. If $|E(G)| > \Delta_{\max}(G) \cdot \alpha_1(G)$, then $\chi_1(G) = \Delta_{\max}(G) + 1$.
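
For small graphs, Facts 1, 2, and 6 can be checked by exhaustive search. The sketch below is only an illustration: the function names and the encoding of a graph as a vertex count `n` (vertices `0..n-1`) plus an edge list are assumptions made here, and the search is exponential, so it is practical only for a handful of vertices.

```python
from itertools import combinations, product


def independence_number(n, edges):
    """Order of a largest set of pairwise nonadjacent vertices."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(n, 0, -1):
        for subset in combinations(range(n), size):
            if all(frozenset(p) not in edge_set
                   for p in combinations(subset, 2)):
                return size
    return 0


def complement(n, edges):
    """Edge list of the edge-complement on the same vertex set."""
    edge_set = {frozenset(e) for e in edges}
    return [p for p in combinations(range(n), 2)
            if frozenset(p) not in edge_set]


def clique_number(n, edges):
    """Fact 1: the clique number is the independence number of the complement."""
    return independence_number(n, complement(n, edges))


def chromatic_number(n, edges):
    """Smallest k admitting a proper vertex coloring with k colors."""
    for k in range(1, n + 1):
        for coloring in product(range(k), repeat=n):
            if all(coloring[u] != coloring[v] for u, v in edges):
                return k
    return 0


# The 5-cycle C_5 as a test case.
n, c5 = 5, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
alpha = independence_number(n, c5)
omega = clique_number(n, c5)
chi = chromatic_number(n, c5)
print(alpha, omega, chi)
print(n / alpha <= chi <= n + 1 - alpha)   # the two bounds of Fact 6
```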


Examples: 


1. The following graph has three cliques — of sizes 2, 3, and 4. Thus, its clique number 
is 4. 



2. If $1 \le m \le n$, then $\omega(K_{m,n}) = 2$, $\alpha(K_{m,n}) = n$, and $\alpha_1(K_{m,n}) = m$.

3. Define $K_{n(m)}$ to be the graph whose edge complement is $nK_m$, the disjoint union
of n copies of $K_m$. Then $\omega(K_{n(m)}) = n$, $\alpha(K_{n(m)}) = m$, and $\alpha_1(K_{n(m)}) =$

8.6.4 MAP COLORINGS


Definitions: 

An orientable surface S is a surface homeomorphic to a sphere with $g \ge 0$ handles
attached and is denoted by $S_g$.

A nonorientable surface S is a surface homeomorphic to a sphere with $k \ge 1$ crosscaps
attached and is denoted by $N_k$. (See §8.8.1.)

The Euler characteristic of a surface S is $\chi(S)$, where

$$\chi(S) = \begin{cases} 2 - 2g & \text{if } S \text{ is homeomorphic to } S_g \\ 2 - k & \text{if } S \text{ is homeomorphic to } N_k. \end{cases}$$

The most usual notation for Euler characteristic throughout mathematics is $\chi(S)$.
However, ad hoc notation such as eu(S) is sometimes used in chromatic graph theory.

A map on a surface is an imbedding of a graph on that surface. (See §8.7.) 

A map coloring is an assignment of colors to the regions of a map so that adjacent 
regions (those sharing a one-dimensional boundary portion) receive different colors. 

A map M is n-colorable if there is a map coloring of M using at most n colors. 

The chromatic number $\chi(M)$ (or cr(M) or xr(M)) of a map M is the minimum n
such that M is n-colorable.

The chromatic number $\chi(S)$ (or cr(S)) of a surface S is the largest chromatic
number $\chi(M)$ for all maps M on S.

The (empire) chromatic number $\chi(S, c)$ for a surface S is the largest $\chi(M)$ for all
maps M on S, where now a country has at most $c \ge 1$ components (regions) and all
components of a fixed country are colored alike, but adjacent components of different
countries must receive different colors. (Thus $\chi(S) = \chi(S, 1)$.)


Facts: 


1. A region coloring can be regarded as a vertex coloring of the dual graph (§8.8.2).
From this perspective, $\chi(S)$ is the largest value of $\chi(G)$ for all graphs G imbeddable on S.

2. By stereographic projection (§8.7.5), $\chi(S_0)$ gives the chromatic number of the
plane.

3. Let G be a planar cubic block; then $\chi_1(G) = 3$.

4. Let M be a plane map whose graph G is connected and bridgeless. Then $\chi(M) = 2$
if and only if G is Eulerian.

5. Let M be a plane map for a cubic connected bridgeless graph G; then $\chi(M) = 3$ if
and only if the dual graph is Eulerian.

6. If G is a plane graph without triangles, then $\chi(G) \le 3$. (Grötzsch, 1958)

7. The four color theorem (Appel and Haken, 1976): $\chi(S_0) = 4$. That is, every map
on a sphere or plane can be colored with 4 or fewer colors.

8. The Heawood map coloring theorem (Ringel and Youngs, 1968): For $g > 0$,

$$\chi(S_g) = \left\lfloor \frac{7 + \sqrt{1 + 48g}}{2} \right\rfloor.$$

9. The nonorientable Heawood map coloring theorem (Ringel, 1954): For $k > 0$,

$$\chi(N_k) = \left\lfloor \frac{7 + \sqrt{1 + 24k}}{2} \right\rfloor,$$

except that $\chi(N_2) = 6$.

10. $\chi(S, c) \le \left\lfloor \dfrac{6c + 1 + \sqrt{(6c+1)^2 - 24\,eu(S)}}{2} \right\rfloor$.

11. $\chi(S_0, c) = 6c$ for $c \ge 2$.

12. $\chi(N_1, c) = 6c$ for $c \ge 1$.

13. $\chi(S_1, c) = 6c + 1$ for $c \ge 1$.

14 . History of the four color problem: In 1852, Francis Guthrie first asked whether 
four colors suffice to color every planar map, and his brother Frederick communicated 
the question to Augustus De Morgan. Arthur Cayley in 1878 was first to mention the 
problem in print. In 1879, A. B. Kempe, a London barrister, published a “proof” of
the four color conjecture: every planar map is 4-colorable. In 1890 Percy Heawood 
(1861-1955) found an error in Kempe’s argument. A correct proof was established by 
Kenneth Appel and Wolfgang Haken in 1977. 


15. Concepts in the Haken-Appel proof of the four color theorem: Appel and Haken
found an “unavoidable” set with 1476 graphs, which means that at least one of these 
graphs must be a subgraph of any minimum counterexample to the four color conjecture. 
A method called “discharging”, due to Heinrich Heesch, is used to find an unavoidable 
set. Using a computer, they proved that each of these graphs is “reducible”, which 
means that it cannot be a subgraph of a minimum counterexample. 

16. A simplified proof of the four color theorem can be found in [Th98] or at the
following Web site:

http://www.math.gatech.edu/~thomas/FC/ftpinfo.html


17. History of the Heawood map coloring problem: In 1890, Percy Heawood
established the upper bound

$$\chi(M) \le \left\lfloor \frac{7 + \sqrt{49 - 24\,eu(S)}}{2} \right\rfloor$$

for $\chi(M)$ for all maps M on all closed surfaces other than $S_0$. Heawood showed that
his bound was exact for the torus, by the example of the dual of $K_7$ on $S_1$, and he
asserted without proof that similar “verification figures” existed for all other cases. In
1934, Philip Franklin showed that Heawood’s assertion was wrong for $N_2$ (the “Klein
bottle”). For all other nonorientable surfaces, Gerhard Ringel provided the necessary
figures in 1952. In 1968, Ringel and J.W.T. Youngs completed the verification for all
orientable surfaces other than the sphere.
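
The Heawood-type bounds in Facts 8 through 13 are easy to evaluate exactly with integer arithmetic. The following sketch is illustrative only: the function names are chosen here, and the empire form of the bound (with $6c + 1$ in place of 7) is the expression of Fact 10, which reduces to Heawood's bound when $c = 1$. Because the numerator's non-radical part is an integer, taking the integer square root before halving gives the same floor.

```python
from math import isqrt


def euler_char_orientable(g):
    """Euler characteristic of S_g."""
    return 2 - 2 * g


def euler_char_nonorientable(k):
    """Euler characteristic of N_k."""
    return 2 - k


def heawood_bound(euler_char, c=1):
    """floor((6c + 1 + sqrt((6c + 1)^2 - 24 * eu(S))) / 2), computed exactly."""
    a = 6 * c + 1
    return (a + isqrt(a * a - 24 * euler_char)) // 2


print(heawood_bound(euler_char_orientable(1)))       # torus
print(heawood_bound(euler_char_nonorientable(1)))    # projective plane
print(heawood_bound(euler_char_nonorientable(2)))    # Klein bottle: bound only
print(heawood_bound(euler_char_orientable(0), c=2))  # two-component empires
```

For the Klein bottle the function returns the bound 7, whereas the true chromatic number is 6; this is the one exception recorded in Fact 9.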


Examples: 

1. Let M be the tetrahedral map, i.e., an imbedding of $K_4$ in $S_0$. By Fact 4, $\chi(M) \ne 2$,
since $K_4$ is not Eulerian. By Fact 5, $\chi(M) \ne 3$, since the dual graph (isomorphic to $K_4$)
is also not Eulerian. Thus, $\chi(M) = 4 = \chi(S_0)$.

2. Cartography: If countries on Earth are allowed two components, but no more, then 
by Fact 11 a map might require twelve colors, but no more. 

3. By Fact 8, $\chi(S_1) = 7$. The dual map of the following figure imbeds $K_7$ in the torus.
To obtain the torus, paste the left side of the rectangular sheet directly to the right 
side, and then paste the top to the bottom with a | twist. 



8.6.5 GRAPH MULTICOLORING 

This subsection deals with the proper coloring of the vertices of a graph where each 
vertex is assigned more than one color (or label). Multicoloring of graphs has many 
applications to assignment and scheduling problems. See [MiRo91]. 

Definitions: 

A (proper) k-tuple coloring (or k-multicoloring) of a graph is an assignment of
a set of k distinct colors to each vertex of a graph so that whenever two vertices are
adjacent, their sets of assigned colors are disjoint.

A (proper) multicoloring of a graph is a k-multicoloring for some k.

The k-tuple chromatic number $\chi_k(G)$ of a graph G is the smallest number of colors
such that G has a k-tuple coloring.

A ( proper ) set coloring of a graph is an assignment of a set of colors to each vertex 
of G such that whenever two vertices are adjacent, the sets of colors assigned to the two 
vertices are disjoint. Note : The sets can have different sizes. 

Facts: 

1. $\chi_k(G) \le k\,\chi(G)$.

2. If the clique number $\omega(G)$ (§8.6.3) of G is equal to $\chi(G)$ (i.e., G is weakly $\chi$-perfect),
then $\chi_k(G) = k\,\chi(G)$.

3. Set colorings generalize multicolorings since the sets of assigned colors in a set
coloring can have different sizes.

4. Every set coloring of G where all sets assigned to the vertices are k-sets is a
k-tuple coloring of G.

5. $\chi_k(K_n) = nk$.

6. If G is bipartite (with at least one edge), then $\chi_k(G) = 2k$.

Examples: 

1. Multiple channel assignment: Several cities each need to have four broadcast
frequencies assigned to them (a generalization of §8.6.1, Example 7). The 4-tuple
chromatic number $\chi_4(G)$ is the minimum number of frequencies needed so that there is no
broadcast interference.

2. $\chi_2(C_5) = 5$, as illustrated in the following figure.



3. Exam scheduling: Each final exam at a school is given in two parts, with each
part requiring one final exam period. If a graph G is constructed by using the courses
as vertices, with an edge joining v and w if there is a student taking courses v and w,
then $\chi_2(G)$ gives the minimum number of periods required to schedule all the exams so
no student has a conflict.

4 . Suppose that in the previous example the number of periods required for the final 
exams in courses varies. The problem of scheduling the final exams in the fewest number 
of periods is a set coloring problem. 
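
The k-tuple chromatic number of a very small graph can be found by exhaustive search, which gives a direct check of Facts 5 and 6 and of Example 2. The sketch below is an illustration under stated assumptions: the function name, the `(n, edges)` graph encoding with vertices `0..n-1`, and the search cap of `k * n` colors (always sufficient, since giving every vertex its own k colors is a proper k-tuple coloring) are all choices made here. The running time is exponential.

```python
from itertools import combinations, product


def k_tuple_chromatic_number(n, edges, k):
    """Smallest m such that every vertex can receive a k-subset of m colors
    with adjacent vertices receiving disjoint subsets."""
    for m in range(k, k * n + 1):
        subsets = [frozenset(s) for s in combinations(range(m), k)]
        for assignment in product(subsets, repeat=n):
            if all(assignment[u].isdisjoint(assignment[v])
                   for u, v in edges):
                return m
    return None  # reached only when n == 0


c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
k3 = [(0, 1), (1, 2), (0, 2)]
print(k_tuple_chromatic_number(5, c5, 2))   # Example 2
print(k_tuple_chromatic_number(3, k3, 2))   # Fact 5 with n = 3, k = 2
```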


8.7 PLANAR DRAWINGS 


Planarity is an important consideration in physical networks of any kind, because it 
is usually less expensive to produce a planar network. For instance, overpasses are a 
costly feature in highway design. Also, it is less complicated to manufacture a planar 
electrical network than a nonplanar network. 


8.7.1 CHARACTERIZING PLANAR GRAPHS 

A graph cannot be drawn without edge-crossings in the plane if it “contains” either the
graph $K_5$ or the graph $K_{3,3}$. Conversely, every graph that “contains” neither of those
two graphs can be drawn without crossings.

Definitions: 

A graph imbedding (or embedding) is a drawing with no crossings at all. 

A graph is planar if it has an imbedding in the plane. 

A graph is nonplanar if no imbedding in the plane is possible. 

A drawing of a graph is normalized if there are no crossings, or if each crossing is a 
point where the interior of one edge crosses the interior of one other edge. (Edges may 
be drawn either straight or with curves.) 

The graphs $K_5$ and $K_{3,3}$ are called the Kuratowski graphs, after the Polish
mathematician Kazimierz Kuratowski (1896-1980).


© 2000 by CRC Press LLC 



Facts: 


1. The graphs $K_5$ and $K_{3,3}$ are both nonplanar. See Examples 4 and 5 for proofs that
they are not planar.

2. Kuratowski planarity theorem: A graph is planar if and only if it has no subgraph
homeomorphic (§8.1.2) to $K_5$ or to $K_{3,3}$.

Examples: 

1. The drawings of $Q_3$, $K_5$, and $K_{3,3}$ in the following figure all have crossings. However,
the graph $Q_3$ is planar, because it can be redrawn without any crossings.



2. The drawings of $Q_3$ and $K_5$ in the figure of Example 1 are normalized, but the
drawing of $K_{3,3}$ is not normalized, because three lines go through the same point.

3. The Petersen graph (§8.1) does not contain $K_{3,3}$ itself as a subgraph. However, if
the two edges depicted by broken lines in the following figure are discarded, then the
resulting graph is homeomorphic to $K_{3,3}$, so the Petersen graph is not planar.



4. Nonplanarity of $K_5$: To draw the complete graph on the vertices $v_1, v_2, v_3, v_4, v_5$ in
the plane, one might as well start by drawing the 4-cycle $v_1, v_2, v_3, v_4$, which separates
the plane. Next draw the edges between $v_1$ and $v_3$ and between $v_2$ and $v_4$. To avoid
crossing each other, one of these edges must go inside the 4-cycle and the other outside,
as shown in the following figure. The net result so far is that there are four 3-sided
regions, each with three vertices on its boundary. Thus, no matter which region is to
contain the vertex $v_5$, that vertex cannot be joined to more than three other vertices
without crossing the boundary.



5. Nonplanarity of $K_{3,3}$: To form a planar drawing of the complete bipartite graph
on the parts $\{v_1, v_3, v_5\}$ and $\{v_2, v_4, v_6\}$, one might as well start by drawing the 6-cycle
$v_1, v_2, v_3, v_4, v_5, v_6$, which separates the plane. Next draw the edges between $v_1$ and $v_4$
and between $v_2$ and $v_5$. To avoid crossing each other, one of these edges must go inside
the 6-cycle and the other outside. The net result so far is shown in the following figure.
It is now clear that $v_3$ and $v_6$ cannot be joined without crossing some other edge.



6. Civil engineering: Suppose that a number of towns are to be joined by a network 
of highways. If the network is planar, then the cost of bridges for underpasses and 
overpasses can be avoided. 

7. Electrical networks: A planar electrical network with bare wires joining the nodes 
can be placed directly onto a flat board. Otherwise, insulation would be needed to 
prevent short circuits at wire crossings. 


8.7.2 NUMERICAL PLANARITY CRITERIA 

Certain numerical relationships are true of all planar graphs. One way to show that a 
graph is nonplanar is to show that it does not satisfy one of these relations. 

Definitions: 

A region of an imbedded graph is, informally, a piece of what results when the surface 
is cut open along all the edges. From a formal topological viewpoint, it is a maximal 
subsurface containing no vertex and no part of any edge of the graph. 

The boundary of a region R of an imbedded graph is the subgraph containing all 
vertices and edges incident on R. It is denoted $\partial R$.

A face of an imbedded graph is a region plus its boundary. 

The exterior region of a planar graph drawing is the region that extends to infinity. 

The girth of a graph is the number of edges in a shortest cycle. The girth is undefined 
if the graph has no cycles. 

Facts: 

1. Euler polyhedral equation: Let G = (V, E) be a connected graph imbedded in the
plane with face set F. Then $|V| - |E| + |F| = 2$.

2. Edge-face inequality: Let G = (V, E) be a simple, connected graph imbedded in a
surface with face set F. Then $2|E| \ge 3|F|$.

3. Edge-face inequality (strong version): Let G = (V, E) be a connected graph, but
not a tree, imbedded in a surface with face set F. Then $2|E| \ge \mathrm{girth}(G) \cdot |F|$.

4. Let G = (V, E) be a simple, connected graph. If G is planar then $3|V| - |E| \ge 6$.

5. Let G = (V, E) be a connected graph that is not a tree. If G is planar then
$(|V| - 2) \cdot \mathrm{girth}(G) \ge |E| \cdot (\mathrm{girth}(G) - 2)$.

6. Let G = (V, E) be a simple, connected, bipartite graph that is not a tree. If G is
planar then $|E| \le 2|V| - 4$.

Examples:

1. In the planar imbedding of the following figure, $|V| = 4$, $|E| = 6$, and $|F| = 4$. Thus,
$|V| - |E| + |F| = 4 - 6 + 4 = 2$. (The “exterior” region counts as a face.)



2. Fact 4 implies that $K_5$ is nonplanar.

3. Fact 5 implies that the Petersen graph, whose girth equals 5, is nonplanar.

4. Fact 6 implies that $K_{3,3}$ is nonplanar.
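
The arithmetic behind Examples 2 through 4 can be reproduced with a short check of the inequalities in Facts 4, 5, and 6. This is a sketch only: the function name and its parameters are assumptions made here, and the caller is responsible for the hypotheses of each fact (simple, connected, not a tree, and bipartite where stated). An empty result does not prove planarity; it means only that these counting criteria are not violated.

```python
def violated_planarity_facts(num_vertices, num_edges, girth=None, bipartite=False):
    """Return the numbers of the facts (4, 5, 6) whose inequality fails."""
    violated = []
    # Fact 4: 3|V| - |E| >= 6 for simple connected planar graphs.
    if 3 * num_vertices - num_edges < 6:
        violated.append(4)
    # Fact 5: (|V| - 2) * girth >= |E| * (girth - 2) when G is not a tree.
    if girth is not None and (num_vertices - 2) * girth < num_edges * (girth - 2):
        violated.append(5)
    # Fact 6: |E| <= 2|V| - 4 for bipartite graphs that are not trees.
    if bipartite and num_edges > 2 * num_vertices - 4:
        violated.append(6)
    return violated


print(violated_planarity_facts(5, 10))                            # K_5
print(violated_planarity_facts(10, 15, girth=5))                  # Petersen graph
print(violated_planarity_facts(6, 9, girth=4, bipartite=True))    # K_{3,3}
print(violated_planarity_facts(8, 12, girth=4, bipartite=True))   # cube graph Q_3
```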


8.7.3 PLANARITY ALGORITHM 
Definitions: 

A bridge of a subgraph H in a graph G is a maximal connected subgraph of G in 
which no vertex of H has degree greater than 1. 

An attachment of a bridge B of a subgraph H in a graph G is a vertex of $B \cap H$.
(That is, an attachment is a vertex in which the bridge meets the rest of the graph.)

Facts: 

1. Call two edges in the complement of a subgraph H of a graph G “related” if they 
are both contained in a path in G that has no vertices of H in its interior. Then the 
bridges of H are the induced subgraphs on the equivalence classes of edges under this 
relation. 

2 . Informally, a bridge is a subgraph obtained from one of the “pieces” that result by 
deleting H from G by reattaching the endpoints to the edges that attach to H. See 
Example 1. 

3. The time needed to test planarity by searching directly for subdivided copies of $K_5$
and $K_{3,3}$ is an exponential function of the number of vertices.

4. J. Hopcroft and R. Tarjan [1974] have developed a planarity-testing algorithm that
can be executed in time proportional to the number of vertices (“linear time”).

5 . Algorithm 1 can be implemented to run in time approximately proportional to the 
square of the number of vertices (“quadratic time”). 

6 . None of the linear-time planarity algorithms is easy to describe and implement. 
However, Algorithm 1 is easily implemented, and its running time is satisfactory for 
reasonably large graphs. 
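
The description of bridges in Facts 1 and 2 translates into a short computation: group the vertices outside H into connected pieces, attach to each piece the edges that join it to H, and treat every remaining edge between two vertices of H as a bridge by itself. The sketch below assumes a simple graph given as an edge list; the function name and argument layout are choices made here, not notation from the text.

```python
from collections import defaultdict


def bridges_of_subgraph(g_edges, h_vertices, h_edges):
    """Return the bridges of H in G, each as a list of edges of G."""
    h_vertices = set(h_vertices)
    h_edge_set = {frozenset(e) for e in h_edges}
    rest = [e for e in g_edges if frozenset(e) not in h_edge_set]

    parent = {}

    def find(x):
        # Union-find root lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the endpoints of every edge that avoids H entirely.
    for u, v in rest:
        if u not in h_vertices and v not in h_vertices:
            parent[find(u)] = find(v)

    chords = []                    # edges of G - E(H) with both ends in H
    pieces = defaultdict(list)     # one edge list per component outside H
    for u, v in rest:
        if u in h_vertices and v in h_vertices:
            chords.append([(u, v)])
        else:
            outside = v if u in h_vertices else u
            pieces[find(outside)].append((u, v))
    return chords + list(pieces.values())


# H is the 4-cycle 0-1-2-3; G adds the chord (1, 3) and a vertex 4 joined to 0 and 2.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
g = cycle + [(4, 0), (4, 2), (1, 3)]
for bridge in bridges_of_subgraph(g, range(4), cycle):
    print(bridge)
```

The attachments of a bridge are then the endpoints of its edges that lie in H.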

Example: 

1. The following figure shows a subgraph and its three bridges: $B_1$, $B_2$, and $B_3$. The
subgraph H is the dark cycle. The attachments of the bridges are the vertices along
the dark cycle.

Algorithm 1: Easy planarity-testing for graph G.

input: a simple, connected graph G

$G_0$ := an arbitrary cycle in G; draw $G_0$ in the plane; j := 0

{Grow a sequence of nested subgraphs $G_0, G_1, \ldots$ until all of G has been drawn
in the plane; if this does not happen, then G is nonplanar}

while $G_j \ne G$ {this possible exit implies G is planar} and
      $(\forall B \in \mathrm{bridges}(G_j))\,(\forall v \in \mathrm{attachments}(B))\,(\exists \text{ region } R \text{ of } G_j \text{ in plane})\ v \in \partial R$
      {this possible exit implies G is nonplanar} do
    {While-loop body says how to grow subgraph $G_{j+1}$}
    if $(\exists B \in \mathrm{bridges}(G_j))\,(\forall u \in \mathrm{attachments}(B))\,(\exists! \text{ region } R \text{ of } G_j)\ [u \in \partial R]$
    then {case 1: a forced move exists}
        select a path P between two attachments of B
        obtain subgraph $G_{j+1}$ by drawing path P in region R
    else {case 2: no forced move exists}
        select any bridge, and find two regions for its attachments
        select any path between two attachments of that bridge
        draw that path into either region to obtain $G_{j+1}$
    j := j + 1


2. Suppose that the figure in Example 1 occurred in the execution of Algorithm 1. At 
the next iteration of the while-loop body, suppose that bridge B 2 is selected, and suppose 
that a path in B 2 is drawn outside the dark cycle. Then, on the following iteration of 
the while-loop body, bridge B 3 would be a forced choice, and a path from B 3 would 
have to be drawn inside the dark cycle. Eventually, bridge B 1 would have to be drawn 
outside the dark cycle, thereby yielding a planar drawing of the entire graph. 


8.7.4 CROSSING NUMBER AND THICKNESS 
Definitions: 

The crossing number of graph G, denoted $\nu(G)$, is the minimum number of
edge-crossings possible in a normalized drawing of G in the plane.

The thickness of graph G, denoted $\theta(G)$, is the minimum number of planar graphs
whose union is G.

Facts: 

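1. $\nu(K_n) \le \frac{1}{4} \left\lfloor \frac{n}{2} \right\rfloor \left\lfloor \frac{n-1}{2} \right\rfloor \left\lfloor \frac{n-2}{2} \right\rfloor \left\lfloor \frac{n-3}{2} \right\rfloor$.

2. For all integers $n \le 10$, $\nu(K_n) = \frac{1}{4} \left\lfloor \frac{n}{2} \right\rfloor \left\lfloor \frac{n-1}{2} \right\rfloor \left\lfloor \frac{n-2}{2} \right\rfloor \left\lfloor \frac{n-3}{2} \right\rfloor$.

3. R. Guy has conjectured that the equation of Fact 2 holds for all positive integers.

4. $\nu(K_{m,n}) \le \left\lfloor \frac{m}{2} \right\rfloor \left\lfloor \frac{m-1}{2} \right\rfloor \left\lfloor \frac{n}{2} \right\rfloor \left\lfloor \frac{n-1}{2} \right\rfloor$.

5. For all integers m and n such that $\min(m, n) \le 6$, D. Kleitman proved that
$\nu(K_{m,n}) = \left\lfloor \frac{m}{2} \right\rfloor \left\lfloor \frac{m-1}{2} \right\rfloor \left\lfloor \frac{n}{2} \right\rfloor \left\lfloor \frac{n-1}{2} \right\rfloor$.

6. Zarankiewicz’s conjecture: The equation of Fact 5 holds for all positive integers m
and n.

7. $\theta(K_n) \ge \left\lfloor \frac{n+7}{6} \right\rfloor$.

8. $\theta(Q_n) = \left\lceil \frac{n+1}{4} \right\rceil$.

9. $\theta(G) \ge \left\lceil \dfrac{|E|}{3|V| - 6} \right\rceil$ for all simple graphs.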


Examples: 

1. Fact 1 implies that $\nu(K_6) \le 3$. Thus, it is possible to draw $K_6$ with at most three
crossings.

2. Computer engineering: Facts 7 and 8 yield lower bounds for the minimum number 
of layers needed for a multi-layer layout of an electronic interconnection network whose 
architecture is a complete graph or a hypercube graph, respectively. 
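
The bounds in Facts 1, 4, 7, 8, and 9 are simple closed forms, so they can be tabulated directly. The sketch below uses function names chosen here for illustration and integer arithmetic throughout; `thickness_lower_bound` assumes a simple graph with at least three vertices so that the denominator $3|V| - 6$ is positive.

```python
def guy_bound(n):
    """Upper bound of Fact 1 on the crossing number of K_n."""
    return (n // 2) * ((n - 1) // 2) * ((n - 2) // 2) * ((n - 3) // 2) // 4


def zarankiewicz_bound(m, n):
    """Upper bound of Fact 4 on the crossing number of K_{m,n}."""
    return (m // 2) * ((m - 1) // 2) * (n // 2) * ((n - 1) // 2)


def thickness_lower_bound(num_vertices, num_edges):
    """Fact 9: ceil(|E| / (3|V| - 6)), assuming |V| >= 3."""
    return -(-num_edges // (3 * num_vertices - 6))


def complete_graph_thickness_bound(n):
    """Fact 7: floor((n + 7) / 6)."""
    return (n + 7) // 6


def hypercube_thickness(n):
    """Fact 8: ceil((n + 1) / 4)."""
    return (n + 4) // 4


print(guy_bound(6))                     # Example 1
print(zarankiewicz_bound(3, 3))
print(thickness_lower_bound(9, 36))     # K_9 has 36 edges
print(hypercube_thickness(4))
```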


8.7.5 STEREOGRAPHIC PROJECTION 

Definitions: 

A continuous one-to-one function from one subset of Euclidean space onto another is 
a topological equivalence if its inverse is continuous. (Informally, this means that 
either subset could be reshaped into the other without tearing, but only by compressing, 
stretching, and twisting.) 

The stereographic projection (due to Bernhard Riemann, 1826-1866) adds a single 
point to a plane and thereby closes the “hole at infinity” and converts it into a sphere, 
as follows: 

• start with a sphere in 3-space, tangent at its south pole S to the plane z = 0 at
the origin (0, 0, 0), as shown in the following figure;

• through each point x of the sphere draw a ray from the north pole N, extending
to the point f(x) at which it meets the plane.



Facts: 

1. The correspondence $x \mapsto f(x)$ from the sphere minus its north pole onto the plane is
a topological equivalence. In other words, the sphere minus a point could be stretched 
apart at the missing point and flattened out so that it covers the plane. 

2. Any planar imbedding can be transformed into an imbedding in the sphere, which 
is a closed surface, by using the inverse of stereographic projection and closing up the 
pinhole. This eliminates the inelegant nuisance of having one “special” region with a 
hole. 
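
In coordinates, the construction above has a simple closed form. The sketch below assumes a sphere of radius r centered at (0, 0, r), so that the south pole is the origin and the north pole is N = (0, 0, 2r); the radius parameter and the function names are assumptions made for illustration. The ray from N through a sphere point (x, y, z) meets the plane z = 0 where the parameter t satisfies 2r + t(z - 2r) = 0.

```python
def project(point, r=1.0):
    """Map a sphere point (not the north pole) to its image f(x) in the plane."""
    x, y, z = point
    t = 2 * r / (2 * r - z)        # undefined at the north pole, where z = 2r
    return (t * x, t * y)


def unproject(image, r=1.0):
    """Inverse map: carry a point of the plane z = 0 back onto the sphere."""
    px, py = image
    s = 4 * r * r / (px * px + py * py + 4 * r * r)
    return (s * px, s * py, 2 * r * (1 - s))


print(project((0.0, 0.0, 0.0)))    # the south pole stays at the origin
print(project((1.0, 0.0, 1.0)))    # a point on the "equator" of the unit-radius sphere
print(unproject((2.0, 0.0)))
```

Points of the plane far from the origin pull back to sphere points near N, which is why adding the single point N closes the “hole at infinity.”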

8.7.6 GEOMETRIC DRAWINGS

Geometric drawing of graphs is a topic in computational geometry. Unlike ordinary
planarity and topological graph theory, its concerns include the exact coordinates in the
plane of the images of the vertices and the edges.

Definitions: 

A straight-line drawing of a graph is a drawing in which each edge is represented by 
a single straight line segment. 

An orthogonal drawing of a graph is a drawing in which each edge is represented by 
a chain of horizontal and vertical line segments. 

A polyline drawing of a graph is a drawing in which each edge is represented by a 
polygonal path, that is, by a chain of line segments with arbitrary slope. 

A bend in a polyline drawing is a junction point of two line segments belonging to the 
same edge. 

A grid drawing of a graph is a polyline drawing in which vertices, crossings, and bends 
have integer coordinates. 

The area of a graph drawing is the area of the convex hull of the drawing. 

A distance-ranked partition of a graph G with respect to a nonempty vertex subset S
has cells $C_j$ for j = 0, 1, .... Vertex v is in cell $C_j$ if and only if its distance from S,
that is, the length of a shortest path from v to a vertex of S, is j.

A distance-ranked drawing of a graph G with respect to a nonempty vertex subset S 
has the cells of its distance-ranked partition organized into columns from left to right 
according to ascending distance from S. 

Facts: 

1. Straight-line and orthogonal drawings are special cases of polyline drawings. 

2. Polyline drawings can approximate drawings with curved edges. 

3. Computer programmed systems that support general polyline drawings are more 
complicated than systems that support only straight-line drawings. 

4. Many graph drawing problems involve a trade-off between competing objectives, 
such as the desire to minimize both the area and the number of edge-crossings. 

5. The area required for a planar polyline grid drawing of an n-vertex planar graph
is $O(n^2)$.

6. The area required for a planar orthogonal grid drawing of an n-vertex planar graph
is $O(n^2)$.

7. The area required for a planar straight line grid drawing of an n-vertex planar graph
is $O(n^2)$.

8. Every planar graph of maximum degree 4 has an orthogonal planar drawing whose
total number of bends is at most $2n + 2$.

9. Every planar graph of maximum degree 4 has an orthogonal planar drawing such 
that the maximum number of bends in an edge is at most 2. 

Examples:

1. The following figure shows a nonplanar straight line drawing of the planar graph
$K_6 - 3K_2$ and a planar polyline drawing of that same graph.




2. The following figure shows three orthogonal grid drawings of the graph $K_6 - 3K_2$.
Whereas the lefthand drawing has two edges with three bends, the maximum number
of bends in any edge of the middle drawing is two. The righthand drawing has the
smallest total number of bends and the smallest area of the three drawings.








3. The following figure shows a distance-ranked drawing of the cube graph Q 3 with 
respect to the vertex 000. 
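
The cells of a distance-ranked partition can be computed by a breadth-first search started from all vertices of S at once. The sketch below is illustrative: the function name and the adjacency-dictionary representation are assumptions made here, and vertices unreachable from S are simply left out of every cell. It is applied to the cube graph $Q_3$ with S = {000}.

```python
from collections import deque


def distance_ranked_partition(adj, sources):
    """Return the cells C_0, C_1, ... as sorted lists, via multi-source BFS."""
    dist = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    cells = {}
    for v, d in dist.items():
        cells.setdefault(d, []).append(v)
    return [sorted(cells[d]) for d in sorted(cells)]


# Q_3: vertices are 3-bit strings, adjacent when they differ in exactly one bit.
verts = [format(i, "03b") for i in range(8)]
q3 = {v: [w for w in verts if sum(a != b for a, b in zip(v, w)) == 1]
      for v in verts}

for j, cell in enumerate(distance_ranked_partition(q3, ["000"])):
    print(j, cell)
```

Placing the cells in columns from left to right in this order gives a distance-ranked drawing.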





8.8 TOPOLOGICAL GRAPH THEORY 


Topological graph theory mainly involves placing graphs on closed surfaces. Special 
emphasis is given to placements that are minimum with respect to some kind of cost 
or that are highly symmetric. Minimization helps to control the cost of manufacturing 
networks, and symmetry facilitates the task of routing information through a network. 


8.8.1 CLOSED SURFACES 

Holes in any surface can be closed off by operations like stereographic projection (§8.7.5). 
This enables topological graph theory to focus on drawings in closed surfaces. 

Definitions:

Adding a handle to a surface is accomplished in two steps (illustrated in Example 1):

• punch two disk-like holes into the surface;

• reclose the surface by installing a tube that runs from one hole to the other.

An orientable surface is defined recursively to be either the sphere $S_0$, or a surface
that is obtained from an orientable surface by adding a handle. (See Example 2 for the
construction.)

The genus of an orientable surface is the number of handles one must add to the
sphere to obtain it. Thus, the surface obtained by adding g handles to $S_0$ has genus g.
It is denoted $S_g$.

The torus is the surface $S_1$ of genus 1.

A Möbius band is the surface obtained by pasting the left side of a rectangular sheet
to the right with a half-twist. A paper ring with a half-twist is a commonplace model
of the Möbius band. (See Example 3.) (August Ferdinand Möbius, 1790-1868)

Adding a crosscap to a surface is accomplished by the following two steps:

• punch one disk-like hole into the surface;

• reclose the hole by matching its boundary to the boundary of a Möbius band.

The nonorientable surface $N_k$ is obtained by adding k crosscaps to the sphere. The
sphere is sometimes regarded as the “surface with crosscap number 0” and denoted $N_0$,
even though it is orientable. (See Example 4.)

The subscript k is called the crosscap number of the surface $N_k$.

The surfaces $N_1$ and $N_2$ are called the projective plane and the Klein bottle,
respectively. (Felix Klein, 1849-1925)

Facts: 

1. Classification of closed surfaces: Every closed surface is equivalent to exactly one
of the surfaces $S_g$ ($g \ge 0$) or $N_k$ ($k \ge 1$).

2. Adding a handle to the nonorientable surface $N_k$ is equivalent to adding two
crosscaps. That is, the resulting surface is $N_{k+2}$.

3. If a loop is drawn around each handle of $S_g$ and if these g loops are then cut open,
the result is a (non-closed) surface that can be stretched and flattened out into a subset
of the plane.

4. The subscript g equals the maximum number of closed curves on $S_g$ that can be cut
open without disconnecting that surface.

5. The subscript k equals the maximum number of closed curves on $N_k$ that can be
cut open without disconnecting that surface.

6. No nonorientable surface can be imbedded in $\mathbb{R}^3$.

7. Network layouts: The surfaces actually used for computer interconnection network 
layouts and other practical purposes rarely have graceful curved shapes, because among 
other reasons, that would obstruct miniaturization and ease of manufacture. Moreover, 
such surfaces usually have holes. However, the classification theorem and the closing of 
holes reduce the topology of the layout problems to placing graphs on closed surfaces. 

Examples:

1. Adding a handle is achieved by punching two holes and connecting them with a
tube, as illustrated.



2. To construct the sequence of all orientable surfaces from the sphere $S_0$, each
successive handle is added at the right of the previous surface.



$S_0$   $S_1$   $S_2$


3. The Möbius band is constructed by giving a half-twist to a rectangular strip and
then pasting the ends together.



4. To construct the sequence of all nonorientable surfaces from the projective plane $N_1$,
each successive crosscap is added at the right of the previous surface.





8.8.2 DRAWING GRAPHS ON SURFACES 
Definitions: 

A flat polygon representation of a surface S is a drawing of a flat polygon with
markings to match the sides into pairs such that when the sides are pasted together as 
the markings indicate, the resulting surface S is obtained. (Certain special flat polygon 
representations are called fundamental polygon representations.) (See Example 1.) 

An imbedding (or embedding) of a graph is a drawing with no edge-crossings. 

A face of an imbedding means a region plus its boundary. The set of all faces is 
denoted F. 

The Euler characteristic of an imbedding of a graph G = (V, E) is the number
$|V| - |E| + |F|$.

A flat-polygon drawing of a graph on a surface has some graph edges drawn in two
or more segments, so that one segment runs from one endpoint of an edge to a side of 
the flat polygon and another segment runs from the other endpoint to the corresponding 
position on the matched side of the flat polygon. Sometimes there are also some interior 
edge segments running between polygon sides. (Flat-polygon drawings are best used 
for small graphs.) 

Imbedding modification (or “surgery” ) on a surface means adding handles and cross- 
caps to the surface and then drawing one or more edges that traverse the new handles 
and crosscaps. 

Henri Poincaré (1854-1912) introduced a duality construction (see Example 3) as
follows:

• insert into the interior of each (primal) face f a single dual vertex $f^*$;

• through each primal edge e draw a dual edge $e^*$; if edge e lies on the intersection
of two primal faces f and f' (possibly f = f'), then the dual edge $e^*$ joins the
dual vertices $f^*$ and $f'^*$;

• the dual graph is the graph $G^* = (\{ f^* \mid f \in F \}, \{ e^* \mid e \in E \})$;

• the dual imbedding is the resulting imbedding $G^* \to S$.

Facts: 

1. Every closed surface has a flat polygon representation. This makes it possible to 
draw pictures of graph imbeddings in any surface. 

2. Euler polyhedral equation for orientable surfaces: Let G = (V, E) be a connected
graph, cellularly imbedded (§8.8.3) into the surface $S_g$ with face set F. Then

$|V| - |E| + |F| = 2 - 2g = \chi(S_g)$.

3. Euler polyhedral equation for nonorientable surfaces: Let G = (V, E) be a connected
graph, cellularly imbedded into the surface $N_k$ with face set F. Then

$|V| - |E| + |F| = 2 - k = \chi(N_k)$.

4. Edge-face inequality: Let G = (V, E) be a simple, connected graph imbedded in a
surface with face set F. Then

$2|E| \ge 3|F|$.

5. Edge-face inequality, strong version: Let G = (V, E) be a connected graph, but not
a tree, imbedded in a surface with face set F. Then

$2|E| \ge \mathrm{girth}(G) \cdot |F|$.


Examples: 

1. Flat polygon representations of the double-torus and the torus are illustrated as
follows:

2. In the imbedding $K_5 \to S_1$ illustrated below, edges c and d cross through flat polygon
sides 2 and 1, respectively. The “outer region” is actually 8-sided, with boundary circuit
(a, d, b, c, f, d, e, c). Two pairs of sides of this region are pasted together. The appearance
of this single region as four subregions at the corners of the flat polygon is a side-effect
of the particular representation and not a true feature of the imbedding.

[Figure: the imbedding $K_5 \to S_1$ drawn on a flat polygon.]

3. The Poincaré duality construction is illustrated below.




8.8.3 COMBINATORIAL REPRESENTATION OF GRAPH IMBEDDINGS 
Definitions: 

The rotation (in “edge-format”) at v is obtained from a flat-polygon drawing of a
graph by the following sequence of steps:

• label one end of each edge + and the other end −, or put an arrow on each edge
so that the head faces the + end;

• at each vertex, traverse a small circle centered at that vertex, and record the
cyclically ordered list of edge-ends encountered; this list is the rotation.

The vertex-format of a rotation is obtained by replacing each edge-end in the edge- 
format by the vertex at the other end of that edge. The vertex format is used only for 
simple graphs. 

A rotation system is a complete list of rotations, that is, one for every vertex. If the 
surface is orientable, it is assumed that the traversals of the small circles around the 
vertices are in a consistent direction, that is, all clockwise or else all counterclockwise. 

An imbedding is cellular (or a “2-cell imbedding”) if every region is planar and has 
connected boundary. 

Facts: 

1. Two cellular imbeddings of a graph are equivalent if and only if they have the same 
rotation system. 

2. If a cellular graph imbedding is represented as a rotation system, then the regions
can be reconstructed algorithmically.

Example:

1. An imbedding $K_4 \to S_1$ and both formats of its rotation system.

[Figure: the imbedding $K_4 \to S_1$ drawn on a flat polygon.]

        edge-format              vertex-format
  v1.   a−  b−  c−         v1.   v2  v4  v3
  v2.   a+  d−  e−         v2.   v1  v4  v3
  v3.   c+  f+  e+         v3.   v1  v4  v2
  v4.   b+  f−  d+         v4.   v1  v3  v2
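Fact 2 above can be made concrete with a short face-tracing sketch. The Python below is illustrative only. It assumes a simple graph given in vertex-format, an orientable surface, and one of the two mirror-image tracing conventions: a boundary walk that arrives at v along the dart (u, v) leaves v toward the neighbour that follows u in the rotation at v. The function names `trace_faces` and `genus_from_rotation` are hypothetical, and the first sample is the vertex-format rotation system of the example above.

```python
def trace_faces(rotation):
    """Return the face boundaries of the imbedding as lists of darts (u, v).

    rotation maps each vertex to the cyclic list of its neighbours
    (vertex-format, so the graph is assumed to be simple).
    """
    successor = {}
    for v, neighbours in rotation.items():
        for i, u in enumerate(neighbours):
            # Assumed convention: after arriving at v along (u, v), leave v
            # toward the neighbour that follows u in the rotation at v.
            successor[(u, v)] = (v, neighbours[(i + 1) % len(neighbours)])

    faces, unused = [], set(successor)
    while unused:
        start = min(unused)
        face, dart = [], start
        while True:
            face.append(dart)
            unused.remove(dart)
            dart = successor[dart]
            if dart == start:
                break
        faces.append(face)
    return faces


def genus_from_rotation(rotation):
    """Solve |V| - |E| + |F| = 2 - 2g for g (connected graph, orientable surface)."""
    v = len(rotation)
    e = sum(len(nbrs) for nbrs in rotation.values()) // 2
    f = len(trace_faces(rotation))
    return (2 - (v - e + f)) // 2


# Rotation system of the example (K_4 on the torus) and, for comparison,
# a planar rotation system of K_4 chosen for this illustration.
k4_example = {1: [2, 4, 3], 2: [1, 4, 3], 3: [1, 4, 2], 4: [1, 3, 2]}
k4_planar = {1: [2, 3, 4], 2: [1, 4, 3], 3: [1, 2, 4], 4: [1, 3, 2]}
```

For the sample rotation system the sketch finds two faces, of sizes 4 and 8, so |V| − |E| + |F| = 4 − 6 + 2 = 0 and the surface is the torus.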


8.8.4 FORMULAS FOR GENUS AND CROSSCAP NUMBER 


Definitions: 

The genus $\gamma_{\min}(G)$ of a connected graph G is the minimum integer g such that there
is an imbedding of G into the surface $S_g$.

The crosscap number $\tilde{\gamma}_{\min}(G)$ is the minimum integer k such that there is an
imbedding of G into $N_k$. Thus, a planar graph has crosscap number zero.

Facts:

1. The genus of any planar graph is 0.

2. $\gamma_{\min}(G) \ge \frac{|E| - 3|V| + 6}{6}$ if G is simple.

3. $\gamma_{\min}(G) \ge \frac{|E| - 2|V| + 4}{4}$ if G is simple and bipartite.

4. $\gamma_{\min}(K_n) = \left\lceil \frac{(n-3)(n-4)}{12} \right\rceil$. (Ringel and Youngs, 1968)

5. $\gamma_{\min}(K_{m,n}) = \left\lceil \frac{(m-2)(n-2)}{4} \right\rceil$. (Ringel, 1965)

6. $\gamma_{\min}(Q_n) = 1 + 2^{n-3}(n-4)$. (Ringel, 1955)

7. $\tilde{\gamma}_{\min}(G) \ge \frac{|E| - 3|V| + 6}{3}$ for every simple graph G.

8. $\tilde{\gamma}_{\min}(G) \ge \frac{|E| - 2|V| + 4}{2}$ for every simple bipartite graph G.

9. $\tilde{\gamma}_{\min}(K_n) = \left\lceil \frac{(n-3)(n-4)}{6} \right\rceil$, except that $\tilde{\gamma}_{\min}(K_7) = 3$. (Ringel, 1959)

10. Many genus and crosscap number formulas can be derived by using voltage graphs
or current graphs (§8.1.4). [GrTu87]
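The closed formulas in Facts 2-5 and 9 are easy to evaluate mechanically. The following Python sketch is only an illustration: the function names are hypothetical, the formulas for $K_n$ are applied under the assumption n ≥ 3 (and m, n ≥ 2 for $K_{m,n}$), and integer ceiling division is used to avoid floating-point rounding.

```python
def ceil_div(a, b):
    """Ceiling of a / b for integers, with b > 0."""
    return -(-a // b)


def genus_complete(n):
    """Genus of K_n by the Ringel-Youngs formula (assumes n >= 3)."""
    return ceil_div((n - 3) * (n - 4), 12)


def genus_complete_bipartite(m, n):
    """Genus of K_{m,n} by Ringel's formula (assumes m, n >= 2)."""
    return ceil_div((m - 2) * (n - 2), 4)


def crosscap_complete(n):
    """Crosscap number of K_n (assumes n >= 3); K_7 is the stated exception."""
    return 3 if n == 7 else ceil_div((n - 3) * (n - 4), 6)


def genus_lower_bound(num_vertices, num_edges, bipartite=False):
    """Lower bound on the genus of a simple graph from Facts 2 and 3."""
    if bipartite:
        return max(0, ceil_div(num_edges - 2 * num_vertices + 4, 4))
    return max(0, ceil_div(num_edges - 3 * num_vertices + 6, 6))
```

For example, `genus_lower_bound(5, 10)` evaluates the bound of Fact 2 for $K_5$ and agrees with `genus_complete(5)`, so the bound is tight in that case.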
8.9 ENUMERATING GRAPHS


It is often valuable to know how many graphs there are with some desired property. 
Computer scientists can use such numbers in analyzing the time or space requirements 
of their algorithms, and chemists can make use of these numbers in organizing and 
cataloging lists of chemical molecules with various shapes. Many of the techniques for
counting graphs were developed in the 1930s by the mathematician George Pólya.


8.9.1 COUNTING GRAPHS AND MULTIGRAPHS 
Definitions: 

A labeled graph is a graph with standard labels (commonly $v_1, v_2, \ldots, v_n$) assigned
to the vertices. Two labeled graphs with the same set of labels are considered the same
only if there is an isomorphism from one to the other that preserves the labels.

The pair-permutation $\gamma^{(2)}$ induced by a permutation $\gamma$ on a set S is the permutation
on the set of all subsets of S of size 2 defined by the rule
$\gamma^{(2)}: \{x, y\} \mapsto \{\gamma(x), \gamma(y)\}$.

The pair-action group $\Gamma^{(2)}$ induced by a permutation group $\Gamma$ on a set S is the group
$\{ \gamma^{(2)} \mid \gamma \in \Gamma \}$.

The ordered-pair-permutation $\gamma^{[2]}$ induced by a permutation $\gamma$ on a set S is the
permutation on the set $S \times S$ defined by the rule
$\gamma^{[2]}: (x, y) \mapsto (\gamma(x), \gamma(y))$.

The ordered-pair-action group $\Gamma^{[2]}$ induced by a permutation group $\Gamma$ on a set S is
the group $\{ \gamma^{[2]} \mid \gamma \in \Gamma \}$.

Facts: 

1. The number of labeled simple graphs with n vertices and m edges is the binomial
coefficient $\binom{\binom{n}{2}}{m}$.

2. For $m > \frac{1}{2}\binom{n}{2}$, the number of labeled simple graphs with n vertices and m edges is
the same as the number of labeled simple graphs with n vertices and $\binom{n}{2} - m$ edges.

3. The total number of labeled simple graphs with n vertices is $2^{\binom{n}{2}}$. See Table 1.

4. The number $C_n$ of connected labeled simple graphs with n vertices can be determined
from the following recurrence system. See Table 2.

\[ C_1 = 1, \qquad C_n = 2^{\binom{n}{2}} - \frac{1}{n} \sum_{i=1}^{n-1} i \binom{n}{i} 2^{\binom{n-i}{2}} C_i \quad \text{for } n > 1. \]

5. Most (unlabeled) graphical structures are counted with generating functions, using
Burnside-Pólya enumeration (§2.6). In particular, the generating function for graphs
with n vertices has the form

\[ g_n(x) = \sum_{i=0}^{\binom{n}{2}} G_{n,i}\, x^i \]

where $G_{n,m}$ denotes the number of graphs with n vertices and m edges.

6. Pólya enumeration involves permutations (j) of the set $X_n = \{1, 2, \ldots, n\}$; $j_k$ denotes
the number of k-cycles in (j), for k = 1, ..., n. For example, if (j) = (12)(34)(567),
then $j_2 = 2$, $j_3 = 1$, and $j_1 = j_4 = j_5 = j_6 = j_7 = 0$.

Table 1 Labeled graphs with n vertices and m edges.

  m\n   1   2   3    4      5       6          7            8
    0   1   1   1    1      1       1          1            1
    1       1   3    6     10      15         21           28
    2           3   15     45     105        210          378
    3           1   20    120     455      1,330        3,276
    4               15    210   1,365      5,985       20,475
    5                6    252   3,003     20,349       98,280
    6                1    210   5,005     54,264      376,740
    7                     120   6,435    116,280    1,184,040
    8                      45   6,435    203,490    3,108,105
    9                      10   5,005    293,930    6,906,900
   10                       1   3,003    352,716   13,123,110
   11                           1,365    352,716   21,474,180
   12                             455    293,930   30,421,755
   13                             105    203,490   37,442,160
   14                              15    116,280   40,116,600
total   1   2   8   64  1,024  32,768  2,097,152  268,435,456


Table 2 Connected labeled graphs with n vertices. 
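The entries of this table can be regenerated from the recurrence for $C_n$ stated in Fact 4. The Python sketch below is a direct transcription of that recurrence, offered as an illustration; the function name `connected_labeled_counts` is hypothetical.

```python
from math import comb


def connected_labeled_counts(n_max):
    """Return [C_1, ..., C_{n_max}], the numbers of connected labeled simple graphs."""
    c = [0] * (n_max + 1)          # c[0] is unused padding
    for n in range(1, n_max + 1):
        if n == 1:
            c[n] = 1
            continue
        # (1/n) * sum_{i=1}^{n-1} i * C(n, i) * 2^{C(n-i, 2)} * C_i
        weighted = sum(i * comb(n, i) * 2 ** comb(n - i, 2) * c[i] for i in range(1, n))
        c[n] = 2 ** comb(n, 2) - weighted // n   # the division is exact
    return c[1:]
```

Calling `connected_labeled_counts(6)` yields the values for n = 1, ..., 6; the sum inside the loop counts rooted disconnected graphs, which is why it is always divisible by n.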



Here lcm(r, s) and gcd(r, s) are the least common multiple and greatest common divisor
of r and s, respectively. The following lists explicit formulas for $Z(S_n^{(2)})$ for small values
of n:

\begin{align*}
Z(S_1^{(2)}) &= 1 \\
Z(S_2^{(2)}) &= a_1 \\
Z(S_3^{(2)}) &= \tfrac{1}{3!} \left( a_1^3 + 3 a_1 a_2 + 2 a_3 \right) \\
Z(S_4^{(2)}) &= \tfrac{1}{4!} \left( a_1^6 + 9 a_1^2 a_2^2 + 8 a_3^2 + 6 a_2 a_4 \right) \\
Z(S_5^{(2)}) &= \tfrac{1}{5!} \left( a_1^{10} + 10 a_1^4 a_2^3 + 20 a_1 a_3^3 + 15 a_1^2 a_2^4 + 30 a_2 a_4^2 + 20 a_1 a_3 a_6 + 24 a_5^2 \right) \\
Z(S_6^{(2)}) &= \tfrac{1}{6!} \left( a_1^{15} + 15 a_1^7 a_2^4 + 40 a_1^3 a_3^4 + 60 a_1^3 a_2^6 + 180 a_1 a_2 a_4^3 + 120 a_1 a_2 a_3^2 a_6 \right. \\
             &\qquad \left. {} + 144 a_5^3 + 40 a_3^5 + 120 a_3 a_6^2 \right).
\end{align*}

Table 3 Graphs with n vertices and m edges.

Table 4 Multigraphs with n vertices and m edges.

  m\n   1   2    3     4     5       6
    0   1   1    1     1     1       1
    1       1    1     1     1       1
    2       1    2     3     3       3
    3       1    3     6     7       8
    4       1    4    11    17      21
    5       1    5    18    35      52
    6       1    7    32    76     132
    7       1    8    48   149     313
    8       1   10    75   291     741
    9       1   12   111   539   1,684
   10       1   14   160   974   3,711


8. The generating function $g_n(x)$ for counting n-vertex graphs by number of edges is
obtained from the cycle index $Z(S_n^{(2)})$ by replacing each variable $a_i$ with $1 + x^i$. See
Table 3.

9. The total number $G_n$ of graphs with n vertices is obtained from the cycle index
$Z(S_n^{(2)})$ by replacing each variable $a_i$ with the number 2.

10. Asymptotically, the number $G_n$ of n-vertex graphs satisfies $G_n \sim 2^{\binom{n}{2}} / n!$.

11. The generating function $m_n(x) = \sum_i M_{n,i}\, x^i$ for counting n-vertex multigraphs
according to their number of edges is obtained from the cycle index $Z(S_n^{(2)})$ by replacing
each variable $a_i$ with the infinite series $1 + x^i + x^{2i} + x^{3i} + \cdots$. See Table 4.

Examples:

1. The three different simple graphs with 4 vertices and 3 edges are given in the following
figure. There are 4 essentially different ways to label each of the first two and 12
ways to label the third. Thus, there are 20 different labeled graphs with 4 vertices and 3
edges. The second and third graphs in this figure are connected.

[Figure: the three simple graphs with 4 vertices and 3 edges.]

2. There are six different multigraphs with 4 vertices and 3 edges, namely, the three
graphs displayed in the previous figure plus the three additional multigraphs displayed
in the following figure.
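The substitutions described in Facts 8 and 9 can be checked by brute force for small n. The Python sketch below does not use the closed form of the cycle index; as an assumption made for simplicity, it runs over all n! permutations, computes the cycle lengths of each induced pair-permutation directly, and applies the substitution $a_i \to 1 + x^i$. The function names are hypothetical, and the approach is practical only for small n.

```python
from itertools import combinations, permutations
from math import factorial


def pair_cycle_lengths(perm):
    """Cycle lengths of the pair-permutation induced by perm on 2-subsets of {0..n-1}."""
    seen, lengths = set(), []
    for pair in combinations(range(len(perm)), 2):
        if pair in seen:
            continue
        length, current = 0, pair
        while current not in seen:
            seen.add(current)
            length += 1
            x, y = current
            current = tuple(sorted((perm[x], perm[y])))
        lengths.append(length)
    return lengths


def graph_counts_by_edges(n):
    """Coefficients of g_n(x): entry m is the number of unlabeled n-vertex graphs with m edges."""
    size = n * (n - 1) // 2
    total = [0] * (size + 1)
    for perm in permutations(range(n)):
        poly = [1]
        for length in pair_cycle_lengths(perm):
            # Multiply poly by (1 + x^length), i.e. substitute a_length -> 1 + x^length.
            shifted = poly + [0] * length
            for i, coeff in enumerate(poly):
                shifted[i + length] += coeff
            poly = shifted
        for i, coeff in enumerate(poly):
            total[i] += coeff
    return [coeff // factorial(n) for coeff in total]
```

With n = 4 the coefficient of $x^3$ is 3, matching the three graphs of Example 1, and summing all coefficients gives the total $G_n$ of Fact 9.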



8.9.2 COUNTING DIGRAPHS AND TOURNAMENTS 


Definitions: 

A digraph (or directed graph) consists of a set V of vertices and a set A of arcs. 
When counting digraphs, two digraphs are considered the same if they are isomorphic. 

A labeled digraph is a digraph in which standard labels such as $v_1, v_2, \ldots, v_n$ have
been assigned to the vertices. Two labeled digraphs are considered the same only if
there is an isomorphism from one to the other that preserves the labels.

A tournament is a digraph such that for each pair u, v of vertices, either there is an
arc from u to v or an arc from v to u, but not both.

A tournament is strong (or strongly connected) if for each pair u, v of vertices,
there exist directed paths from u to v and from v to u.


Facts: 


1. The number of labeled digraphs with no loops that have n vertices and m arcs is
$\binom{n(n-1)}{m}$.

2. For $m > \frac{1}{2} n(n-1)$, the number of labeled digraphs with n vertices and m arcs is the
same as the number of labeled digraphs with n vertices and $n(n-1) - m$ arcs.

3. The total number of labeled digraphs with n vertices is $2^{n(n-1)}$. See Table 5.

4. The number of labeled tournaments with n vertices is $2^{\binom{n}{2}}$, the same as the number
of labeled graphs with n vertices.

5. Like graphical structures, most (unlabeled) digraphical structures are counted with
generating functions, using Burnside-Pólya enumeration. In particular, the generating
function for digraphs with n vertices has the form

\[ d_n(x) = \sum_{i=0}^{n(n-1)} D_{n,i}\, x^i \]

where $D_{n,m}$ denotes the number of digraphs with n vertices and m arcs.

Table 5 Labeled digraphs with n vertices and m arcs.

6. The cycle index $Z(S_n^{[2]})$ for counting digraphs is

\[ Z(S_n^{[2]}) = \frac{1}{n!} \sum_{(j) \in S_n} \prod_{k=1}^{n} a_k^{(k-1) j_k + 2k \binom{j_k}{2}} \prod_{r<s} a_{\mathrm{lcm}(r,s)}^{2 \gcd(r,s)\, j_r j_s}. \]

The following lists explicit formulas for $Z(S_n^{[2]})$ for small values of n:

\begin{align*}
Z(S_1^{[2]}) &= 1 \\
Z(S_2^{[2]}) &= \tfrac{1}{2!} \left( a_1^2 + a_2 \right) \\
Z(S_3^{[2]}) &= \tfrac{1}{3!} \left( a_1^6 + 3 a_2^3 + 2 a_3^2 \right) \\
Z(S_4^{[2]}) &= \tfrac{1}{4!} \left( a_1^{12} + 6 a_1^2 a_2^5 + 8 a_3^4 + 3 a_2^6 + 6 a_4^3 \right) \\
Z(S_5^{[2]}) &= \tfrac{1}{5!} \left( a_1^{20} + 10 a_1^6 a_2^7 + 20 a_1^2 a_3^6 + 15 a_2^{10} + 30 a_4^5 + 20 a_2 a_3^2 a_6^2 + 24 a_5^4 \right) \\
Z(S_6^{[2]}) &= \tfrac{1}{6!} \left( a_1^{30} + 15 a_1^{12} a_2^9 + 40 a_1^6 a_3^8 + 45 a_1^2 a_2^{14} + 90 a_1^2 a_4^7 + 120 a_2^3 a_3^4 a_6^2 \right. \\
             &\qquad \left. {} + 144 a_5^6 + 15 a_2^{15} + 90 a_2 a_4^7 + 40 a_3^{10} + 120 a_6^5 \right).
\end{align*}


7. The generating function $d_n(x)$ for counting n-vertex digraphs by number of arcs is
obtained from the cycle index $Z(S_n^{[2]})$ by replacing each variable $a_i$ with $1 + x^i$. See
Table 6.

8. The total number $D_n$ of digraphs with n vertices is obtained from the cycle index
$Z(S_n^{[2]})$ by replacing each variable $a_i$ with the number 2.

9. Asymptotically, $D_n$ satisfies $D_n \sim 2^{n(n-1)} / n!$.

Table 6 Digraphs with n vertices and m arcs.

Table 7 Tournaments and strong tournaments with n vertices.

   n       tournaments   strong tournaments
   1                 1                    1
   2                 1                    0
   3                 2                    1
   4                 4                    1
   5                12                    6
   6                56                   35
   7               456                  353
   8             6,880                6,008
   9           191,536              178,133
  10         9,733,056            9,355,949
  11       903,753,248          884,464,590
  12   154,108,311,168      152,310,149,735

10. The number $T_n$ of tournaments on n vertices is given by the formula

\[ T_n = \frac{1}{n!} \sum_{(j)} 2^{D(j)} \]

where the sum is over all permutations (j) of $X_n$ whose cycles are all of odd size, and
where

\[ D(j) = \frac{1}{2} \left( \sum_{r=1}^{n} \sum_{s=1}^{n} \gcd(r,s)\, j_r j_s - \sum_{k=1}^{n} j_k \right). \]

See Table 7.

11. Let $T(x) = x + x^2 + 2x^3 + 4x^4 + 12x^5 + 56x^6 + \cdots$ be the generating function
for tournaments, from the formula of Fact 10. Then the generating function
$S(x) = x + x^3 + x^4 + 6x^5 + 35x^6 + \cdots$ for strong tournaments can be computed from the relation
$S(x) = \frac{T(x)}{1 + T(x)}$. See Table 7. Note that there are no strong tournaments with exactly
two vertices.

Examples:

1. The four digraphs with 3 vertices and 3 arcs are displayed in the following figure.
The first three digraphs can each be labeled in six essentially different ways, while the
fourth digraph can only be labeled in two essentially different ways. Thus there are 20
different labeled digraphs with 3 vertices and 3 arcs.

[Figure: the four digraphs with 3 vertices and 3 arcs.]

2. The four tournaments with 4 vertices are displayed in the following figure. Only the
last tournament is strong.
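The counts of tournaments and strong tournaments in Facts 10 and 11 can be reproduced with a few lines of code. The Python sketch below is an illustration under two stated choices: permutations with only odd cycles are grouped by cycle type (there are $n! / \prod_k k^{j_k} j_k!$ permutations of each type), and the relation S(x)(1 + T(x)) = T(x) is unrolled coefficient by coefficient. All function names are hypothetical.

```python
from collections import Counter
from math import factorial, gcd


def odd_partitions(n, largest=None):
    """Yield the partitions of n into odd parts, as non-increasing lists."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for part in range(min(n, largest), 0, -1):
        if part % 2 == 1:
            for rest in odd_partitions(n - part, part):
                yield [part] + rest


def tournaments(n):
    """T_n from Fact 10, with permutations grouped by cycle type."""
    total = 0
    for parts in odd_partitions(n):
        j = Counter(parts)                       # j[k] = number of k-cycles
        denominator = 1
        for k, jk in j.items():
            denominator *= k ** jk * factorial(jk)
        perms = factorial(n) // denominator      # permutations of this cycle type
        twice_d = sum(gcd(r, s) * j[r] * j[s] for r in j for s in j) - sum(j.values())
        total += perms * 2 ** (twice_d // 2)
    return total // factorial(n)


def strong_tournaments(t):
    """Given [T_1, ..., T_N], return [S_1, ..., S_N] using S(x)(1 + T(x)) = T(x)."""
    s = []
    for n in range(1, len(t) + 1):
        # Coefficient of x^n: S_n + sum_{k=1}^{n-1} S_k * T_{n-k} = T_n.
        s.append(t[n - 1] - sum(s[k - 1] * t[n - k - 1] for k in range(1, n)))
    return s
```

For n = 4 this gives $T_4 = 4$ and $S_4 = 1$, in agreement with Example 2.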




8.10 ALGEBRAIC GRAPH THEORY 


8.10.1 SPECTRAL GRAPH THEORY
Definitions:

The characteristic polynomial of a graph G is the characteristic polynomial p(x)
of its adjacency matrix $A_G$, that is, $p(x) = \det(xI - A_G)$.

An eigenvector (or characteristic vector) of a matrix A is a nonzero vector x
such that $Ax = \lambda x$, for some value $\lambda$.

An eigenvalue (or characteristic value) of a matrix A is a number $\lambda$ such that
$Ax = \lambda x$, for some nonzero vector x.

An eigenvector of a graph is an eigenvector of its adjacency matrix.

An eigenvalue of a graph is an eigenvalue of its adjacency matrix.

The spectrum of a graph is the spectrum of its adjacency matrix, i.e., the multiset
of eigenvalues.

The Laplacian (or admittance matrix) of a graph G is the matrix $D_G - A_G$,
where $D_G$ is the diagonal matrix with the degree sequence of G on the diagonal and $A_G$
is the adjacency matrix.

A graph G is strongly regular with parameters (n, k, r, s) if:

• $|V_G| = n$;

• G is k-regular;

• every adjacent pair of vertices is mutually adjacent to r other vertices;

• every pair of nonadjacent vertices is mutually adjacent to s other vertices.

By convention, strongly regular graphs are connected with at least one edge.

The Hoffman polynomial of a graph is a polynomial p(x) of minimum degree such
that $p(A_G) = J$, where $A_G$ is the adjacency matrix and where J is the square matrix
with every entry equal to 1.

A cospectral pair of graphs is a pair of nonisomorphic graphs that have the same 
spectrum. 

Facts: 

1. The eigenvalues of a graph are independent of the particular labeling of the vertices;
thus, two isomorphic graphs have the same spectrum.

2. All the eigenvalues of a graph are real. This is a special case of the well-known linear
algebra result that the eigenvalues of any Hermitian matrix are real.

3. From linear algebra, it follows that the characteristic polynomial of a graph G satisfies
the equation $p(x) = \prod_{i=1}^{n} (x - \lambda_i)$, where $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of G.

4. If a graph is connected, then its largest eigenvalue has multiplicity 1. This eigenvalue
has a corresponding eigenvector with all positive entries, and it is the only such
eigenvector.

5. If $\lambda$ is the largest eigenvalue of a graph and $\mu$ is another eigenvalue, then $\lambda \ge |\mu|$;
moreover, $-\lambda$ is an eigenvalue if and only if the graph is bipartite.

6. A graph is bipartite if and only if its spectrum is symmetric with respect to 0, that
is, $\lambda$ is an eigenvalue if and only if $-\lambda$ is also an eigenvalue.

7. The largest eigenvalue of a k-regular graph is k, and it has multiplicity equal to
the number of connected components. The sum of the coordinates of an eigenvector
corresponding to any other eigenvalue is 0.

8. The (i, j)th entry of the kth power $A_G^k$ of the adjacency matrix of a graph G is the
number of walks of length k starting at vertex $v_i$ and terminating at $v_j$.

9. If $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of a graph G, then $\sum_{i=1}^{n} \lambda_i^2 = 2|E_G|$, where $E_G$
is the edge-set of G. Also, $\sum_{i=1}^{n} \lambda_i^3 = 6T$, where T is the number of triangles in G.

10. If $p(x) = x^n + a_{n-1} x^{n-1} + a_{n-2} x^{n-2} + \cdots + a_1 x + a_0$ is the characteristic polynomial
of a graph G, then $a_{n-1} = 0$, $-a_{n-2}$ is the number of edges, and $-a_{n-3}$ is twice the
number of triangles.

11. The set of eigenvalues of the disjoint sum G + H is the union of the sets of eigenvalues
of G and H. The multiplicity of $\lambda$ as an eigenvalue of G + H is the sum of the
multiplicity of $\lambda$ as an eigenvalue of G and the multiplicity of $\lambda$ as an eigenvalue of H.

12. The eigenvalues of the cartesian product $G \times H$ are $\{ \lambda_i + \lambda_j \mid \lambda_i$ an eigenvalue
of G and $\lambda_j$ an eigenvalue of H $\}$. The multiplicity of $\lambda_i + \lambda_j$ as an eigenvalue of $G \times H$
is the product of the multiplicity of $\lambda_i$ as an eigenvalue of G and $\lambda_j$ as an eigenvalue
of H.

13. If G is a k-regular graph and $\overline{G}$ is its complement, then $\lambda < k$ is an eigenvalue of G
if and only if $-\lambda - 1$ is an eigenvalue of $\overline{G}$. In this case $\lambda$ and $-\lambda - 1$ have the same
multiplicities.

14. If $\lambda$ is an eigenvalue of G with multiplicity m, then $-\lambda - 1$ is an eigenvalue of $\overline{G}$
with multiplicity m − 1, m, or m + 1.

15. If G has n vertices and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ as eigenvalues, and H is an induced
subgraph with n − 1 vertices and eigenvalues $\mu_1 \ge \mu_2 \ge \cdots \ge \mu_{n-1}$, then
$\lambda_1 \ge \mu_1 \ge \lambda_2 \ge \mu_2 \ge \cdots \ge \mu_{n-1} \ge \lambda_n$.

16. The eigenvalues of a line graph L(G) are greater than or equal to −2. Equality
is attained unless every connected component of G is a tree or has exactly one circuit,
that circuit being odd.

17. If G is a k-regular graph with $\lambda > -k$ as an eigenvalue, then $\lambda + k - 2$ is an
eigenvalue of L(G).

18. A graph has a Hoffman polynomial if and only if it is regular and connected.

19. A regular connected graph has exactly three distinct eigenvalues if and only if it is
strongly regular.

20. Matrix-tree theorem: If $M_i$ is formed by deleting the i-th row and column from
the Laplacian of G, then $\det(M_i)$ is independent of the choice of i and is equal to the
number of spanning trees of G.
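Two of these facts lend themselves to a direct numerical check. The Python sketch below is illustrative only: it evaluates the matrix-tree theorem (Fact 20) with exact rational elimination after deleting the first row and column, and it checks the trace identities behind Fact 9, using the fact that the trace of $A_G^k$ equals $\sum_i \lambda_i^k$. The helper names are hypothetical and the adjacency matrices are assumed to describe simple graphs.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def spanning_tree_count(adj):
    """det of the Laplacian with row and column 0 deleted (matrix-tree theorem)."""
    n = len(adj)
    lap = [[(sum(adj[i]) if i == j else 0) - adj[i][j] for j in range(n)] for i in range(n)]
    m = [[Fraction(x) for x in row[1:]] for row in lap[1:]]
    size, det = n - 1, Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0                      # singular minor: no spanning tree
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                    # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)


def complete_graph(n):
    return [[0 if i == j else 1 for j in range(n)] for i in range(n)]


def cycle_graph(n):
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]
```

For $K_4$, which has 6 edges and 4 triangles, the trace of $A^2$ is 12 and the trace of $A^3$ is 24, as Fact 9 predicts.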


Examples: 

1. The edgeless graph $N_n$ on n vertices has one eigenvalue, namely 0 with multiplicity n.

2. The eigenvalues of the complete graph $K_n$ are n − 1 and −1 with respective multiplicities
of 1 and n − 1. For instance, the characteristic polynomial of $K_4$ is

\[ \begin{vmatrix} x & -1 & -1 & -1 \\ -1 & x & -1 & -1 \\ -1 & -1 & x & -1 \\ -1 & -1 & -1 & x \end{vmatrix} = (x + 1)^3 (x - 3). \]

3. The eigenvalues of the complete bipartite graph $K_{m,n}$ are $\sqrt{mn}$, 0, and $-\sqrt{mn}$, with
respective multiplicities of 1, m + n − 2, and 1.

4. The eigenvalues of the Petersen graph are 3, 1, and −2, with respective multiplicities
1, 5, and 4.

5. The eigenvalues of the n-path $P_n$ are $\{ 2 \cos \frac{k\pi}{n+1} \mid k = 1, 2, \ldots, n \}$, each with
multiplicity 1.

6. The eigenvalues of the n-cycle $C_n$ are $\{ 2 \cos \frac{2\pi k}{n} \mid k = 1, 2, \ldots, n \}$. The eigenvalue 2,
and the eigenvalue −2 when n is even, have multiplicity 1; all other eigenvalues
have multiplicity 2.

7. The eigenvalues of the hypercube $Q_d$ are d, d − 2, d − 4, ..., −d + 2, −d, with respective
multiplicities $\binom{d}{0}, \binom{d}{1}, \binom{d}{2}, \ldots, \binom{d}{d-1}, \binom{d}{d}$.

8. The eigenvalues of the line graph $L(K_n)$ are 2n − 4, n − 4, and −2, with respective
multiplicities 1, n − 1, and $\frac{n(n-3)}{2}$.

9. The eigenvalues of the line graph $L(K_{m,n})$ are m + n − 2, m − 2, n − 2, and −2,
with respective multiplicities 1, n − 1, m − 1, and (m − 1)(n − 1).

10. If G is strongly regular with parameters (n, k, r, s), then its eigenvalues are k and
$\frac{1}{2} \left( r - s \pm \sqrt{(r - s)^2 - 4(s - k)} \right)$.

11. The smallest pair of cospectral graphs is $K_{1,4}$ and $C_4 + K_1$, each of which has
spectrum {−2, 0, 0, 0, 2}. See the following figure. Observe that $K_{1,4}$ is connected and
that $C_4 + K_1$ is not, and that the two graphs have different degree sequences. This
implies that connectedness and degree sequences cannot be determined from spectral
properties alone.
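The cospectral pair of Example 11 can be verified without computing any eigenvalue numerically. The Python sketch below compares characteristic polynomials; as an assumption of this illustration, it uses the Faddeev-LeVerrier recurrence, which stays in exact integer arithmetic for integer matrices. The names `char_poly`, `star` and `c4_plus_k1` are hypothetical.

```python
def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def char_poly(adj):
    """Coefficients [1, a_{n-1}, ..., a_0] of det(xI - A) by Faddeev-LeVerrier."""
    n = len(adj)
    coeffs = [1]
    m = [[1 if i == j else 0 for j in range(n)] for i in range(n)]  # M_1 = I
    for k in range(1, n + 1):
        am = matmul(adj, m)
        c = -sum(am[i][i] for i in range(n)) // k   # exact for integer matrices
        coeffs.append(c)
        # M_{k+1} = A * M_k + c * I
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs


# K_{1,4}: vertex 0 is the centre of the star.
star = [[1 if (i == 0) != (j == 0) else 0 for j in range(5)] for i in range(5)]

# C_4 + K_1: vertices 0..3 form a 4-cycle, vertex 4 is isolated.
c4_plus_k1 = [[1 if i < 4 and j < 4 and (i - j) % 4 in (1, 3) else 0 for j in range(5)]
              for i in range(5)]
```

Both graphs give the polynomial $x^5 - 4x^3 = x^3 (x - 2)(x + 2)$, while their degree sequences differ, so they cannot be isomorphic.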
8.10.2 AUTOMORPHISMS OF GRAPHS


Definitions: 

The automorphism group Aut(G) of a graph G is the set of all automorphisms of
graph G, under the operation of functional composition.

A generating subset for a group $\Gamma$ is a subset $\Sigma$ of group elements such that every
group element is a product of elements of $\Sigma$. (Note: The group identity is the empty
product.)

The Cayley digraph for group $\Gamma$ and generating set $\Sigma$ has as vertices the elements
of $\Gamma$, with an arc $\sigma_\gamma$ from the vertex $\gamma$ to the vertex $\gamma'$ if and only if $\gamma\sigma = \gamma'$.

The Cayley graph for group $\Gamma$ and generating set $\Sigma$ is the graph obtained by
removing all arc directions from the Cayley digraph.

The Cayley graph for group $\Gamma$ and generating set $\Sigma$ (alternative definition) is
the graph obtained by removing all arc directions from the Cayley digraph, and by
collapsing each pair of arcs corresponding to a generator of order two to a single edge.

Facts: 

1. A simple graph G and its edge-complement $\overline{G}$ have the same automorphism group.

2. An automorphism $\varphi$ of a graph G induces an automorphism of the line graph
L(G).

3. If G is a connected simple graph with at least 4 vertices, then G and its line
graph L(G) have isomorphic automorphism groups.

4. If the graph G has adjacency matrix A, and if the permutation $\varphi$ of $V_G$ has
permutation matrix P, then $\varphi$ is the vertex map of an automorphism of G if and only if
PA = AP.

5. If all eigenvalues of a graph G have multiplicity 1, then every automorphism has
order at most 2.

6. Frucht’s theorem: Let $\Gamma$ be any finite group. Then there exists a graph G whose
automorphism group is isomorphic to $\Gamma$. It can be constructed by modifying a Cayley
digraph for $\Gamma$.
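The criterion PA = AP of Fact 4 gives a simple, if slow, way to list automorphisms. The Python sketch below is a brute-force illustration over all n! vertex permutations and is meant only for very small graphs. The helper names are hypothetical, and the permutation matrix is built with the assumed convention P[i][perm[i]] = 1.

```python
from itertools import permutations


def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def automorphisms(adj):
    """All vertex permutations whose permutation matrix commutes with adj."""
    n = len(adj)
    found = []
    for perm in permutations(range(n)):
        p = [[1 if perm[i] == j else 0 for j in range(n)] for i in range(n)]
        if matmul(p, adj) == matmul(adj, p):   # Fact 4: PA = AP
            found.append(perm)
    return found


def cycle_graph(n):
    return [[1 if (i - j) % n in (1, n - 1) else 0 for j in range(n)] for i in range(n)]


def path_graph(n):
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


def complement(adj):
    n = len(adj)
    return [[0 if i == j else 1 - adj[i][j] for j in range(n)] for i in range(n)]
```

Applied to the 4-cycle, the search returns 8 permutations, the order of the dihedral group $D_4$, and a graph and its complement always return the same list, as Fact 1 states.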

Examples: 

1. The n-vertex edgeless graph $N_n$ and the complete graph $K_n$ both have the symmetric
group $S_n$ as their automorphism group. They are the only n-vertex graphs with this
automorphism group.

2. The automorphism group of the complete bipartite graph $K_{m,n}$ is $S_n \times S_m$ if $n \ne m$
and is the wreath product [Ro88] $S_n \wr S_2$ if m = n.

3. The automorphism group of the n-path graph $P_n$ (with n > 1) is isomorphic to $S_2$.

4. The automorphism group of the n-cycle graph $C_n$ is the dihedral group $D_n$ of
order 2n. For instance, the 4-cycle $C_4$ with vertices a, b, c, and d (in cyclic order), has
the following vertex automorphisms:

(a)(b)(c)(d)    (a b c d)    (a c)(b d)    (a d c b)

(a b)(c d)    (a d)(b c)    (a)(c)(b d)    (b)(d)(a c).

5. The Cayley digraph of the group $S_3$ with generating set {(1 2 3), (1 2)} is illustrated
in the following figure.

[Figure: Cayley digraph of $S_3$ with generating set {(1 2 3), (1 2)}.]


8.11 ANALYTIC GRAPH THEORY 

Analytic graph theory involves three different perspectives on the properties of graphs
that are sufficiently “dense”. One analysis is what must happen in a simple n-vertex
graph when the number of edges is sufficiently large. A second analysis is what must
happen in at least one of the parts of a partition of the edges of a graph. The third
analysis is what happens with a high probability when a graph is randomly chosen
according to some distribution.


8.11.1 EXTREMAL GRAPH THEORY

Extremal graph theory is the analysis of the number of edges an n-vertex simple graph
must have in order to guarantee that it contains a certain graph or type of graph.
Elsewhere, it is sometimes taken to be the study of graph-theoretic inequalities in general.

Definitions: 

The extremal number $\mathrm{ex}(\mathcal{G}; n)$ for a set $\mathcal{G}$ of graphs is the greatest number of edges in
any simple graph with n vertices that contains no member of $\mathcal{G}$ as a subgraph.
Notation: The notation $\mathrm{ex}(G; n)$ is used when $\mathcal{G}$ consists of just one graph G.

An extremal graph for a set $\mathcal{G}$ of graphs and an integer n is a graph with n vertices
and $\mathrm{ex}(\mathcal{G}; n)$ edges that contains no member of $\mathcal{G}$.

The Turán graph $T_k(n)$ is the n-vertex k-partite simple graph with the maximum
number of edges.

The Turán number $t_k(n)$ is the number of edges in the Turán graph $T_k(n)$.

Facts: 

1. If $\mathrm{ex}(\mathcal{G}; n) = \binom{n}{2}$, then no graph with n vertices contains any member of $\mathcal{G}$.

2. The Turán graph $T_k(n)$ is the unique complete k-partite graph with the property
that the numbers of vertices in any two of its parts differ by at most 1. In the special
case k = 2, $T_2(n) = K_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}$. More generally, if n = tk + r, where 0 ≤ r < k, then
there are r parts of size t + 1 and k − r parts of size t.

3. The Turán number $t_k(n)$ equals $\binom{n}{2} - \frac{t(n - k + r)}{2}$, where n = tk + r, with 0 ≤ r < k.
If k = 2, this greatly simplifies: $t_2(n) = \lfloor n/2 \rfloor \lceil n/2 \rceil = \lfloor n^2/4 \rfloor$.

4. Turán’s theorem: $\mathrm{ex}(K_k; n) = t_{k-1}(n)$; furthermore, $T_{k-1}(n)$ is the only extremal
graph for $K_k$ and n.

5. Let $\chi = \chi(G)$ (chromatic number of G, §8.6), $p = |G|$, $c = 2 - \frac{1}{p}$. Then

\[ \mathrm{ex}(G; n) = \left( 1 - \frac{1}{\chi - 1} \right) \binom{n}{2} + O(n^c). \]

Furthermore, all the extremal graphs differ from the Turán graph $T_{\chi-1}(n)$ by adding
and deleting $O(n^c)$ edges, and the minimum degree of all such graphs is
$\left( 1 - \frac{1}{\chi - 1} \right) n + O(n^c)$. (Erdős, Simonovits)

6. Fact 5 is also true for $\mathrm{ex}(\mathcal{G}; n)$, where $\chi$ is the smallest chromatic number among
the members of $\mathcal{G}$, p is the smallest order among these members, and $c = 2 - \frac{1}{p}$.

7. $\mathrm{ex}(\mathcal{G}; n) = O(n)$ if and only if $\mathcal{G}$ contains a (tree or) forest.

8. There exists a number $t_0$ such that, for $t > t_0$, every tree T of order t satisfies the
inequality $\mathrm{ex}(T; n) \le \frac{(t-2)n}{2}$ for every n ≥ t + 1.

9. $\mathrm{ex}(C_4; n) = \frac{1}{2} n^{3/2} + O(n^{4/3})$. (The exponent $\frac{4}{3}$ can be slightly improved.)

10. $\mathrm{ex}(C_{2m}; n) = O(n^{1 + 1/m})$. This is known to be sharp only for 2m = 4, 6, 10, but is
conjectured to be sharp for all m.

11. The ratio $\mathrm{ex}(\mathcal{G}; n) / \binom{n}{2}$ is monotone nonincreasing; that is, for every set $\mathcal{G}$ and for all
m < n,

\[ \frac{\mathrm{ex}(\mathcal{G}; m)}{\binom{m}{2}} \ge \frac{\mathrm{ex}(\mathcal{G}; n)}{\binom{n}{2}}. \]

12. The following table summarizes many other facts that apply as the number of edges
grows:

# edges | what must occur, but not for smaller # edges | what must occur if n is large enough
$n$ | some cycle |
$\lfloor (3n-1)/2 \rfloor$ | some even cycle |
$3n - 5$ | two disjoint cycles |
$t_2(n) + 1 = \lfloor n^2/4 \rfloor + 1$ | some odd cycle (i.e., $\chi \ge 3$), $C_3, \ldots, C_{\lfloor (n+3)/2 \rfloor}$ | $K_{s,s} + e$ for fixed s
$t_2(n) + m$, m fixed | | $m \lfloor n/2 \rfloor$ copies of $C_3$; for fixed s, $K_{s,s}$ plus m extra edges
$t_k(n) + 1$ | $K_{k+1}$; also, $\chi > k$ |
$t_k(n) + m$, m fixed | | for fixed s, $K_{s,\ldots,s}$ (k parts) plus m extra edges
$\binom{n}{2} - n + 3$ | a Hamilton cycle |
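The description of the Turán graph in Facts 2 and 3 translates directly into a computation of $t_k(n)$. The Python sketch below is an illustration with hypothetical function names: it builds the part sizes from n = tk + r and counts edges as all vertex pairs minus the pairs that lie inside a single part.

```python
from math import comb


def turan_part_sizes(n, k):
    """Part sizes of T_k(n): r parts of size t + 1 and k - r parts of size t."""
    t, r = divmod(n, k)
    return [t + 1] * r + [t] * (k - r)


def turan_number(n, k):
    """Number of edges of T_k(n): all pairs minus the pairs inside a part."""
    return comb(n, 2) - sum(comb(size, 2) for size in turan_part_sizes(n, k))


def turan_number_closed_form(n, k):
    """The same count via C(n, 2) - t(n - k + r)/2 with n = tk + r."""
    t, r = divmod(n, k)
    return comb(n, 2) - t * (n - k + r) // 2
```

For instance, `turan_part_sizes(10, 3)` gives parts of sizes 4, 3, 3, and `turan_number(10, 3)` counts the 33 edges of $T_3(10)$; for k = 2 the count reduces to $\lfloor n^2/4 \rfloor$.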



Examples: 

1. $\mathrm{ex}(K_2; n) = 0$. The extremal graph is the edgeless graph.

2. $\mathrm{ex}(P_2; n) = \lfloor n/2 \rfloor$. The extremal graph is the maximum matching.

3. $\mathrm{ex}(K_{1,r}; n) = \left\lfloor \frac{(r-1)n}{2} \right\rfloor$. If (r − 1)n is even, then any (r − 1)-regular graph is an
extremal graph. If (r − 1)n is odd, then any graph with one vertex of degree r − 2 and
all the others of degree r − 1 is extremal.

4. $\mathrm{ex}(K_3; n) = \lfloor n^2/4 \rfloor$. The Turán graph $T_2(n)$ is the only extremal graph.

5. The Turán graph $T_3(10)$ is the 3-partite graph $K_{3,3,4}$. It has 33 edges, which is more
than any other 3-partite graph on 10 vertices. Thus, $\mathrm{ex}(K_4; 10) = 33$.


8.11.2 RAMSEY THEORY FOR GRAPHS

If the edges of a “dense” graph are partitioned into two parts, then at least one of the 
parts must still be fairly dense. Ramsey theory, which can also be studied in connection 
with many mathematical objects other than graphs, relies on this idea. (Also see §3.1.6.) 

Definitions: 

The (classical) Ramsey number r(m, n) is the smallest positive integer k such that
every k-vertex graph contains either the complete graph $K_m$ or n mutually nonadjacent
vertices.

The Ramsey number r(G, H) is the smallest positive integer k such that, if the
edges of $K_k$ are bipartitioned into red and blue classes, then either the red subgraph
contains a copy of G or else the blue subgraph contains a copy of H. Sometimes r(G)
denotes r(G, G).

The Ramsey number $r(G_1, \ldots, G_s)$ is the smallest number k such that in any s-fold
partition of the edge set of $K_k$ there is an index j such that the jth part contains the
graph $G_j$.

A k-canonical coloring of a complete graph is an edge-coloring in which the vertices
can be partitioned into k or fewer parts, such that the color of each edge depends only
on the two parts to which its endpoints belong.

The arrows notation $F \to (G, H)$ (“F arrows (G, H)”) means that if the edges of
the graph F are partitioned into two chromatic classes, e.g., into red edges and blue
edges, then either the red subgraph contains a copy of G or else the blue subgraph
contains a copy of H. When G = H, the notation $F \to G$ is often used. The notation
$F \to (G_1, \ldots, G_k)$ means that k edge colors are involved.

Facts: 

1. $r(K_m, K_n) = r(m, n)$ for all m, n ≥ 1.

2. r(G, H) = r(H, G). That is, Ramsey numbers are symmetric.

3. $r(K_n, K_1) = r(K_1, K_n) = 1$ for every n ≥ 1.

4. $r(K_n, K_2) = r(K_2, K_n) = n$ for every n ≥ 1.

5. $r(K_m, K_n) \le r(K_m, K_{n-1}) + r(K_{m-1}, K_n)$ for all m, n ≥ 2.

6. $r(K_m, K_n) \le \binom{m+n-2}{m-1}$. (Erdős and Szekeres, 1935)

7. If n ≥ 3, then $2^{n/2} < r(K_n, K_n) \le \binom{2n-2}{n-1} < 4^{n-1}$.

8. $\frac{\sqrt{2}}{e} (1 + o(1))\, n\, 2^{n/2} < r(K_n, K_n) \le \binom{2n-2}{n-1} \cdot O((\log n)^{-1})$.

9. There exist constants $c_1$ and $c_2$ such that $c_1 \frac{n^2}{\ln n} < r(K_3, K_n) < c_2 \frac{n^2}{\ln n}$.

10. A 1-canonical coloring assigns every edge the same color.

11. A 2-canonical coloring consists of two complete edge-monochromatic subgraphs,
such that all edges joining them are of the same color.

12. If $\chi(G) = \chi$ and $|V_H| = n$, then $r(G, H) \ge (\chi - 1)(n - 1) + 1$. This fact is based
on a $(\chi - 1)$-canonical coloring.

13. If T is an n-vertex tree, then $r(K_m, T) = (m - 1)(n - 1) + 1$. In other words, the
lower bound in the immediately preceding fact determines the Ramsey number.

14. Except for $r(C_3, C_3) = r(C_4, C_4) = 6$, $r(C_m, C_n)$ and $r(P_m, C_n)$ are determined by
the best possible 2-canonical colorings, which are easy to find.

15. For every choice of graphs $G_1, G_2, \ldots, G_k$, there exists a graph F for which
$F \to (G_1, \ldots, G_k)$. In particular, the Ramsey number $r(G_1, \ldots, G_k)$ is well-defined.

16. If m, n ≥ 3, the values of only nine Ramsey numbers are known:

r(3, 3) = 6     r(3, 4) = 9     r(3, 5) = 14

r(3, 6) = 18    r(3, 7) = 23    r(3, 8) = 28

r(3, 9) = 36    r(4, 4) = 18    r(4, 5) = 25.

Estimates on some other Ramsey numbers are given in §3.1.6. In addition to the nine
exact results, only one other nontrivial Ramsey number for complete graphs is known:

$r(K_3, K_3, K_3) = 17$.
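The smallest nontrivial value, r(3, 3) = 6, and the bounds of Facts 5 and 6 can be checked exhaustively. The Python sketch below is an illustration with hypothetical function names: it tries every red/blue coloring of the edges of $K_5$ and $K_6$ and looks for a monochromatic triangle, and it unrolls the recurrence of Fact 5 starting from the values in Facts 3 and 4. The exhaustive search is feasible only because $K_6$ has just $2^{15}$ colorings.

```python
from functools import lru_cache
from itertools import combinations, product
from math import comb


def every_coloring_has_mono_triangle(n):
    """True if every 2-coloring of the edges of K_n contains a monochromatic triangle."""
    edges = list(combinations(range(n), 2))
    triangles = list(combinations(range(n), 3))
    for colors in product((0, 1), repeat=len(edges)):
        color = dict(zip(edges, colors))
        if not any(color[(a, b)] == color[(a, c)] == color[(b, c)]
                   for a, b, c in triangles):
            return False          # found a coloring with no monochromatic triangle
    return True


@lru_cache(maxsize=None)
def ramsey_upper(m, n):
    """Upper bound on r(m, n) obtained by unrolling the recurrence of Fact 5."""
    if m == 1 or n == 1:
        return 1
    if m == 2:
        return n
    if n == 2:
        return m
    return ramsey_upper(m, n - 1) + ramsey_upper(m - 1, n)
```

The search fails for $K_5$ and succeeds for $K_6$, which is exactly the statement r(3, 3) = 6. The unrolled recurrence reproduces the binomial bound of Fact 6, which is tight for r(3, 3) but already exceeds the known value of r(3, 4).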


8.11.3 PROBABILISTIC GRAPH THEORY

Probabilistic graph theory takes two basic directions. It studies random graphs for 
themselves, and it uses random graphs in deriving graph-theoretical results that are not 
themselves probabilistic. 

Definitions: 

In Model 1, the random graph $G_{n,p}$ has n distinctly labeled vertices $v_1, \ldots, v_n$, and
the probability of any pair of vertices being joined by an edge is p, where all these edge
probabilities are mutually independent.

In Model 2, the random graph $G_{n,e}$ has n distinctly labeled vertices $v_1, \ldots, v_n$, and
exactly e edges, and each such labeled graph occurs with the same probability $1 / \binom{N}{e}$,
where $N = \binom{n}{2}$.

Almost every (a.e.) graph has a given property P under either Model 1 or Model 2,
if the probability that a random graph has property P approaches 1 as $n \to \infty$, where
the probability p stays constant under Model 1, but where one must specify how e varies
with n under Model 2. If neither model is explicitly specified, then Model 1 with $p = \frac{1}{2}$
is implicit, so that all labeled graphs on n vertices have the same probability $2^{-\binom{n}{2}}$.

Facts: 

1. The number of labeled graphs in the probability space for Model 1 is $2^{\binom{n}{2}}$. 

2. While Model 2 is sometimes considered to be more natural and easier to define, it 
is, in practice, usually easier to work with Model 1. Fortunately, Model 1 with $p = e/\binom{n}{2}$ 
behaves very similarly to Model 2 in most cases, so that facts about Model 1 usually 
lead easily to facts about Model 2 as well. 

3. In Model 1, a graph with e edges occurs with probability $p^e (1 - p)^{\binom{n}{2} - e}$. If $p = \frac{1}{2}$, 
then every labeled graph on n vertices has the same probability $2^{-\binom{n}{2}}$. 

4. Random graphs can be used to prove theorems about graphs, especially existence 
theorems. (See Example 1.) 

5. Let $b = 1/p$ and $d = 2\log_b \frac{en}{2\log_b n} + 1$, where e = 2.718..., not the number of edges. 
Then for every positive $\epsilon < \frac{1}{2}$ the clique number of a.e. graph is either $\lfloor d - \epsilon \rfloor$ or $\lfloor d + \epsilon \rfloor$, 
where these two values are usually the same when $\epsilon$ is small. This means that the clique 
number is determined for a.e. graph, unless d is close to an integer, in which case there 
are two possible values. 

Algorithm 1: Generate random graph $G_{n,p}$ (per Model 1). 

initialize graph G with vertex list $v_1, v_2, \ldots, v_n$ 
for i := 1 to n - 1 
    for j := i + 1 to n 
        join vertices $v_i$ and $v_j$ with probability p 


Algorithm 2: Generate random graph $G_{n,e}$ (per Model 2). 

initialize graph G with vertex list $v_1, v_2, \ldots, v_n$ 
generate random integer $r \in \{1, \ldots, \binom{N}{e}\}$, where $N = \binom{n}{2}$ 
convert r to an e-combination C in $\{1, \ldots, \binom{n}{2}\}$ 
convert e-combination C to e edges in $G_{n,e}$ 


6. Almost every graph satisfies $\chi \sim \frac{n}{2\log_2 n}$. 

7. In connection with the fact immediately preceding, it can be shown that if $p = n^{-\alpha}$ 
for fixed $\alpha > \frac{5}{6}$, then in Model 1, there exists an f(n,p) so that for almost every graph, 
$f(n,p) \le \chi \le f(n,p) + 3$. That is, the chromatic number $\chi$ takes on one of only four 
possible values. 

8. Almost every graph in Model 1 has its connectivity and its edge connectivity equal 
to its minimum degree. Furthermore, the common value of these three parameters is 
$pn - (2p(1 - p)\, n \log n)^{1/2} + o((n \log n)^{1/2})$. 

9. Generating a random graph $G_{n,p}$ under Model 1 is straightforward, as indicated by 
Algorithm 1. 

10. To generate a random graph $G_{n,e}$ under Model 2, the possible edges are placed 
in bijective correspondence with the integers $1, \ldots, \binom{n}{2}$ according to the rule 
$f(i,j) = \binom{n}{2} - \binom{n-i+1}{2} + j - i$. Also, the e-combinations of the integers $1, \ldots, \binom{n}{2}$ are placed in 
bijective correspondence with the integers $1, \ldots, \binom{N}{e}$, where $N = \binom{n}{2}$, according to the lexicographic 
ordering of those e-combinations (§2.2.5). These bijections facilitate the formulation of 
Algorithm 2. 

11. For every fixed s, almost every graph contains the complete graph $K_s$. Moreover, 
for every fixed graph H, almost every graph contains H. 
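
Algorithms 1 and 2 and the two bijections of Fact 10 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the Handbook's own code: the function names are hypothetical, vertices are numbered 1, ..., n, and the edge rule is taken to be $f(i,j) = \binom{n}{2} - \binom{n-i+1}{2} + j - i$ for $i < j$.

```python
import random
from math import comb

def random_graph_model1(n, p, rng=random):
    """Algorithm 1: join each pair {i, j} independently with probability p."""
    edges = []
    for i in range(1, n):
        for j in range(i + 1, n + 1):
            if rng.random() < p:
                edges.append((i, j))
    return edges

def edge_rank(n, i, j):
    """The rule f(i, j) for 1 <= i < j <= n; values run over 1..C(n, 2)."""
    return comb(n, 2) - comb(n - i + 1, 2) + j - i

def edge_unrank(n, k):
    """Inverse of edge_rank: recover the edge (i, j) numbered k."""
    i = 1
    while edge_rank(n, i, n) < k:   # edge_rank(n, i, n) is the last rank in row i
        i += 1
    j = k - edge_rank(n, i, i + 1) + i + 1
    return (i, j)

def combination_unrank(N, e, r):
    """Return the r-th (1-based) e-combination of {1..N} in lexicographic order."""
    combo, x = [], 1
    while e > 0:
        count = comb(N - x, e - 1)  # combinations that take x as their next element
        if r <= count:
            combo.append(x)
            e -= 1
        else:
            r -= count
        x += 1
    return combo

def random_graph_model2(n, e, rng=random):
    """Algorithm 2: pick a uniform rank, unrank it, and map it to e edges."""
    N = comb(n, 2)
    r = rng.randint(1, comb(N, e))
    return [edge_unrank(n, k) for k in combination_unrank(N, e, r)]

print(random_graph_model1(6, 0.5))
print(random_graph_model2(6, 4))
```

In practice, `random.sample` applied to the list of all possible edges gives the same distribution as Algorithm 2 with less bookkeeping; the unranking route is shown only because it mirrors the bijections described above.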


Table 1 Properties of almost every n-vertex graph. 

p under Model 1        e under Model 2          property of almost every graph 
---------------------  -----------------------  ---------------------------------------- 
o(1/n)                 o(n)                     no cycles 
2c/n, 0 < c < 1/2      cn, 0 < c < 1/2          cycles are possible, and the largest 
                                                component has order ≈ ln n 
1/n                    n/2                      some cycle exists, and the largest 
                                                component has order O(n^(2/3)) 
2c/n, c > 1/2          cn, c > 1/2              the largest component has order c'n 
(c ln n)/n, c < 1      (c/2) n ln n, c < 1      the graph is disconnected 
(c ln n)/n, c > 1      (c/2) n ln n, c > 1      the graph is connected and Hamiltonian 


Example: 


1. Using random graphs to prove theorems: Here is a proof that the Ramsey number 
$r(K_n, K_n)$ is greater than $2^{n/2}$ for all $n \ge 3$. Consider a random red-blue edge-coloring 
of $K_N$ for some N > n with p(red) = 1/2. The probability that any given $K_n$ occurring 
within this 2-colored $K_N$ is entirely red is $2^{-\binom{n}{2}}$. Of course, the probability that 
it is colored blue is the same. Thus, the probability that the given subgraph $K_n$ is 
monochromatic in either color is $2^{1-\binom{n}{2}}$. Since there are $\binom{N}{n}$ different copies of $K_n$ in 
the colored $K_N$, the expected number of monochromatic $K_n$ is $\binom{N}{n} \cdot 2^{1-\binom{n}{2}}$. 
With the choice of $N = \lfloor 2^{n/2} \rfloor$, this expectation is 

$$\binom{N}{n} \cdot 2^{1-\binom{n}{2}} < \frac{2^{1+n/2}}{n!} \cdot \frac{N^n}{2^{n^2/2}} < 1,$$ 

i.e., less than 1. Therefore there must be some coloring with no monochromatic $K_n$ at 
all. This completes the proof. 
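
The expectation in this argument can be evaluated exactly for small n. The sketch below is an illustration only (the function name is hypothetical); it computes $\binom{N}{n} \cdot 2^{1-\binom{n}{2}}$ with $N = \lfloor 2^{n/2} \rfloor$ using exact rational arithmetic, so the comparison with 1 is not affected by rounding.

```python
from fractions import Fraction
from math import comb, isqrt

def expected_monochromatic_cliques(n):
    """Expected number of monochromatic K_n in a random 2-coloring of K_N."""
    N = isqrt(2 ** n)  # floor(2^(n/2)), computed exactly on integers
    return Fraction(2 * comb(N, n), 2 ** comb(n, 2))

for n in range(3, 11):
    value = expected_monochromatic_cliques(n)
    print(n, isqrt(2 ** n), float(value), value < 1)
```

For the first few n the value is 0 because N < n, so the bound is vacuous there; the argument becomes informative once $2^{n/2}$ exceeds n.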


8.12 HYPERGRAPHS 


In ordinary graph theory, an edge of a simple graph can be regarded as a pair of vertices. 
In hypergraph theory, an “edge” can be regarded as an arbitrary subset of vertices. In 
this sense, hypergraphs are a natural generalization of graphs. Their systematic study 
was initiated by C. Berge. They have evolved into a unifying combinatorial concept. 


8.12.1 HYPERGRAPHS AS A GENERALIZATION OF GRAPHS 
Definitions: 

A hypergraph H = ( V, E) is a finite set V of “vertices” together with a finite multiset E 
of “edges” (sometimes, “hyperedges”), which are arbitrary subsets of V. 

The order of a hypergraph edge is its cardinality. 

A partial hypergraph (or simply a partial) of the hypergraph H = (V, E) is a 
hypergraph H' = (V, E') such that $E' \subseteq E$. This generalizes a spanning subgraph. 

A hypergraph H = (V, E) is simple if E has no repeated edges. 

The incidence matrix of a hypergraph H = (V, E) with $E = \{e_1, e_2, \ldots, e_m\}$ and 
$V = \{x_1, x_2, \ldots, x_n\}$ is the m x n matrix $M(H) = [m_{ij}]$ with 

$$m_{ij} = \begin{cases} 1 & \text{if } x_j \in e_i \\ 0 & \text{otherwise.} \end{cases}$$ 

The dual hypergraph of the hypergraph H is the hypergraph H* whose incidence 
matrix is the transpose of the incidence matrix M(H). This concept of duality from 
block design theory differs from the Poincaré dual of graph theory. 

The degree deg(x) of a hypergraph vertex x is the number of hypergraph edges con- 
taining x. 

A hypergraph is regular if all vertices have the same degree. If t is the common value 
of the degrees, then the hypergraph is t-regular. 

A hypergraph is uniform if all edges have the same number of vertices. If r is the 
common value, then the hypergraph is r-uniform. 

The complete hypergraph $K_n^*$ has all subsets of n vertices as edges, so that it has $2^n$ 
edges. 

The complete r-uniform hypergraph $K_n^r$ is the simple hypergraph of order n with 
all r-element subsets as edges, so that it has $\binom{n}{r}$ edges. 

The intersection graph I(H) of the hypergraph H is a simple graph whose vertices 
are the edges of H. Two vertices of I(H) are adjacent if and only if the corresponding 
edges of H have nonempty intersection. 

An independent set of vertices in a hypergraph is a set of vertices that does not 
(completely) contain any edge of the hypergraph. 

Facts: 

1. How to draw a hypergraph: First draw the vertices and the hyperedges of order 2, as 
if they were vertices and edges, respectively, of a graph. Then shade triangular regions 
corresponding to hyperedges of order 3. Higher order hyperedges and hyperedges of 
order 1 can be indicated by drawing enclosures around their vertices. 

2. Every hypergraph satisfies the generalized Euler equation for degree-sum: 

$$\sum_{x \in V} \deg(x) = \sum_{e \in E} |e|.$$ 

3. Every simple graph is a 2-uniform simple hypergraph. 

4. The intersection graph of a hypergraph generalizes the line graph L(G) of a graph G. 
(See §11.1.) 

5. Every graph is the intersection graph of some hypergraph. 

6. Every graph of order n is isomorphic to the intersection graph of a hypergraph of 
order at most $\lfloor n^2/4 \rfloor$. 

7. When a graph G is regarded as a hypergraph, its dual is a hypergraph whose inter- 
section graph is G. 
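
The definitions and facts of this subsection can be checked mechanically on a small hypergraph. The sketch below is illustrative only: it uses the hypergraph with V = {a, b, c, d} and E = {ab, acd, bc, bd, c}, encodes each edge as a string of its vertices, and uses hypothetical helper names. It builds the incidence matrix, transposes it to obtain the incidence matrix of the dual, verifies the degree-sum equation of Fact 2, and lists the edges of the intersection graph.

```python
from itertools import combinations

V = ["a", "b", "c", "d"]
E = ["ab", "acd", "bc", "bd", "c"]  # each edge written as a string of its vertices

def incidence_matrix(vertices, edges):
    """Rows are indexed by edges, columns by vertices."""
    return [[1 if x in e else 0 for x in vertices] for e in edges]

def transpose(M):
    return [list(row) for row in zip(*M)]

M = incidence_matrix(V, E)
M_dual = transpose(M)  # incidence matrix of the dual hypergraph

# Degree of each vertex = column sum of the incidence matrix.
degrees = {x: sum(row[j] for row in M) for j, x in enumerate(V)}
degree_sum_holds = sum(degrees.values()) == sum(len(e) for e in E)

# Intersection graph: edges of H are its vertices; adjacent iff they share a vertex.
intersection_graph = [(e, f) for e, f in combinations(E, 2) if set(e) & set(f)]

print(M)
print(M_dual)
print(degrees, degree_sum_holds)
print(intersection_graph)
```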

Examples: 

1. The hypergraph H = (V, E) with V = {a, b, c, d} and E = {ab, bc, bd, acd, c} can 
be illustrated as follows: 



2. The hypergraph of Example 1 has the following incidence matrix: 

        a   b   c   d 
ab      1   1   0   0 
acd     1   0   1   1 
bc      0   1   1   0 
bd      0   1   0   1 
c       0   0   1   0 

3. The dual of the hypergraph of Example 1 has the following incidence matrix: 

        v(ab)   v(acd)   v(bc)   v(bd)   v(c) 
e(a)      1       1        0       0      0 
e(b)      1       0        1       1      0 
e(c)      0       1        1       0      1 
e(d)      0       1        0       1      0 

This dual hypergraph may be illustrated as follows: 



4. The hypergraph of Example 1 has the following intersection graph: 



8.12.2 HYPERGRAPHS AS GENERAL COMBINATORIAL STRUCTURES 


Definitions: 

A transversal (or cover or blocking set) in a hypergraph is a set of vertices that has 
nonempty intersection with every edge of the hypergraph. 

A system of distinct representatives (SDR) in a hypergraph H = (V, E) with 
$E = \{e_1, e_2, \ldots, e_m\}$ is a transversal of m distinct vertices $x_1, x_2, \ldots, x_m$ such that 
$x_i \in e_i$ for i = 1, ..., m. 

Hall's condition on a hypergraph is that, for each t = 1, ..., m, the union of every 
subset of t edges has at least t vertices. Thus, each partial must have at least as many 
vertices as edges. 

A matching in a hypergraph is a set of pairwise disjoint edges. 

An antichain is a hypergraph in which no edge contains any other edge. 

A chain is a simple hypergraph in which, given any pair of edges, one edge contains 
the other. 

A symmetric chain in an n-vertex hypergraph H is a chain with edges of order 
$\frac{n}{2} - t, \ldots, \frac{n}{2} + t$ for some $t \ge 0$. 

A downset (or ideal ) is a simple hypergraph in which every subset of every edge is 
also an edge of the hypergraph. 

An upset (or filter) is a simple hypergraph in which every superset of every edge is 
also an edge of the hypergraph. 

A hypergraph clique is a simple hypergraph such that every pair of edges has 
nonempty intersection. 

An r-partite hypergraph is an r-uniform hypergraph whose vertex set can be parti- 
tioned into r blocks such that each edge intersects each block in exactly one vertex. 

A hypergraph is unimodular if the determinant of every square submatrix of its inci- 
dence matrix is equal to 0, 1, or —1. 

An n-vertex hypergraph is an interval hypergraph if its vertices can be labeled 
1 , 2, . . . , n so that each edge is labeled by consecutive numbers. 

Facts: 

1. Hall’s condition is necessary and sufficient for the existence of an SDR in a hyper- 
graph. 

2. Sperner's lemma: If the hypergraph H with n vertices and m edges is an antichain, 
then $m \le \binom{n}{\lfloor n/2 \rfloor}$. 

3. If the hypergraph H with $m_i$ edges of order i for i = 0, ..., n is an antichain, then 

$$\sum_{i=0}^{n} m_i \binom{n}{i}^{-1} \le 1.$$ 

4. The complete hypergraph $K_n^*$ can be partitioned into symmetric chains. 

5. Kleitman's lemma: Let D and U be hypergraphs on the same n vertices. Let D 
be a d-edge downset and U a u-edge upset, and let D and U have m common edges. 
Then $du \ge 2^n m$. 

6. An n-vertex hypergraph clique has at most $2^{n-1}$ edges. 

7. An r-uniform n-vertex hypergraph clique has at most $\binom{n-1}{r-1}$ edges if $n \ge 2r$. 

8. In any r-uniform hypergraph H, the maximum size r-partite partial hypergraph 
contains at least $r!/r^r$ of the edges of H. 

9. Let H be an n-vertex, m-edge hypergraph clique, such that each pair of distinct 
edges intersect in exactly one vertex. Then $m \le n$. (de Bruijn and Erdős) 

10. Fisher's inequality: Let H be an n-vertex, m-edge hypergraph clique such that 
each pair of edges intersect in $\lambda$ vertices. Then $m \le n$. 

11. Modular intersection theorem: Let L be a set of s integers, and let p be a prime 
number. Let H be an r-uniform, n-vertex, m-edge hypergraph such that $r \notin L \pmod{p}$ and that the 
intersection size for each pair of distinct edges is in $L \pmod{p}$. Then $m \le \binom{n}{s}$. 
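
Fact 1 (Hall's condition is equivalent to the existence of an SDR) can be illustrated on small hypergraphs. The sketch below is not from the Handbook: `satisfies_hall` checks the condition by brute force over all subsets of edges, and `find_sdr` uses the standard augmenting-path method for bipartite matching, which is a swapped-in technique chosen here for brevity. Edges are given as Python sets, and both function names are hypothetical.

```python
from itertools import combinations

def satisfies_hall(edges):
    """Brute-force check: every t edges must cover at least t vertices."""
    for t in range(1, len(edges) + 1):
        for subset in combinations(edges, t):
            if len(set().union(*subset)) < t:
                return False
    return True

def find_sdr(edges):
    """Return distinct representatives [x_1, ..., x_m] with x_i in e_i, or None."""
    owner = {}  # vertex -> index of the edge it currently represents

    def try_assign(i, seen):
        for x in edges[i]:
            if x in seen:
                continue
            seen.add(x)
            # Use x if it is free, or if its current edge can switch elsewhere.
            if x not in owner or try_assign(owner[x], seen):
                owner[x] = i
                return True
        return False

    for i in range(len(edges)):
        if not try_assign(i, set()):
            return None
    representative = {i: x for x, i in owner.items()}
    return [representative[i] for i in range(len(edges))]

H = [{"a", "b"}, {"a", "c", "d"}, {"b", "c"}, {"b", "d"}, {"c"}]
print(satisfies_hall(H), find_sdr(H))          # five edges on four vertices
print(satisfies_hall(H[:4]), find_sdr(H[:4]))  # drop the edge {c}
```

The brute-force check is exponential in the number of edges and is meant only for tiny examples; the matching routine alone decides the question in polynomial time.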

Examples: 

1. The Fano plane (§12.1.1) is the hypergraph with (using mod 7 arithmetic): 

$V = \{1, 2, \ldots, 7\}$ and $E = \{\{1+i, 2+i, 4+i\} \mid 1 \le i \le 7\}$. 

2. A block design is a regular, uniform hypergraph such that each pair of vertices is 
contained in precisely $\lambda$ edges. Block designs often provide extremal examples in various 
extremal problems of hypergraph theory. 

3. A matroid (§12.4.1) can be regarded as a hypergraph such that under every non- 
negative weighting of the vertices, a greedy algorithm could find an edge of maximum 
weight. 


© 2000 by CRC Press LLC 



8.12.3 NUMERICAL INVARIANTS OF HYPERGRAPHS 

Calculating formulas for the values of some standard numerical invariants of hyper- 
graphs tends to be quite difficult, even for complete hypergraphs. Two famous examples 
are Lovasz’s proof of the Kneser conjecture and Baranyai’s proof of the factorization 
theorem. 

Definitions: 

The maxdegree $\Delta(H)$ is the largest degree of any vertex in the hypergraph H. 

The chromatic number $\chi(H)$ is the smallest number of independent sets required to 
partition the vertex set of H. To ensure the existence of such partitions it is assumed 
that H does not contain any edges with just one vertex. 

The independence number $\alpha(H)$ is the maximum number of vertices which form an 
independent set in H. 

The chromatic index q(H) is the smallest number of matchings required to partition 
the edges of H. 

A hypergraph H is normal if $q(H) = \Delta(H)$. 

The transversal number $\tau(H)$ is the minimum cardinality (i.e., number of vertices), 
taken over all transversals of H. 

The matching number $\nu(H)$ is the maximum number of pairwise disjoint edges of H, 
i.e., the cardinality of the largest partial of H which forms a matching. 

The clique partition number cp(H) is the smallest number of cliques required to 
partition the edge set of H. 

The clique number $\omega(H)$ is the largest number of edges of any partial clique in the 
hypergraph H. 

Facts: 

1. Many hypergraph invariants are representable as graph invariants. In particular, 

$\omega(H) = \omega(I(H))$, $\nu(H) = \alpha(I(H))$, $q(H) = \chi(I(H))$, $cp(H) = \chi(\overline{I(H)})$, 

where $\overline{G}$ denotes the edge-complement of a graph G. 

2. Every hypergraph H satisfies the following two min $\ge$ max relations: 

$q(H) \ge \omega(H) \ge \Delta(H)$, $\tau(H) \ge cp(H) \ge \nu(H)$. 

3. A hypergraph H is normal if and only if $\tau(H') = \nu(H')$ for all partials H' of H. 
(Lovász, 1972) 

4. The following relations hold in every n-vertex hypergraph H: 

$\tau(H) = n - \alpha(H)$, $\chi(H) + \alpha(H) \le n + 1$. 

5. The parameters $\chi(H)$ and $\tau(H)$ can be approximated by greedy algorithms. 

6. The Kneser conjecture that $cp(K^r_{2r+k}) = k + 2$ was proved by topological methods. 
(Lovász and Bárány, 1978) 

7. The factorization theorem that $q(K^r_{nr}) = \binom{nr-1}{r-1}$ was proved by using network flows. 
(Baranyai, 1975) 

8. Hypergraphs in the following classes are known to be bicolorable (i.e., $\chi(H) = 2$): 
normal hypergraphs (including unimodular hypergraphs), r-uniform hypergraphs with 
size at most $2^{r-1}$, r-uniform hypergraphs in which each edge intersects at most $2^{r-3}$ 
other edges (proved by probabilistic methods), finite planes of order at least three. 

Examples: 


1. Consider the hypergraph H of §8.12.1 Example 1 with V = {a, b, c, d} and E = 
{ab, bc, bd, acd, c}. The maximum degree $\Delta(H)$ is 3, since vertex c has degree 3. The 
chromatic number $\chi(H)$ is 4, since every pair of vertices lies in some edge, so all four 
vertices must get different colors. The independence number $\alpha(H)$ is 1, since every pair 
of vertices lies in some edge. The chromatic index q(H) is 4, using the matching c, ab. 

The hypergraph H is not normal, since q(H) = 4, but $\Delta(H) = 3$. The transversal 
number $\tau(H)$ is 2, using the transversal b, c. The matching number $\nu(H)$ is 2, using the 
matching ab, c. 


2. The Fano plane (§8.12.2 Example 1) has the following parameters: $\omega = q = 7$, 
$\Delta = \tau = \chi = 3$, $\alpha = 4$, $\nu = cp = 1$. 
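
Several of the Fano plane parameters are small enough to confirm by brute force directly from the definitions. The sketch below is an illustration, not the Handbook's method: it builds the seven lines {1+i, 2+i, 4+i} (mod 7, with residues written 1, ..., 7) and searches all vertex subsets, edge subsets, and colorings. All names are hypothetical.

```python
from itertools import combinations, product

points = range(1, 8)
# Lines {1+i, 2+i, 4+i} with arithmetic mod 7, residues written as 1..7.
lines = [frozenset(((k + i - 1) % 7) + 1 for k in (1, 2, 4)) for i in range(1, 8)]

def is_independent(S):
    """An independent set contains no edge completely."""
    return not any(line <= S for line in lines)

def is_transversal(S):
    """A transversal meets every edge."""
    return all(line & S for line in lines)

subsets = [frozenset(c) for r in range(8) for c in combinations(points, r)]
alpha = max(len(S) for S in subsets if is_independent(S))
tau = min(len(S) for S in subsets if is_transversal(S))
max_degree = max(sum(1 for line in lines if x in line) for x in points)

# Matching number: largest family of pairwise disjoint lines.
nu = max(len(M) for r in range(1, 8) for M in combinations(lines, r)
         if all(not (e & f) for e, f in combinations(M, 2)))

def chromatic_number():
    """Smallest k admitting a k-coloring with no monochromatic line."""
    for k in range(1, 8):
        for colors in product(range(k), repeat=7):
            if all(len({colors[x - 1] for x in line}) > 1 for line in lines):
                return k

print(alpha, tau, max_degree, nu, chromatic_number())
```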
INTRODUCTION 


A tree is a connected graph containing no cycles. Trees have applications in a wide 
variety of disciplines, particularly computer science. For example, they can be used to 
construct searching algorithms for finding a particular item in a list, to store data, to 
model decisions and their outcomes, or to design networks. 


GLOSSARY 

ancestor (of a vertex v in a rooted tree): any vertex on a path to v from the root. 

m-ary tree: a rooted tree in which every internal vertex has at most m children. 

backtrack: a pair of successive edges in a walk where the second edge is the same as 
the first, but traversed in the opposite direction. 

balanced tree: a rooted m-ary tree of height h such that all leaves of the tree have 
height h or h— 1. 

bihomogeneous tree: a tree (usually infinite) in which there are exactly two values 
for the vertex degrees. 

binary search tree: a type of binary tree used to represent a table of data, which is 
efficiently accessed by storage and retrieval algorithms, abbreviated BST. 

binary tree: an ordered rooted tree in which each vertex has at most two children, 
that is, a possible “left child” and a possible “right child”; an only child must be 
designated either as a left child or a right child (this usage is normative for computer 
science); in pure graph theory, an m-ary tree in which m = 2. 

bounded tree: a (possibly infinite) tree of finite diameter. 

breadth-first search: a method for visiting all the vertices of a graph in a sequence, 
based on their proximity to a designated starting vertex. 

caterpillar: a tree that contains a path such that every edge has one or both endpoints 
in that path. 

center (of a tree): the set of vertices of minimum eccentricity. 

child (of a vertex v in a rooted tree): a vertex such that v is its immediate ancestor. 

chord: for a graph G with a spanning tree T, an edge e of G such that $e \notin T$. 

complete binary tree: a binary tree where every parent has two children and all 
leaves are at the same depth. 

decision tree: a rooted tree in which every internal vertex represents a decision and 
each path from the root to a leaf represents a cumulative choice. 

dense graph: a graph in which the number of edges far exceeds the number of vertices. 

depth (of a vertex in a rooted tree): the number of edges in the unique path from the 
root to that vertex. 

depth-first search: a method for visiting every vertex of a graph by progressing as 
far as possible from the most recently visited vertex, before doing any backtracking. 

descendant (of a vertex v in a rooted tree): a vertex that follows v on a path from 
the root. 

diameter (of a tree): the maximum distance between two distinct vertices in the tree. 

distance (between two vertices in a tree): the number of edges in the unique simple 
path between these vertices. 

eccentricity (of a vertex in a connected graph): the length of the longest simple path 
beginning at that vertex. 

finite tree: a tree with a finite number of vertices and edges. 

forest: a graph with no cycles. 

full m-ary tree: a rooted tree in which every internal vertex has exactly m children. 

fundamental cycle of a connected graph G: the unique cycle created by adding 
the edge $e \in E_G$ not in T to a spanning tree T. 

fundamental edge-cut of a connected graph G: the partition-cut $\langle X_1, X_2 \rangle$ where 
$X_1$ and $X_2$ are the vertex-sets of the two components of T - e, where e is an edge 
of a spanning tree T for G. 

fundamental system of cycles of a connected graph G: the set of fundamental 
cycles corresponding to the various edges of G — T, where T is a spanning tree for G. 

fundamental system of edge-cuts of a connected graph G: the set of fundamen- 
tal edge-cuts that result from removal of an edge from a spanning tree T for G. 

geodesic (between two vertices in a tree): the unique simple path between these ver- 
tices. 

heap: a representation of a priority tree as an array. 

height (of a rooted tree): the maximum of the levels of its vertices. 

homogeneous: property of a tree (usually infinite) that every vertex has the same 
degree. 

d-homogeneous: property of a tree (usually infinite) that every vertex has degree d. 

infinite tree: a tree with an infinite number of vertices and edges. 

inorder traversal (of an ordered rooted tree): a recursive listing of all vertices starting 
with the vertices of the first subtree of the root, next the root vertex itself, and then 
the vertices of the other subtrees as they occur from left to right. 

internal vertex (of a rooted tree): a vertex with children. 

isomorphism (of trees): for trees X and Y, a pair of bijections $f_V: V_X \to V_Y$ and 
$f_E: E_X \to E_Y$ such that if u and v are the endpoints of an edge e in the tree X, 
then $f_V(u)$ and $f_V(v)$ are the endpoints of the edge $f_E(e)$ in the tree Y (see §8.1). 

isomorphism (of rooted trees): for rooted trees $(T_1, r_1)$ and $(T_2, r_2)$, a tree isomorphism 
$f: T_1 \to T_2$ that takes $r_1$ to $r_2$. 

labeled tree: a tree with labels such as $v_1, v_2, \ldots, v_n$ assigned to its vertices. 

leaf : in a rooted tree, a vertex that has no children. 

left child (of a node in an ordered, rooted binary tree): the first child of that node. 

left subtree (of an ordered, rooted binary tree): the tree rooted at a left child. 

left-complete binary tree: a binary tree where each level except possibly the deepest 
is filled and the bottom level has no gaps as one traverses left to right. 

left-right tree: a binary tree in which each vertex is a parent to either no children or 
to both a left and a right child. 

level (of a vertex in a rooted tree): the length of the unique path from the root to this 
vertex. 

locally finite tree: a tree in which the degree of every vertex is finite. 

maximal tree (in a graph): a spanning tree. 

mesh (of trees): a graph obtained by construing each row and each column of a $2^d \times 2^d$ 
array of vertices as the leaves of a complete binary tree. 

minimum spanning tree (of a graph whose edges have weights assigned to them): a 
spanning tree with minimum total edge-weight. 

nth level (of a rooted tree): the set of all vertices at depth n. 
order (of a finite tree): the number of vertices in the tree. 

ordered tree: a rooted tree in which the children of each internal vertex are linearly 
ordered. 

parent (of a vertex v, other than the root, in a rooted tree): a vertex that is the 
immediate predecessor of v on the unique path from the root to v. 

partition-cut of a graph: given a partition of the set of vertices of G into $X_1$ and $X_2$, 
the set $\langle X_1, X_2 \rangle$ of edges of G that have one endpoint in $X_1$ and the other in $X_2$. 

postorder traversal: a recursive listing of the vertices in an ordered rooted tree 
starting with the vertices of subtrees as they occur from left to right, followed by 
the root. 

preorder traversal: a recursive listing of the vertices in an ordered rooted tree start- 
ing with the root, then the vertices of the first subtree, followed by the vertices of 
other subtrees as they occur from left to right. 

priority tree: a left-complete binary tree whose vertices have labels (from an ordered 
set) called “priorities”, such that no vertex has higher priority than its parent. 

reduced tree: a tree with no vertices of degree 2. 
reduced walk: a walk in a graph without backtracking. 
regular: synonym for homogeneous. 
d-regular: synonym for d-homogeneous. 

right child (of a node in an ordered rooted binary tree): the second child of that node. 
right subtree (of an ordered, rooted binary tree): the tree rooted at the right child. 
rooted tree: a tree in which one vertex is designated as the “root”. 

semi-homogeneous tree: a bihomogeneous tree (usually infinite) with a partition of 
the vertices into two sets, those of degree m and those of degree n, where each vertex 
of degree m is adjacent to one of degree n. 

siblings (in a rooted tree): vertices with the same parent. 

simplicial notation: notation for a tree or other simple graph in which each edge is 
specified by its endpoints and each path is specified by its vertex sequence. 

spanning tree (of a connected graph): a tree that contains all the vertices of the 
graph. 

subtree: a subgraph of a tree that is also a tree. 
terminal vertex (of a tree): a vertex of degree 1. 
tree: a connected graph with no cycles. 

tree edge: for a graph G with a spanning tree T, an edge e of G such that $e \in T$. 
tree traversal: a walk that visits all the vertices of a tree. 
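
The three traversal orders defined in this glossary (preorder, inorder, postorder) can be sketched as short recursive functions. This is an illustrative sketch only: the ordered tree is stored as a hypothetical dictionary mapping each internal vertex to the list of its children in left-to-right order, and the sample tree is made up for the demonstration.

```python
def preorder(tree, v):
    """Root first, then the subtrees from left to right."""
    out = [v]
    for child in tree.get(v, []):
        out += preorder(tree, child)
    return out

def postorder(tree, v):
    """Subtrees from left to right, then the root."""
    out = []
    for child in tree.get(v, []):
        out += postorder(tree, child)
    return out + [v]

def inorder(tree, v):
    """First subtree, then the root, then the remaining subtrees left to right."""
    kids = tree.get(v, [])
    if not kids:
        return [v]
    out = inorder(tree, kids[0]) + [v]
    for child in kids[1:]:
        out += inorder(tree, child)
    return out

# Root r has children a, b, c; a has children d, e; c has the single child f.
tree = {"r": ["a", "b", "c"], "a": ["d", "e"], "c": ["f"]}

print(preorder(tree, "r"))   # ['r', 'a', 'd', 'e', 'b', 'c', 'f']
print(inorder(tree, "r"))    # ['d', 'a', 'e', 'r', 'b', 'f', 'c']
print(postorder(tree, "r"))  # ['d', 'e', 'a', 'b', 'f', 'c', 'r']
```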
9.1 CHARACTERIZATIONS AND TYPES OF TREES 


9.1.1 PROPERTIES OF TREES 

For trees, as with other graphs, there is a wide variety of terminology in use from one 
application or specialty to another. 

Definitions: 

A graph is acyclic if it contains no subgraph isomorphic to a cycle $C_n$ (§8.1.3). 

A forest is an acyclic graph. 

A tree is an acyclic connected graph. (Note: Unless stated otherwise, all trees are 
assumed to be finite, i.e., to have a finite number of vertices.) 

The eccentricity of a vertex is the length of the longest simple path beginning at that 
vertex. 

A center of a tree T is a vertex v with minimum eccentricity. 

An end vertex of a tree is a vertex of degree 1. 

A caterpillar is a tree that contains a path such that every edge has one or both 
endpoints in that path. 


Facts: 

1. A (finite) tree with at least two vertices has at least two end vertices. 

2. A connected graph with n vertices is a tree if and only if it has exactly n - 1 edges. 

3. A graph is a tree if and only if there is a unique simple path between any two 
vertices. 

4. A graph is a forest if and only if every edge is a cut-edge (§8.3.3). 

5. Trees are bipartite. Hence, every tree can be colored using two colors. 

6. The center of a tree consists of either only one vertex or two adjacent vertices. 
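
Facts 2 and 6 translate into short procedures. The sketch below is illustrative and makes several assumptions: vertices are numbered 0, ..., n-1, edges are given as pairs, and the function names are hypothetical. `is_tree` applies Fact 2 (connected with exactly n - 1 edges), and `tree_center` uses the well-known leaf-stripping method (repeatedly delete all end vertices), which is not described in the text above but yields the one or two central vertices of Fact 6.

```python
from collections import deque

def is_tree(n, edges):
    """Fact 2: a connected graph on n vertices is a tree iff it has n - 1 edges."""
    if n == 0 or len(edges) != n - 1:
        return False
    adj = {v: [] for v in range(n)}
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    queue = deque([0])
    while queue:  # breadth-first search from vertex 0 to test connectivity
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == n

def tree_center(n, edges):
    """Strip end vertices layer by layer; the last 1 or 2 vertices are the center."""
    adj = {v: set() for v in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    remaining = set(range(n))
    leaves = [v for v in remaining if len(adj[v]) <= 1]
    while len(remaining) > 2:
        new_leaves = []
        for leaf in leaves:
            remaining.discard(leaf)
            for w in adj[leaf]:
                adj[w].discard(leaf)
                if len(adj[w]) == 1:
                    new_leaves.append(w)
        leaves = new_leaves
    return sorted(remaining)

path4 = [(0, 1), (1, 2), (2, 3)]
path5 = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(is_tree(4, path4), tree_center(4, path4))  # center is two adjacent vertices
print(is_tree(5, path5), tree_center(5, path5))  # center is a single vertex
```

`tree_center` assumes its input is a tree; calling `is_tree` first is a reasonable guard.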


Examples: 

1. A tree: 


2. A forest: 

3. A tree with two adjacent vertices a and b in its center: 



4. A caterpillar: 



5. Neither of the graphs shown is a tree. One contains a 3-cycle, and the other contains 
a 1-cycle (i.e., a self-loop). 



9.1.2 ROOTS AND ORDERINGS 

Adding some extra structure to trees adapts them to applications in many disciplines, 
especially computer science. 

Definitions: 

A rooted tree (T, r) is a tree T with a distinguished vertex r (the root), in which all 
edges are implicitly directed away from the root. 

Two rooted trees $(T_1, r_1)$ and $(T_2, r_2)$ are isomorphic as rooted trees if there is an 
isomorphism $f: T_1 \to T_2$ (§8.1.2) that takes $r_1$ to $r_2$. 

A child of a vertex v in a rooted tree is a vertex that is the immediate successor of v 
on a path from the root. 

A descendant of a vertex v in a rooted tree is v itself or any vertex that is a successor 
of v on a path from the root. 

A proper descendant of a vertex v in a rooted tree is any descendant except v itself. 

The parent of a vertex v in a rooted tree is a vertex that is the immediate predecessor 
of v on a path to v from the root. 

The parent function of a rooted tree T maps the root of T to the empty set and maps 
every other vertex to its parent. 

An ancestor of a vertex v in a rooted tree is v itself or any vertex that is the predecessor 
of v on a path to v from the root. 

A proper ancestor of a vertex v in a rooted tree is any ancestor except v itself. 

Siblings in a rooted tree are vertices with the same parent. 

An internal vertex in a rooted tree is a vertex with children. 

A leaf in a rooted tree is a vertex that has no children. 

The depth of a vertex in a rooted tree is the number of edges in the unique path from 
the root to that vertex. 

The nth level in a rooted tree is the set of all vertices at depth n. 

The height of a rooted tree is the maximum depth of any vertex. 

An ordered tree is a rooted tree in which the children of each internal vertex are 
linearly ordered. 

A left sibling of a vertex v in an ordered tree is a sibling that precedes v in the ordering 
of v and its siblings. 

A right sibling of a vertex v in an ordered tree is a sibling that follows v in the ordering 
of v and its siblings. 

A plane tree is a drawing of an ordered tree such that the left-to-right order of the 
children of each node in the drawing is consistent with the linear ordering of the corre- 
sponding vertices in the tree. 

In the level ordering of the vertices of an ordered tree, u precedes v under any of 
these circumstances: 

• if the depth of u is less than the depth of v; 

• if u is a left sibling of v; 

• if the parent of u precedes the parent of v. 

Two ordered trees $(T_1, r_1)$ and $(T_2, r_2)$ are isomorphic as ordered trees if there is a 
rooted tree isomorphism $f: T_1 \to T_2$ that preserves the ordering at every vertex. 

An m-ary tree is a rooted tree such that every internal vertex has at most m children. 

A full m-ary tree is a rooted tree such that every internal vertex has exactly m 
children. 

A (pure) binary tree is a rooted tree such that every internal vertex has at most two 
children. This meaning of “binary tree” occurs commonly in pure graph theory. 

A binary tree is a 2-ary tree such that every child, even an only child, is distinguished 
as left child or right child. This meaning of “binary tree” occurs commonly in 
computer science and in permutation groups. 

The principal subtree at a vertex v of a rooted tree comprises all descendants of v 
and all edges incident to these descendants. It has v designated as its root. 

The left subtree of a vertex v in a binary tree is the principal subtree at the left child. 
The right subtree of v is the principal subtree at the right child. 

A balanced tree of height h is a rooted m-ary tree in which all leaves are of height h 
or h - 1. 

A complete binary tree is a binary tree in which every parent has two children and 
all leaves are at the same depth. 

A complete m-ary tree is an m-ary tree in which every parent has m children and 
all leaves are at the same depth. 

Algorithm 1: Find a Huffman code. 

input: the probabilities $Pr(x_1), \ldots, Pr(x_n)$ on a set X 
output: a Huffman code for (X, Pr) 

initialize F to be a forest of isolated vertices, labeled $x_1, \ldots, x_n$, each to be 
    regarded as a rooted tree 
assign weight $Pr(x_j)$ to the rooted tree $x_j$, for j = 1, ..., n 
repeat until forest F is a single tree 
    choose two rooted trees, T and T', of smallest weights in forest F 
    replace trees T and T' in forest F by a tree with a new root whose left subtree 
        is T and whose right subtree is T' 
    label the new edge to T with a 0 and the new edge to T' with a 1 
    assign weight w(T) + w(T') to the new tree 
return tree F 

{The Huffman code word for $x_i$ is the concatenation of the labels on the unique 
path from the root to $x_i$.} 


A decision tree is a rooted tree in which every internal vertex represents a decision and 
each path from the root to a leaf represents a cumulative choice. 

A prefix code for a finite set $X = \{x_1, \ldots, x_n\}$ is a set $\{c_1, \ldots, c_n\}$ of binary strings 
in X (called codewords) such that no codeword is a prefix of any other codeword. 

A Huffman code for a set X with a probability measure Pr (see §7.1) is a prefix 
code $\{c_1, \ldots, c_n\}$ such that $\sum_{j=1}^{n} len(c_j) Pr(x_j)$ is minimum among all prefix codes, where 
$len(c_j)$ measures the length of $c_j$ in bits. 

A Huffman tree for a set X with a probability measure Pr is a tree constructed by 
Huffman’s algorithm to produce a Huffman code for (X, Pr). 
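
Algorithm 1 can be sketched in Python with a priority queue. This is a minimal illustration under stated assumptions rather than a definitive implementation: the function name and the dictionary-based tree representation are hypothetical, each tree in the forest is stored simply as a map from its leaves to their partial codewords, and ties between equal weights are broken by insertion order. The sample probabilities are those of a six-symbol set {u, v, w, x, y, z} with weights 0.08, 0.10, 0.12, 0.15, 0.20, 0.35.

```python
import heapq
from itertools import count

def huffman_code(probabilities):
    """probabilities: dict symbol -> probability. Returns dict symbol -> codeword."""
    tiebreak = count()  # unique counter so the heap never compares the dicts
    heap = [(p, next(tiebreak), {x: ""}) for x, p in probabilities.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w0, _, left = heapq.heappop(heap)   # tree T of smallest weight
        w1, _, right = heapq.heappop(heap)  # tree T' of next smallest weight
        # New root: edge to T is labeled 0, edge to T' is labeled 1.
        merged = {x: "0" + c for x, c in left.items()}
        merged.update({x: "1" + c for x, c in right.items()})
        heapq.heappush(heap, (w0 + w1, next(tiebreak), merged))
    return heap[0][2]

probs = {"u": 0.08, "v": 0.10, "w": 0.12, "x": 0.15, "y": 0.20, "z": 0.35}
code = huffman_code(probs)
average_length = sum(len(code[s]) * p for s, p in probs.items())
print(code)
print(average_length)
```

Different tie-breaking rules can produce different codewords, but every Huffman code for the same probabilities has the same minimum expected length.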


Facts: 


1. Plane trees are usually drawn so that vertices of the same level in the corresponding 
ordered tree are represented at the same vertical position in the plane. 


2. A rooted tree can be represented by its vertex set plus its parent function. 

3. The concept of finite binary tree also has the following recursive definition: (basis 
clause) an ordered tree with only one vertex is a binary tree; (recursion clause) an 
ordered tree with more than one vertex is a binary tree if the root has two children and 
if both its principal subtrees are binary trees. 


4. A full m-ary tree with k internal vertices has mk + 1 vertices and (m - 1)k + 1 leaves. 

5. A full m-ary tree with k vertices has $\frac{k-1}{m}$ internal vertices and $\frac{(m-1)k+1}{m}$ leaves. 

6. There are at most $m^h$ leaves in any m-ary tree of height h. 


7. A binary search tree is a special kind of binary tree used to implement a random
access table with O(log n) maintenance and retrieval algorithms, provided the tree is
kept balanced. (See Chapter 17.)

8. A balanced binary tree can be used to implement a priority queue with O(log n) enqueue
and dequeue algorithms. (See §17.2.4.)

9. Algorithm 1, due to D. Huffman in 1951, constructs a Huffman tree.

Examples:

1. A rooted tree (T, r): [figure omitted]

2. A rooted tree and its parent function: [figure omitted]

    vertex   a  b  c  d  e  f  g
    parent   d  d  d  ∅  c  b  c


3. A 2-ary tree of height 4: 



4. A balanced binary tree: 




5. The following tree is rooted at vertex r. Vertices d and e are children of vertex b.
Vertex f is a descendant of f, d, b, and r, but f is not a descendant of vertex a. Vertex a
is the parent of c, which is the only proper descendant of vertex a. Vertices d and e are
siblings, but c is not a sibling of d or of e.

6. The leaves of the following rooted tree are the vertices c, d, f, g, and h. The internal
vertices are a, b, e, and s.



7. The following two rooted trees are isomorphic as graphs, but they are considered 
to be different as rooted trees, because there is no graph isomorphism from one to the 
other that maps root to root. 


8. The following two plane trees are isomorphic as rooted trees, but they are not 
isomorphic as ordered rooted trees, because there is no rooted tree isomorphism from 
one to the other that preserves the child ordering at every vertex. 


9. A complete binary tree of height 2. 



10. A complete 3-ary tree of height 2. 



11. The iterative construction of a Huffman tree for the set X = {a, b, c, d, e, f} with
respective probabilities {0.08, 0.10, 0.12, 0.15, 0.20, 0.35} would proceed as follows
(each tree is written as a nested grouping of its leaves, followed by its weight):

forest 0:   a (.08)   b (.10)   c (.12)   d (.15)   e (.20)   f (.35)
forest 1:   (a b) (.18)   c (.12)   d (.15)   e (.20)   f (.35)
forest 2:   (a b) (.18)   (c d) (.27)   e (.20)   f (.35)
forest 3:   ((a b) e) (.38)   (c d) (.27)   f (.35)
forest 4:   ((a b) e) (.38)   ((c d) f) (.62)
forest 5:   (((a b) e) ((c d) f)) (1.00)

The codes are 000 for a, 001 for b, 100 for c, 101 for d, 01 for e, and 11 for f. Thus, the
most frequently used objects in the set are represented by the shortest binary codes.
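
The construction in Example 11 can be sketched in Python as follows. This is a minimal illustration rather than a reference implementation: the function name `huffman_code`, the use of a binary heap, and the insertion counter used to break ties between equal weights are all assumptions made here, not part of Algorithm 1.

```python
import heapq
from itertools import count

def huffman_code(prob):
    """Return a dict mapping each symbol to its Huffman codeword.

    prob maps symbols (strings) to probabilities. Ties between equal
    weights are broken by insertion order (an assumption of this sketch).
    """
    tiebreak = count()
    # Heap entries are (weight, tiebreaker, tree); a tree is either a
    # symbol or a pair (left subtree, right subtree).
    heap = [(w, next(tiebreak), sym) for sym, w in prob.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # smallest weight: left subtree, edge label 0
        w2, _, t2 = heapq.heappop(heap)   # next smallest: right subtree, edge label 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), (t1, t2)))
    _, _, root = heap[0]

    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"   # a one-symbol alphabet still needs one bit
    walk(root, "")
    return codes

codes = huffman_code({"a": 0.08, "b": 0.10, "c": 0.12,
                      "d": 0.15, "e": 0.20, "f": 0.35})
```

With the probabilities of Example 11 there are no ties, so the tie-breaking rule does not affect the result.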


9.1.3 TREE TRAVERSAL 

Ordered rooted trees can be used to store data or arithmetic expressions involving 
numbers, variables and operations. A tree traversal algorithm gives a systematic method 
for accessing the information stored in the tree. 

Definitions: 

A boundary walk of a plane tree is a walk around the boundary of the single region 
of the given plane imbedding of the tree, starting at the root. 

A backtrack along a walk in a graph is an instance . . . , u, e, v, e, u, . . . of two consecutive 
edge-steps in which an edge-step traverses the same edge as its predecessor, but in the 
opposite direction. 

A reduced walk is a walk without backtracking. 

A preorder traversal of an ordered rooted tree T lists the vertices of T (or their 
labels) so that each vertex v is followed by all the vertices, in preorder, in its principal 
subtrees, respecting their left-to-right order. 

A postorder traversal of an ordered rooted tree T lists the vertices of T (or their 
labels) so that each vertex v is preceded by all the vertices, in postorder, in its principal 
subtrees, respecting their left-to-right order. 

An inorder traversal of an ordered rooted tree T lists the vertices of T (or their 
labels) so that each vertex v is preceded by all the vertices, in inorder, in its first 
principal subtree and so that v is followed by the vertices, in inorder, of its other 
principal subtrees, respecting their left-to-right order.

Algorithm 2: Parent-finder for the postorder of a plane tree.

input: the postorder v_p(1), . . . , v_p(n) of a plane tree with sorted vertex labels and
a vertex v_j

output: the parent of v_j

scan the postorder until v_j is encountered
continue scanning until some vertex v_i is encountered such that i < j
return (v_i)
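
A short Python sketch of Algorithm 2 follows. It assumes that the vertex labels are integers sorted in ascending level order and that the postorder is given as a list; the function name `postorder_parent` and the convention of returning `None` for the root are choices made for this illustration.

```python
def postorder_parent(postorder, j):
    """Return the parent of the vertex labeled j, or None if j is the root.

    postorder: list of integer labels in postorder; the labels are assumed
    to be sorted, i.e., ascending in the level order of the tree.
    """
    start = postorder.index(j)            # scan until v_j is encountered
    for label in postorder[start + 1:]:   # continue scanning
        if label < j:                     # first later vertex with a smaller index
            return label
    return None

# Tree with root 1, children 2 and 3; vertex 2 has children 4 and 5; vertex 3 has child 6.
example_postorder = [4, 5, 2, 6, 3, 1]
parents = {v: postorder_parent(example_postorder, v) for v in example_postorder}
```

The scan works because every vertex listed between v_j and its parent lies in the subtree of a later sibling of v_j, so its label is larger than j.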


Prefix (or Polish) notation is the form of an arithmetic expression obtained from a 
preorder traversal of a binary tree representing this expression. 

Postfix (or reverse Polish ) notation is the form of an arithmetic expression obtained 
from a postorder traversal of a binary tree representing this expression. 

Infix notation is the form of an arithmetic expression obtained from an inorder traver- 
sal of a binary tree representing this expression. A left parenthesis is written immedi- 
ately before writing the left principal subtree of each vertex, and a right parenthesis is 
written immediately after writing the right principal subtree. 

The universal address system of an ordered rooted tree is a labeling in which the
root is labeled 0 and in which for each vertex with label x, its m children are labeled
x.1, x.2, . . . , x.m, from left to right.

In the level order of the vertices of an ordered tree T, vertex u precedes vertex v if u
is nearer the root, or if u and v are at the same level and u and v have ancestors u'
and v' that are siblings and u' precedes v' in the ordering of T.

A bijective assignment of labels from an ordered set (such as alphabetic strings or the 
integers) to the vertices of an ordered tree is sorted if the level order of these labels is 
either ascending or descending. 

Facts: 

1. The preorder traversal of a plane tree is obtained by a counterclockwise traversal of 
the boundary walk of the plane region, that is, starting downward toward the left. As 
each vertex of the tree is encountered for the first time along this walk, it is recorded 
in the preorder. 

2. The postorder traversal of a plane tree is obtained by a counterclockwise traversal 
of the boundary walk of the plane region, that is, starting downward toward the left. 
As each vertex of the tree is encountered for the last time along this walk, it is recorded 
in the postorder. 

3. The inorder traversal of a plane tree is obtained by a counterclockwise traversal of 
the boundary walk of the plane region, that is, starting downward toward the left. As 
each interior vertex of the tree is encountered for the second time along this walk, it is 
recorded in the inorder. An end vertex is recorded whenever it is encountered for the 
only time. 

4. Two nonisomorphic ordered trees with sorted vertex labels can have the same
preorder but not the same postorder.


© 2000 by CRC Press LLC 




Examples: 

1. A plane tree with preorder abefhcdgijk, postorder ehfbcijkgda, and
inorder ebhfacigjkd.



2. A binary tree representing the arithmetic expression (x + y) / (x − 2), with infix form
x + y / x − 2, prefix form / + x y − x 2, and postfix form x y + x 2 − /.
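
The three traversals can be sketched in Python as below. The dictionary representation (each internal vertex mapped to its ordered list of children) is an assumption of this sketch, and the tree `plane_tree` is one reconstructed here to be consistent with the three orders listed in Example 1; it is not taken from the original figure.

```python
def preorder(tree, v):
    """List v, then the preorder of each principal subtree, left to right."""
    out = [v]
    for child in tree.get(v, []):
        out += preorder(tree, child)
    return out

def postorder(tree, v):
    """List the postorder of each principal subtree, left to right, then v."""
    out = []
    for child in tree.get(v, []):
        out += postorder(tree, child)
    return out + [v]

def inorder(tree, v):
    """List the first principal subtree, then v, then the remaining subtrees."""
    children = tree.get(v, [])
    if not children:
        return [v]
    out = inorder(tree, children[0]) + [v]
    for child in children[1:]:
        out += inorder(tree, child)
    return out

# A tree consistent with Example 1 (reconstructed for illustration).
plane_tree = {"a": ["b", "c", "d"], "b": ["e", "f"], "f": ["h"],
              "d": ["g"], "g": ["i", "j", "k"]}

# The expression tree of Example 2; leaves are not keys, so repeated leaf labels are harmless.
expr_tree = {"/": ["+", "-"], "+": ["x", "y"], "-": ["x", "2"]}
```

Applied to `expr_tree`, the same three functions produce the prefix, postfix, and (unparenthesized) infix forms of Example 2.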





9.1.4 INFINITE TREES 
Definitions: 

An infinite tree is a tree with an infinite number of vertices or edges. 

The diameter of a tree is the maximum distance between two distinct vertices in the 
tree. 

A bounded tree is a tree of finite diameter. 

A locally finite tree is a tree in which the degree of every vertex is finite. 

A homogeneous tree is a tree in which every vertex has the same degree. 

An n-homogeneous tree is a tree in which every vertex has degree n. 

A bihomogeneous tree is a nonhomogeneous tree with a partition of the vertices into 
two subsets, such that all vertices in the same subset have the same degree. 

A semi-homogeneous tree is a bihomogeneous tree such that each vertex of one of 
the two realized degrees is adjacent to a vertex of the other realized degree. 

Examples: 

1. Suppose that two finite bitstrings are considered adjacent if one bitstring can be
obtained from the other by appending a 0 or a 1 at the right. The resulting graph is
the infinite bihomogeneous tree, in which the empty string λ has degree 2 and all other
finite bitstrings have degree 3.

2. Consider the set of all finite strings on the alphabet $\{a, a^{-1}, b, b^{-1}\}$ containing no
instances of the substrings $aa^{-1}$, $a^{-1}a$, $bb^{-1}$, or $b^{-1}b$. Suppose that two such strings are
considered to be adjacent if and only if one of them can be obtained from the other
by appending one of the alphabet symbols at the right. Then the resulting graph is a
4-homogeneous tree.

3. Consider as vertices the set of infinite bitstrings with at most two 1s. Suppose two
such bitstrings are regarded as adjacent if they differ in only one bit, and that bit is a
rightmost 1 for one of the two bitstrings. This graph is a bounded tree of diameter four.


9.2 SPANNING TREES 

A spanning tree of a graph G is a subgraph of G that is a tree and contains every 
vertex of G. Spanning trees are very useful in searching the vertices of a graph and in 
communicating from any given node to the other nodes. Minimum spanning trees are 
covered in §10.1. 


9.2.1 DEPTH-FIRST AND BREADTH-FIRST SPANNING TREES 
Definitions: 

A spanning tree of a graph G is a tree that is a subgraph of G and that contains every 
vertex of G. 

A tree edge of a graph G with a spanning tree T is an edge e such that $e \in T$.

A chord of a graph G with a spanning tree T is an edge e such that $e \notin T$.

A back edge of a digraph G with a spanning tree T is a chord e that joins one of its
endpoints to an ancestor in T.

A forward edge of a digraph G with a spanning tree T is a chord e that joins one of
its endpoints to a descendant in T.

A cross edge of a digraph G with a spanning tree T is a chord e that is neither a back 
edge nor a forward edge. 

The fundamental cycle of a chord e with respect to a given spanning tree T of a 
graph G consists of the edge e and the unique path in T joining the endpoints of e. 

A depth-first search ( DFS ) of a graph G is a way to traverse every vertex of a 
connected graph by constructing a spanning tree, rooted at a given vertex r. Each 
stage of the DFS traversal seeks to move to an unvisited neighbor of the most recently 
visited vertex, and backtracks only if there is none available. See Algorithm 1. 

A depth-first-search tree is the spanning tree constructed during a depth-first search. 

Backtracking during a depth-first search means retreating from a vertex with no 
unvisited neighbors back to its parent in the dfs-tree. 

A breadth-first search ( BFS ) of a graph G is a way to traverse every vertex of a 
connected graph by constructing a spanning tree, rooted at a given vertex r. After 
the BFS traversal visits a vertex v, all of the previously unvisited neighbors of v are 
enqueued, and then the traversal removes from the queue whatever vertex is at the front 
of the queue, and visits that vertex. See Algorithm 2. 





Algorithm 1: Depth-first search spanning tree.

input: a connected locally ordered n-vertex graph G and a starting vertex r
output: the edgeset E_T of a spanning tree and an array X[1..n] listing V_G in
DFS-order

initialize all vertices as unvisited and all edges as unused
E_T := ∅; loc := 1
dfs(r)

procedure dfs(u)
    mark u as visited
    X[loc] := u
    loc := loc + 1
    while vertex u has any unused edges
        e := next unused edge at u
        mark e as used
        w := the other endpoint of edge e
        if w is unvisited then
            add e to E_T
            dfs(w)
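
A minimal Python sketch of Algorithm 1 is given below. It assumes a simple graph stored as a dictionary of adjacency lists, with each list in the local order of the vertex; the function name `dfs_spanning_tree` and this representation are choices made for the illustration, and the recursion mirrors the procedure dfs(u) rather than tracking used edges explicitly.

```python
def dfs_spanning_tree(adj, root):
    """Return (tree_edges, order) for a depth-first search started at root.

    adj maps each vertex to the list of its neighbors in local order
    (a simple, connected graph is assumed).
    """
    visited = set()
    order = []          # plays the role of the array X[1..n]
    tree_edges = []     # plays the role of the edgeset E_T

    def dfs(u):
        visited.add(u)
        order.append(u)
        for w in adj[u]:             # scan the edges at u in local order
            if w not in visited:
                tree_edges.append((u, w))
                dfs(w)

    dfs(root)
    return tree_edges, order

example_graph = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
dfs_edges, dfs_order = dfs_spanning_tree(example_graph, 1)
```

For large graphs an explicit stack would avoid Python's recursion limit; the recursive form is kept here because it matches the pseudocode line by line.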


A breadth-first-search tree is the spanning tree constructed during a breadth-first 
search. 

The fundamental cycle of a connected graph G associated with a spanning tree T
and an edge $e \in E_G$ not in T is the unique cycle created by adding the edge e to the
tree T.

The fundamental system of cycles of a connected graph G associated with a spanning
tree T is the set of fundamental cycles corresponding to the various edges of G − T.

Given two vertex sets $X_1$ and $X_2$ that partition the vertex set of a graph G, the
partition-cut $\langle X_1, X_2 \rangle$ is the set of edges of G that have one endpoint in $X_1$ and
the other in $X_2$.

The fundamental edge-cut of a connected graph G associated with removal of an
edge e from a spanning tree T is the partition-cut $\langle X_1, X_2 \rangle$ where $X_1$ and $X_2$ are the
vertex-sets of the two components of T − e.

The fundamental system of edge-cuts of a connected graph G associated with a 
spanning tree T is the set of fundamental edge-cuts that result from removal of an edge 
from the tree T. 

Facts: 

1. Every connected graph has at least one spanning tree. 

2. A connected graph G has k edge-disjoint spanning trees if and only if for every
partition of $V_G$ into m nonempty subsets, there are at least $k(m-1)$ edges connecting
vertices in different subsets.

3. Let T and T' be spanning trees of a graph G and $e \in T - T'$. Then there exists an
edge $e' \in T' - T$ such that both $(T - e) \cup \{e'\}$ and $(T' - e') \cup \{e\}$ are spanning trees of G.

Algorithm 2: Breadth-first search spanning tree.

input: a connected locally ordered n-vertex graph G and a starting vertex r
output: the edgeset E_T of a spanning tree and an array X[1..n] listing V_G in
BFS-order

initialize all vertices as unvisited and all edges as unused
E_T := ∅; loc := 1; Q := r {Q is a queue}
while Q ≠ ∅
    x := front(Q)
    remove x from Q
    bfs(x)

procedure bfs(u)
    mark u as visited
    X[loc] := u
    loc := loc + 1
    while vertex u has any unused edges
        e := next unused edge at u
        mark e as used
        w := the other endpoint of edge e
        if w is unvisited and w is not in Q then
            add e to E_T
            add w to the end of Q
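
The breadth-first construction can be sketched in Python in the same style. As before, the adjacency-list dictionary and the name `bfs_spanning_tree` are assumptions of this sketch. A vertex is recorded as discovered at the moment it is enqueued, so that it cannot enter the queue (or gain a tree edge) twice.

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    """Return (tree_edges, order) for a breadth-first search started at root.

    adj maps each vertex to the list of its neighbors in local order
    (a simple, connected graph is assumed).
    """
    discovered = {root}      # vertices that are visited or waiting in the queue
    order = []               # plays the role of the array X[1..n]
    tree_edges = []          # plays the role of the edgeset E_T
    queue = deque([root])
    while queue:
        u = queue.popleft()
        order.append(u)
        for w in adj[u]:
            if w not in discovered:
                discovered.add(w)
                tree_edges.append((u, w))
                queue.append(w)
    return tree_edges, order

example_graph = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2], 4: [2]}
bfs_edges, bfs_order = bfs_spanning_tree(example_graph, 1)
```

On this small graph the BFS-tree uses edge (1, 3) where a depth-first search from the same root would use (2, 3), which illustrates that root-to-vertex paths in a BFS-tree are shortest paths.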


4. In the column vector space of the incidence matrix of G over GF(2), every edge set
can be represented as a sum of column vectors. Let T be a spanning tree of G. Then
each cycle C can be written in a unique way as a linear combination of the fundamental
cycles of whatever chords of T occur in C.

5. Depth-first search on an n-vertex, m-edge graph runs in O(m) time.

6. DFS-trees are used to find the components, cutpoints, blocks, and cut-edges of a
graph.

7. The unique path in the BFS-tree T of a graph G from its root r to a vertex v is a
shortest path in G from r to v.

8. Breadth-first search on an n-vertex, m-edge graph runs in O(m) time.

9. A BFS-tree in a simple graph has no back edges.

10. Dijkstra’s algorithm (§10.3.2) constructs a spanning tree T in an edge-weighted
graph such that for each vertex v, the unique path in T from a specified root r to v
is a minimum-cost path in the graph from r to v. When all edges have unit weights,
Dijkstra’s algorithm produces the BFS tree.

11. The level order of the vertices of an ordered tree is the order in which they would
be traversed in a breadth-first search of the tree.

12. The fundamental cycle of an edge e with respect to a spanning tree T such that
$e \notin T$ consists of edge e and those edges of T whose fundamental edge-cuts contain e.

13. The fundamental edge-cut with respect to removal of edge e from a spanning tree T
consists of edge e and those edges of $E_G - E_T$ whose fundamental cycles contain e.

Examples:

1. Consider the following graph and spanning tree and a digraph on the same vertex
and edge set. The tree edges are a, b, c, e, f, h, k, l, and the chords are d, g, i, j. Chord d
is a forward edge, chord i is a back edge, and chords g and j are cross edges.



2. In the graph of Example 1, the fundamental cycles of the chords d, g, i, and j
are {d, b, e}, {g, f, c, a, b, e}, {i, h, ?}, and {j, f, h, ^}, respectively. The non-fundamental
cycle {a, d, g, c, f} is the sum (mod 2) of the fundamental cycles of chords d and g.

3. A spanning tree and its fundamental system of cycles. 



4. A spanning tree and its fundamental system of edge-cuts. 



5. Suppose for the graph of Example 1, that the local order of adjacencies at each
vertex is the alphabetic order of the edge labels. Then the construction of the DFS-tree
is as follows:

6. Suppose for the graph of Example 1, that the local order of adjacencies at each
vertex is the alphabetic order of the edge labels. Then the construction of the BFS-tree
is as follows:



9.2.2 ENUMERATION OF SPANNING TREES 
Definitions: 

The number of spanning trees $\tau(G)$ of a graph G counts two spanning trees $T_1$
and $T_2$ as different if their edgesets are different, even if there is an automorphism of G
mapping $T_1$ onto $T_2$.

The degree matrix D(G) of an n-vertex graph G with vertex degree sequence
$d_1, \ldots, d_n$ is the $n \times n$ diagonal matrix in which the elements of the main diagonal
are the degrees $d_1, \ldots, d_n$ (and the off-diagonal elements are 0s).

Facts: 

1. Cayley’s formula: $\tau(K_n) = n^{n-2}$, where $K_n$ is the complete graph.

2. $\tau(K_{m,n}) = m^{n-1} n^{m-1}$, where $K_{m,n}$ is the complete bipartite graph.

3. $\tau(I_s + K_{n-s}) = n^{n-2}\left(1 - \frac{s}{n}\right)^{s-1}$, where $I_s$ is the edgeless graph on s vertices
and “+” denotes the join (§8.1.2).

4. $\tau(W_n) = \left(\frac{3+\sqrt{5}}{2}\right)^{n} + \left(\frac{3-\sqrt{5}}{2}\right)^{n} - 2$, where $W_n$ denotes the wheel with n rim vertices.

5. Matrix-tree theorem: For each s and t, $\tau(G)$ equals $(-1)^{s+t}$ times the determinant
of the matrix obtained by deleting row s and column t from D(G) − A(G), where A(G)
is the adjacency matrix for G.

6. For each edge e of a graph G, $\tau(G) = \tau(G - e) + \tau(G/e)$, where “−e” denotes edge
deletion and “/e” denotes edge contraction.

7. The number of spanning trees of $K_n$ with degrees $d_1, \ldots, d_n$ is the multinomial
coefficient $\binom{n-2}{d_1-1,\ \ldots,\ d_n-1}$ (§2.3.2).
In this formula, the vertices are distinguishable (labeled) and are given their degrees in
advance, and the only question is how to realize them with edges.
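
Facts 5 and 6 can both be turned into small programs. The following Python sketch is an illustration under stated assumptions: vertices are numbered 0, ..., n−1, graphs are given as edge lists, and the helper names (`laplacian`, `det`, `tau_matrix_tree`, `tau_recursive`) are invented here. Exact rational arithmetic is used so that the determinant is not affected by rounding.

```python
from fractions import Fraction

def laplacian(n, edges):
    """Return D(G) - A(G) for a graph on vertices 0..n-1 given by its edge list."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L

def det(M):
    """Determinant by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    n, sign, result = len(M), 1, Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign
        result *= M[c][c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= factor * M[c][k]
    return sign * result

def tau_matrix_tree(n, edges, s=0, t=0):
    """Matrix-tree theorem: delete row s and column t (0-based) from D(G) - A(G)."""
    L = laplacian(n, edges)
    minor = [[L[i][j] for j in range(n) if j != t] for i in range(n) if i != s]
    return int((-1) ** (s + t) * det(minor))

def tau_recursive(vertices, edges):
    """Deletion-contraction; vertices is a frozenset, edges may contain repeats."""
    edges = [(u, v) for u, v in edges if u != v]   # loops lie in no spanning tree
    if len(vertices) == 1:
        return 1
    if not edges:
        return 0
    (u, v), rest = edges[0], edges[1:]
    contracted = [(u if a == v else a, u if b == v else b) for a, b in rest]
    return tau_recursive(vertices, rest) + tau_recursive(vertices - {v}, contracted)

k4 = [(i, j) for i in range(4) for j in range(i + 1, 4)]
k4_minus_edge = [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]   # degrees 3, 2, 3, 2
k23 = [(x, y) for x in (0, 1) for y in (2, 3, 4)]
```

The deletion-contraction routine takes exponential time and is meant only to mirror Fact 6 on small graphs; the determinant computation is the practical method.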
Examples:

1. $\tau(K_3) = 3^{3-2} = 3$. Each of the three spanning trees is a path on two edges, as
illustrated below. Also, $\tau(K_4) = 4^{4-2} = 16$.

[figure omitted]

2. $\tau(K_{2,n}) = n2^{n-1}$. To confirm this, let $X = \{x_1, x_2\}$ and $|Y| = n$. The spanning tree
contains a path of length 2 joining $x_1$ to $x_2$, whose middle vertex in Y can be chosen
in n ways. For each of the remaining n − 1 vertices of Y, there is a choice as to which
of $x_1$ and $x_2$ is its neighbor (not both, since that would create a cycle).

3. $\tau(I_3 + K_2) = 5^3\left(1 - \frac{3}{5}\right)^2 = 20$.

4. $\tau(W_4) = \left(\frac{3+\sqrt{5}}{2}\right)^4 + \left(\frac{3-\sqrt{5}}{2}\right)^4 - 2 = 45$.

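5. To illustrate the matrix-tree theorem, consider the following graph G. [figure omitted]

Then

$$D(G) - A(G) = \begin{pmatrix} 3 & -1 & -1 & -1 \\ -1 & 2 & -1 & 0 \\ -1 & -1 & 3 & -1 \\ -1 & 0 & -1 & 2 \end{pmatrix}.$$

Deleting row 2 and column 3, for example, yields

$$\tau(G) = (-1)^{2+3} \begin{vmatrix} 3 & -1 & -1 \\ -1 & -1 & -1 \\ -1 & 0 & 2 \end{vmatrix} = 8.$$

The 8 spanning trees of G are: [figure omitted]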

6. The recursive formula $\tau(G) = \tau(G - e) + \tau(G/e)$ is illustrated with the same graph G
of the previous example and with $e = v_1v_3$. In the computation G is drawn instead of
writing $\tau(G)$, and similarly with the other graphs. This yields [figure omitted]

7. Let the vertices of $K_5$ be $v_0, v_1, v_2, v_3, v_4$. The number of spanning trees of $K_5$ in
which the degrees of $v_0, v_1, v_2, v_3, v_4$ are 3, 2, 1, 1, 1, respectively, is given by the
multinomial coefficient
$\binom{3}{3-1,\ 2-1,\ 1-1,\ 1-1,\ 1-1} = \frac{3!}{2!\,1!\,0!\,0!\,0!} = \frac{6}{2 \cdot 1 \cdot 1 \cdot 1 \cdot 1} = 3$. The three trees in
question are:



9.3 ENUMERATING TREES 

Tree counting began with Arthur Cayley in 1889, who wanted to enumerate the satu- 
rated hydrocarbons. George Polya developed an extensive theory in 1937 for counting 
families of organic chemicals, which was used by Richard Otter in 1948 in his solution 
of the specific problem of counting saturated hydrocarbons. Tree counting formulas are 
used in computer science to estimate running times in the design of algorithms. 


9.3.1 COUNTING GENERIC TREES 

Definitions: 

A tree is a connected graph with no cycles. Two trees are considered the “same”, for 
counting purposes, if they are isomorphic. 

A labeled tree is a tree in which distinct labels such as $v_1, v_2, \ldots, v_n$ have been assigned
to the vertices. Two labeled trees with the same set of labels are considered the same
only if there is an isomorphism from one tree to the other such that each vertex is
mapped to the vertex with the same label.

A rooted tree is a tree in which one vertex, the root, is distinguished. Two rooted
trees are considered the same if there is an isomorphism from one to the other that
maps the root of the first to the root of the second.

A reduced tree, sometimes called a homeomorphically reduced or series reduced tree,
is a tree with no vertices of degree 2.

Facts: 

1. Cayley’s formula: The number of labeled trees with n vertices equals $n^{n-2}$. See
Table 1.

2. The number of rooted labeled trees with n vertices equals $n^{n-1}$. See Table 1.

3. Rooted trees and most other tree structures can be counted by using generating
functions.

4. The generating function r(x) for the number $R_n$ of rooted trees with n vertices (see
Table 2) is

$$r(x) = \sum_{n=1}^{\infty} R_n x^n = x + x^2 + 2x^3 + 4x^4 + 9x^5 + 20x^6 + \cdots.$$

5. The coefficients $R_n$ of the generating function r(x) for rooted trees can be determined
from the recurrence relation

$$r(x) = x \prod_{i=1}^{\infty} (1 - x^i)^{-R_i}.$$

An alternative defining expression for this generating function is

$$r(x) = x \exp\left( \sum_{i=1}^{\infty} \frac{1}{i}\, r(x^i) \right).$$

6. The generating function $t(x) = \sum_{n=1}^{\infty} T_n x^n = x + x^2 + x^3 + 2x^4 + 3x^5 + 6x^6 + \cdots$
for counting trees (see Table 2) is obtained from that for rooted trees by using Otter’s
formula

$$t(x) = r(x) - \tfrac{1}{2}\left( r(x)^2 - r(x^2) \right).$$

7. The generating function $h(x) = \sum_{n=1}^{\infty} H_n x^n = x + x^2 + x^4 + x^5 + 2x^6 + 2x^7 + 4x^8 + \cdots$
for counting reduced trees (see Table 2) is based on another function $f(x) = \sum_{i=1}^{\infty} F_i x^i$ determined
by the equation

$$f(x) = \frac{x}{1+x} \prod_{i=1}^{\infty} (1 - x^i)^{-F_i} = \frac{x}{1+x} \exp\left( \sum_{i=1}^{\infty} \frac{1}{i}\, f(x^i) \right).$$

Then

$$h(x) = (1 + x) f(x) - \frac{1+x}{2}\, f(x)^2 + \frac{1-x}{2}\, f(x^2).$$

Note that there are no reduced trees with exactly 3 vertices.

Table 1 Labeled trees and rooted labeled trees with n vertices.

 n    labeled trees               rooted labeled trees
 1    1                           1
 2    1                           2
 3    3                           9
 4    16                          64
 5    125                         625
 6    1,296                       7,776
 7    16,807                      117,649
 8    262,144                     2,097,152
 9    4,782,969                   43,046,721
10    100,000,000                 1,000,000,000
11    2,357,947,691               25,937,424,601
12    61,917,364,224              743,008,370,688
13    1,792,160,394,037           23,298,085,122,481
14    56,693,912,375,296          793,714,773,254,144
15    1,946,195,068,359,375       29,192,926,025,390,625
16    72,057,594,037,927,936      1,152,921,504,606,846,976
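
The coefficients $R_n$ and $T_n$ of Facts 4 to 6 can be computed with a few lines of Python. The sketch below does not expand the infinite product directly; it uses the recurrence $n R_{n+1} = \sum_{k=1}^{n} \bigl(\sum_{d \mid k} d R_d\bigr) R_{n-k+1}$, which follows from the product formula by logarithmic differentiation (this recurrence and the function names are assumptions introduced here, not part of the text above), and then applies Otter's formula coefficient by coefficient.

```python
def rooted_tree_counts(N):
    """Return a list R with R[n] = number of rooted trees on n vertices, 1 <= n <= N."""
    R = [0] * (N + 1)
    if N >= 1:
        R[1] = 1
    for n in range(1, N):
        total = 0
        for k in range(1, n + 1):
            divisor_sum = sum(d * R[d] for d in range(1, k + 1) if k % d == 0)
            total += divisor_sum * R[n - k + 1]
        R[n + 1] = total // n       # the division is exact
    return R

def tree_counts(N):
    """Return a list T with T[n] = number of trees on n vertices, via Otter's formula."""
    R = rooted_tree_counts(N)
    T = [0] * (N + 1)
    for n in range(1, N + 1):
        square = sum(R[i] * R[n - i] for i in range(1, n))   # coefficient of x^n in r(x)^2
        doubled = R[n // 2] if n % 2 == 0 else 0             # coefficient of x^n in r(x^2)
        T[n] = R[n] - (square - doubled) // 2
    return T

R = rooted_tree_counts(10)
T = tree_counts(10)
```

The values obtained for n = 1, ..., 10 agree with the first ten rows of Table 2.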


Examples: 

1. There are exactly three trees with five vertices:

2. The first two trees in the figure of Example 1 can each be labeled in 60 essentially
different ways, while the third tree can only be labeled in 5 essentially different ways.
Thus, there are 125 different labeled trees with 5 vertices.

3. The first tree in the figure of Example 1 can be rooted in 3 essentially different ways,
and thus corresponds to 3 different rooted trees. The second and third trees in that
figure represent 4 and 2 different rooted trees, respectively. Thus there are 9 different
rooted trees with 5 vertices.

4. The third tree in the figure of Example 1 is the only reduced tree with 5 vertices.

Table 2 Rooted trees, trees, and reduced trees with n vertices.

 n    R_n (rooted trees)           T_n (trees)               H_n (reduced trees)
 1    1                            1                         1
 2    1                            1                         1
 3    2                            1                         0
 4    4                            2                         1
 5    9                            3                         1
 6    20                           6                         2
 7    48                           11                        2
 8    115                          23                        4
 9    286                          47                        5
10    719                          106                       10
11    1,842                        235                       14
12    4,766                        551                       26
13    12,486                       1,301                     42
14    32,973                       3,159                     78
15    87,811                       7,741                     132
16    235,381                      19,320                    249
17    634,847                      48,629                    445
18    1,721,159                    123,867                   842
19    4,688,676                    317,955                   1,561
20    12,826,228                   823,065                   2,988
21    35,221,832                   2,144,505                 5,671
22    97,055,181                   5,623,756                 10,981
23    268,282,855                  14,828,074                21,209
24    743,724,984                  39,299,897                41,472
25    2,067,174,645                104,636,890               81,181
26    5,759,636,510                279,793,450               160,176
27    16,083,734,329               751,065,460               316,749
28    45,007,066,269               2,023,443,032             629,933
29    126,186,554,308              5,469,566,585             1,256,070
30    354,426,847,597              14,830,871,802            2,515,169
31    997,171,512,998              40,330,829,030            5,049,816
32    2,809,934,352,700            109,972,410,221           10,172,638
33    7,929,819,784,355            300,628,862,480           20,543,579
34    22,409,533,673,568           823,779,631,721           41,602,425
35    63,411,730,258,053           2,262,366,343,746         84,440,886
36    179,655,930,440,464          6,226,306,037,178         171,794,492
37    509,588,049,810,620          17,169,677,490,714        350,238,175
38    1,447,023,384,581,029        47,436,313,524,262        715,497,037
39    4,113,254,119,923,150        131,290,543,779,126       1,464,407,113
40    11,703,780,079,612,453       363,990,257,783,343       3,002,638,286



9.3.2 COUNTING TREES IN CHEMISTRY 


Definitions: 

A 1-4 tree is a tree in which each vertex has degree 1 or 4. 

A 1-rooted 1-4 tree is a 1-4 tree rooted at a vertex of degree 1. 

Facts: 

1. Saturated hydrocarbons, also called alkanes, are compounds with the chemical formula
$\mathrm{C}_n\mathrm{H}_{2n+2}$; they consist of n carbon atoms of valence 4 and 2n + 2 hydrogen atoms
of valence 1. The molecular structure of alkanes is modeled by the 1-4 trees.

Note: It is convenient when counting alkanes to include the hydrogen molecule $\mathrm{H}_2$,
which has no carbon atoms and 2 hydrogen atoms, as an honorary alkane.

2. A monosubstituted hydrocarbon has n carbon atoms, 2n + 1 hydrogen atoms, and
an OH group. Such compounds have the chemical formula $\mathrm{C}_n\mathrm{H}_{2n+1}\mathrm{OH}$; they include the familiar
alcohols.

Note: It is convenient when counting alcohols to include the water molecule HOH as
an honorary alcohol.

3. The number $A_n$ (see Table 3) of 1-rooted 1-4 trees (alcohols) with n 4-valent vertices
(carbon atoms), 2n + 1 non-root 1-valent vertices (hydrogen atoms), and a 1-valent root
(the OH group) has the generating function

$$a(x) = \sum_{n=0}^{\infty} A_n x^n = 1 + x + x^2 + 2x^3 + 4x^4 + 8x^5 + 17x^6 + \cdots$$

whose coefficients can be determined from the recurrence relation

$$a(x) = 1 + \frac{x}{6}\left( a(x)^3 + 3a(x)a(x^2) + 2a(x^3) \right).$$

4. In counting unrooted 1-4 trees, a preliminary step is to count the number $G_n$ of 1-4
trees rooted at a vertex of degree 4. The coefficients of the corresponding generating
function

$$g(x) = \sum_{n=1}^{\infty} G_n x^n = x + x^2 + 2x^3 + 4x^4 + 9x^5 + 18x^6 + \cdots$$

are determined by the equation

$$g(x) = \frac{x}{24}\left( a(x)^4 + 6a(x)^2a(x^2) + 8a(x)a(x^3) + 3a(x^2)^2 + 6a(x^4) \right).$$

5. The number $B_n$ (see Table 3) of 1-4 trees (alkanes) with n 4-valent vertices (carbon
atoms) and 2n + 2 1-valent vertices (hydrogen atoms) has the generating function

$$b(x) = \sum_{n=0}^{\infty} B_n x^n = 1 + x + x^2 + x^3 + 2x^4 + 3x^5 + 5x^6 + \cdots$$

which can be determined from the equation

$$b(x) = g(x) + a(x) - \tfrac{1}{2}\left( a(x)^2 - a(x^2) \right).$$

Table 3 1-Rooted 1-4 trees and 1-4 trees with n vertices of degree 4.

 n    A_n: 1-rooted 1-4 trees (alcohols)    B_n: 1-4 trees (alkanes)
 1    1                                     1
 2    1                                     1
 3    2                                     1
 4    4                                     2
 5    8                                     3
 6    17                                    5
 7    39                                    9
 8    89                                    18
 9    211                                   35
10    507                                   75
11    1,238                                 159
12    3,057                                 355
13    7,639                                 802
14    19,241                                1,858
15    48,865                                4,347
16    124,906                               10,359
17    321,198                               24,894
18    830,219                               60,523
19    2,156,010                             148,284
20    5,622,109                             366,319
21    14,715,813                            910,726
22    38,649,152                            2,278,658
23    101,821,927                           5,731,580
24    269,010,485                           14,490,245
25    712,566,567                           36,797,588
26    1,891,993,344                         93,839,412
27    5,034,704,828                         240,215,803
28    13,425,117,806                        617,105,614
29    35,866,550,869                        1,590,507,121
30    95,991,365,288                        4,111,846,763
31    257,332,864,506                       10,660,307,791
32    690,928,354,105                       27,711,253,769
33    1,857,821,351,559                     72,214,088,660
34    5,002,305,607,153                     188,626,236,139
35    13,486,440,075,669                    493,782,952,902
36    36,404,382,430,278                    1,295,297,588,128
37    98,380,779,170,283                    3,404,490,780,161
38    266,158,552,000,477                   8,964,747,474,595
39    720,807,976,831,447                   23,647,478,933,969
40    1,954,002,050,661,819                 62,481,801,147,341

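
The alcohol and alkane counts can be generated from the functional equations for a(x), g(x), and b(x) by truncated power-series arithmetic. The Python sketch below assumes the factors x/6 and x/24 in the equations for a(x) and g(x) (the standard Pólya cycle-index weights for the symmetric groups on three and four branches); the helper names `mul`, `stretch`, and `alcohol_alkane_counts` are invented for this illustration.

```python
def mul(p, q, N):
    """Product of two coefficient lists, truncated after degree N."""
    out = [0] * (N + 1)
    for i, a in enumerate(p[:N + 1]):
        if a:
            for j, b in enumerate(q[:N + 1 - i]):
                out[i + j] += a * b
    return out

def stretch(p, k, N):
    """Coefficients of p(x^k), truncated after degree N."""
    out = [0] * (N + 1)
    for i in range(N // k + 1):
        out[i * k] = p[i]
    return out

def alcohol_alkane_counts(N):
    """Return (A, B): coefficient lists of a(x) and b(x) up to degree N."""
    a = [1] + [0] * N
    for n in range(N):
        # The coefficient of x^(n+1) in a(x) needs only a_0, ..., a_n.
        cube = mul(mul(a, a, N), a, N)
        mixed = mul(a, stretch(a, 2, N), N)
        a[n + 1] = (cube[n] + 3 * mixed[n] + 2 * stretch(a, 3, N)[n]) // 6
    a2, a3, a4 = (stretch(a, k, N) for k in (2, 3, 4))
    sq = mul(a, a, N)
    inner = [c4 + 6 * c22 + 8 * c13 + 3 * c2 + 6 * c1
             for c4, c22, c13, c2, c1 in zip(mul(sq, sq, N), mul(sq, a2, N),
                                             mul(a, a3, N), mul(a2, a2, N), a4)]
    g = [0] + [c // 24 for c in inner[:N]]           # multiplication by x/24
    b = [g[n] + a[n] - (sq[n] - a2[n]) // 2 for n in range(N + 1)]
    return a, b

A, B = alcohol_alkane_counts(10)
```

The lists `A` and `B` start at n = 0, so their first entries are the honorary alcohol (water) and the honorary alkane (hydrogen); the remaining entries reproduce the first ten rows of Table 3.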
Examples:

1. The three different 1-4 trees with 5 vertices of degree 4 are:

[figure omitted: structural diagrams of the three trees, with carbon atoms C and hydrogen atoms H]

2. The first 1-4 tree in the figure of Example 1 can be rooted at a vertex of degree 1
in 3 essentially different ways, the second in 4 essentially different ways, and the third in
only 1 essentially different way. Thus there are 8 different 1-rooted 1-4 trees with 5 vertices
of degree 4.


9.3.3 COUNTING TREES IN COMPUTER SCIENCE 


Definitions: 

A binary tree is a rooted tree in which each vertex has at most two children, and such 
that each child is designated either a left child or a right child. An only child may 
be either a left child or a right child. 

A left-right tree is a binary tree in which each vertex is a parent either to no children 
or to both a left child and a right child. 

An ordered tree is a tree in which the children of every vertex are linearly ordered. 

Facts: 

1. Binary trees are counted by the Catalan numbers $C_n$ (§3.1.3): the number of binary
trees with n vertices is $C_n$.

2. Each principal subtree of a left-right tree is a left-right tree.

3. Left-right trees are frequently used to represent arithmetic expressions, in which
the leaves of the tree correspond to numbers and the other vertices represent binary
operations such as +, −, ×, or ÷.

4. There is an obvious one-to-one correspondence between binary trees with n vertices
and left-right trees with 2n + 1 vertices: delete all the leaves of a left-right tree to obtain
a binary tree.

5. The number of left-right trees with n internal vertices and n + 1 leaves is also $C_n$.
This follows from Fact 4.

6. Ordered trees can represent structures such as family trees, showing all descendants
of the person represented by the root. The children of each person in the tree would be
represented as children of the corresponding vertex, ordered according to birth date.

7. The number of ordered trees with n vertices is $C_{n-1}$.
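
Facts 1 and 7 can be checked numerically for small n. The Python sketch below counts binary trees by splitting on the size of the left subtree and counts ordered trees by splitting an ordered forest into its first tree and the rest; both recursions and all function names are assumptions made for this illustration, and the closed form $C_n = \binom{2n}{n}/(n+1)$ is the standard one for the Catalan numbers.

```python
from math import comb

def catalan(n):
    """The Catalan number C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)

def count_binary_trees(n):
    """Number of binary trees with n vertices (the empty tree counts for n = 0)."""
    counts = [1] + [0] * n
    for m in range(1, n + 1):
        # The root takes one vertex; i vertices go left and m - 1 - i go right.
        counts[m] = sum(counts[i] * counts[m - 1 - i] for i in range(m))
    return counts[n]

def count_ordered_trees(n):
    """Number of ordered trees with n >= 1 vertices."""
    # forests[m] = number of ordered forests with m vertices; an ordered tree
    # with k vertices is a root placed above an ordered forest with k - 1 vertices.
    forests = [1] + [0] * (n - 1)
    for m in range(1, n):
        forests[m] = sum(forests[k - 1] * forests[m - k] for k in range(1, m + 1))
    return forests[n - 1]
```

For n = 3 the first function returns 5, matching the five binary trees of Example 1 in this subsection.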
Examples:

1. The 5 binary trees with 3 vertices: [figure omitted]

2. The 5 left-right trees with 7 vertices:

INTRODUCTION


The vertices and edges of a graph often have quantitative information associated with 
them, such as supplies and demands (for vertices), and distance, length, capacity, and 
cost (for edges). Relative to such networks, a number of discrete optimization problems 
arise in a variety of disciplines: statistics, electrical engineering, operations research, 
combinatorics, and computer science. Typical applications include designing least cost 
telecommunication systems, maximizing throughput in a manufacturing system, finding 
a minimum cost route or set of routes for delivery vehicles, and distributing electricity 
from a set of supply points to meet customer demands at minimum cost. In this chapter, 
a number of classical network optimization problems are studied and algorithms are 
described for their exact or approximate solution. 


GLOSSARY 

adjacency matrix : a 0-1 matrix whose (i,j) entry indicates the absence or presence, 
respectively, of an arc joining vertex i to vertex j in a graph. 

adjacency set : the set of arcs emanating from a specified vertex. 

alternating path (in a matching): a path with edges that are alternately free and 
matched. 

arc list : a list of the arcs of a graph, presented in no particular order. 

assignment (from a set S to a set T): a bijective function from S onto T.

augmenting path (in a flow network): a directed path between two specified vertices 
in which each arc has a positive residual capacity. 

augmenting path (in a matching): an alternating path between two free vertices. 

backbone network: a collection of devices that interconnect vertices at which mes- 
sage exchanges occur in a communication network. 

blossom: an odd length cycle formed by adding an edge joining two even vertices on 
an alternating path. 

capacitated concentrator location problem: a network design problem in which 
a minimum cost configuration of concentrators and their connections to terminals is 
sought so that each concentrator’s total capacity is not exceeded. 

capacitated minimum spanning tree: a minimum cost collection of subtrees joined 
to a specified root vertex, in which the total amount of demand generated by each 
subtree is bounded above by a constant. 

capacitated network: a network in which each arc is assigned a capacity.

capacity (of an arc): the maximum amount of material that can flow along the arc. 

capacity (of a cut): for a cut $[S, \bar{S}]$, the sum of the capacities of arcs (i, j) with $i \in S$
and $j \in \bar{S}$.

capacity (of a path): the smallest capacity of any arc on the path. 

capacity assignment problem: a network design problem in which links of different 
capacities are to be installed at minimum cost to support a number of point-to-point 
communication demands. 

complete matching: in a bipartite graph G = ( X U Y,E), a matching M in which 
each vertex of X is incident with an edge of M. 

composite (hybrid) method: a heuristic algorithm that combines elements of both
construction methods and improvement methods. 

construction method: a heuristic algorithm that builds a feasible solution, starting 
with a trivial configuration. 

cost (of a flow): $\sum c_{ij} x_{ij}$, where $c_{ij}$ is the cost and $x_{ij}$ is the flow on arc (i, j).

cut (in a graph): the set of edges $[S, \bar{S}]$ in the graph joining vertices in $S$ to vertices
in the complementary set $\bar{S}$.

directed network: a vertex set V and an arc set E , where each directed arc has an 
associated cost, length, weight, or capacity. 

directed out-tree: a tree rooted at vertex s such that the unique path in the tree 
from vertex s to every other vertex is a directed path. 

distance label: an estimate (in particular, an upper bound) on the shortest path 
length from the source vertex to each network vertex. 

even vertex (in an alternating path): given an alternating path P, a vertex on P that 
is reached using an even number of edges of P, starting from the origin vertex of P. 

exact algorithm : a procedure that produces a verifiable optimal solution to every 
problem instance. 

flow: a feasible assignment of material that satisfies flow conservation and arc capacity 
restrictions. 

forward star: a compact representation of a graph in which information about arcs 
leaving a vertex is stored using consecutive locations of an array. 

free edge (in a matching): an edge that does not appear in the matching. 

free vertex (in a matching): a vertex that is incident with no matched edges. 

heuristic algorithm: a procedure that produces a feasible, though not necessarily 
optimal, solution to every problem instance. 

improvement method: a heuristic algorithm that starts with a suboptimal solution 
(often randomly generated) and attempts to improve it. 

length (of a path): the sum of all costs appearing on the arcs of the path. 

linear assignment problem (LAP): an optimization problem in which an assign- 
ment is sought that minimizes an appropriate set-up cost. 

link capacity: an upper bound on the amount of traffic that a communication link 
can carry at any one time. 

linked adjacency list: a collection of singly-linked lists used to represent a graph. 

local access network: a network used to transfer traffic between the backbone net- 
work and the end users. 

matched edge (in a matching): an edge that appears in the matching. 

matched vertex (in a matching): a vertex that is incident with a matched edge. 

matching (in a graph): a set of pairwise nonadjacent edges in the graph. 

mate (of a matched vertex): in a matching, the other endpoint of the matched edge 
incident with the given vertex. 

maximum flow (in a network): a flow in the network having maximum value. 

maximum size matching: a matching having the largest size. 

maximum spanning tree (of a network): a spanning tree of the network with maximum
cost.

maximum weight matching: a matching having the largest weight. 

metaheuristic : a general-purpose heuristic procedure (such as tabu search, simulated 
annealing, genetic algorithms, or neural networks) for solving difficult optimization 
problems. 

minimum cost flow (in a network): a flow in the network having minimum cost. 

minimum cut (in a network): a cut in the network having minimum capacity. 

minimum spanning tree (of a network): a spanning tree of the network with mini- 
mum cost. 

negative cycle: a directed cycle of negative cost (or length). 

odd vertex (in an alternating path): given an alternating path P, a vertex on P that 
is reached using an odd number of edges of the path P, starting from the origin 
vertex of P. 

perfect matching: a matching in a graph in which each vertex of the graph is incident 
with exactly one edge of the matching. 

predecessor: relative to a rooted tree, the vertex preceding a given vertex on the 
unique path from the root to the given vertex. 

preflow: a relaxation of flow where inflow into a vertex can be greater than its outflow. 

pseudo flow: a relaxation of flow where inflow into a vertex need not be equal to its 
outflow. 

quadratic assignment problem ( QAP ): an optimization problem in which an as- 
signment is sought that minimizes the sum of set-up and interaction costs. 

reduced cost of arc (i, j): relative to given vertex potentials $\pi$, the quantity
$\bar{c}_{ij} = c_{ij} - \pi(i) + \pi(j)$.

residual capacity (of an arc): the maximum additional flow (with respect to a given 
flow) that can be sent on an arc. 

residual network: a network consisting of arcs with positive residual capacity. 

s-t cut: a cut $[S, \bar{S}]$ in which $s \in S$ and $t \in \bar{S}$.

savings: the reduction in cost from joining two vertices directly compared to joining 
both to a central vertex. 

shortest path: a directed path between specified vertices having minimum total cost 
(or length). 

size (of a matching): the number of edges in the matching. 

survivable network : a network that can survive failures in some of its vertices or 
edges and still transfer a prespecified amount of traffic. 

traveling salesman problem (TSP): an optimization problem in which a fixed set 
of cities must be visited in some order at minimum total cost. 

two-phase method: a heuristic algorithm that implements a cluster first/route sec- 
ond philosophy. 

undirected network: a vertex set V and an edge set E, where each undirected edge 
has an associated cost, length, weight, or capacity. 

value of a flow: the total flow leaving the source vertex. 

vehicle routing problem (VRP): an optimization problem in which a given set of
customers must be serviced at minimum total cost, using a fleet of vehicles having 
fixed capacity. 

vertex potential: a quantity $\pi(i)$ associated with each vertex i of a network.

weight (of a matching): the sum of the weights of edges in the matching.


10.1 MINIMUM SPANNING TREES 

In an undirected network, the minimum spanning tree problem is the problem of iden- 
tifying a spanning tree of the network that has the smallest possible sum of edge costs. 
This problem arises in a number of applications, both as a stand-alone problem and as 
a subproblem in more complex problem settings. It is assumed throughout this section 
that the network is connected. 


10.1.1 BASIC CONCEPTS 
Definitions: 

An undirected network is a weighted graph (§8.1.1) G = (V, E), where V is the set
of vertices, E is the set of undirected edges, and each edge $(i, j) \in E$ has an associated
cost (or weight, length) $c_{ij}$. Let $n = |V|$ and $m = |E|$.

If T = (V, F) is a spanning tree (§9.2) of G = (V, E), then every edge in $F \subseteq E$ is a
tree edge and every edge in $E - F$ is a nontree edge (or chord).

A minimum spanning tree ( MST ) of G is a spanning tree of G for which the sum 
of the edge costs is minimum. 

A maximum spanning tree of G is a spanning tree of G for which the sum of the 
edge costs is maximum. 

A cut of G = (V, E) is a partition of the vertex set V into two parts, $S$ and $\bar{S} = V - S$.
Each cut defines the set of edges $[S, \bar{S}] \subseteq E$ having one endpoint in $S$ and the other
endpoint in $\bar{S}$.

Facts: 

1. Every spanning tree T of a network G with n vertices contains exactly $n - 1$ edges,
and every two vertices of T are connected by a unique path.

2. Adding an edge to a spanning tree of G produces a unique cycle, called a fundamental 
cycle (§9.2.1). 

3. Every cut $[S, \bar{S}]$ is a disconnecting set of edges (§8.4.2). However, not every
disconnecting set of edges can be represented as a cut $[S, \bar{S}]$; see Example 4.

4. Removing an edge from a spanning tree of G produces two subtrees, on vertex sets $S$
and $\bar{S}$, respectively. The associated cut $[S, \bar{S}]$ is called a fundamental cut.

5. Path optimality conditions: A spanning tree T* is a minimum spanning tree of G
if and only if for each nontree edge (k, l) of G, $c_{ij} \le c_{kl}$ holds for all tree edges (i, j) in
the fundamental cycle determined by edge (k, l).

6. Cut optimality conditions: A spanning tree T* is a minimum spanning tree of G if
and only if for each tree edge $(i, j) \in T^*$, $c_{ij} \le c_{kl}$ holds for all nontree edges (k, l) in
the fundamental cut determined by edge (i, j).

7. If all edge costs are different, then the minimum spanning tree is unique.

8. The minimum spanning tree can be unique even if some of the edge costs are equal;
see Example 3.

9. Adding a constant to all edge costs of an undirected network does not change the
minimum spanning tree(s) of the network. Thus, it is sufficient to have an algorithm
that works when all edge costs are positive.

10. Multiplying each edge cost of an undirected network by $-1$ converts a maximum
spanning tree into a minimum spanning tree, and vice versa. Thus, it is sufficient to
have algorithms to find a minimum spanning tree.
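
The path optimality conditions of Fact 5 can be checked mechanically. The following Python sketch is an illustration only: the function names and the small 4-vertex network with its edge costs are hypothetical and are not taken from the figures of this section. For every nontree edge (k, l) it walks the tree path from k to l (the fundamental cycle without the edge itself) and compares costs.

```python
from collections import defaultdict

def tree_path(tree_edges, source, target):
    """Edges on the unique path from source to target in a spanning tree."""
    adjacency = defaultdict(list)
    for i, j in tree_edges:
        adjacency[i].append(j)
        adjacency[j].append(i)
    predecessor = {source: None}
    stack = [source]
    while stack:  # depth-first search that records predecessors
        u = stack.pop()
        for v in adjacency[u]:
            if v not in predecessor:
                predecessor[v] = u
                stack.append(v)
    path = []
    v = target
    while predecessor[v] is not None:
        path.append(frozenset((predecessor[v], v)))
        v = predecessor[v]
    return path

def satisfies_path_optimality(cost, tree_edges):
    """Fact 5: each tree edge on the fundamental cycle of a nontree
    edge (k, l) must cost no more than (k, l)."""
    tree = {frozenset(e) for e in tree_edges}
    for edge, c_kl in cost.items():
        if edge in tree:
            continue
        k, l = tuple(edge)
        if any(cost[e] > c_kl for e in tree_path(tree_edges, k, l)):
            return False
    return True

# Hypothetical undirected network: edge -> cost (an assumption for illustration).
cost = {frozenset(e): c for e, c in
        {(1, 2): 3, (1, 3): 5, (2, 3): 3, (2, 4): 2, (3, 4): 2}.items()}

print(satisfies_path_optimality(cost, [(1, 2), (2, 4), (3, 4)]))  # optimal tree
print(satisfies_path_optimality(cost, [(1, 2), (2, 3), (2, 4)]))  # not optimal
```

In this hypothetical network the second tree fails because the nontree edge (3, 4) is cheaper than the tree edge (2, 3) on its fundamental cycle.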

Examples: 

1 . Part (a) of the following figure shows an undirected network G, with costs indicated 
on each edge. Part (b) shows a spanning tree T* of G. Adding the nontree edge (3, 5) 
to T* produces the fundamental cycle [3, 1,2,5, 3]; see part (c). Since each tree edge 
in this cycle has cost no more than that of the nontree edge (3,5), the path optimality 
condition is satisfied by edge (3,5). Similarly, it can be verified that the other nontree 
edges, namely (2,3), (4,5), and (5,6), satisfy the path optimality conditions, estab- 
lishing by Fact 5 that T* is a minimum spanning tree. By Fact 7 this is the unique 
minimum spanning tree. 



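[Figure: (a) the network G; (b) the spanning tree T*; (c) the fundamental cycle of nontree edge (3, 5); (d) the fundamental cut of tree edge (1, 2).]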


2. For the tree edge (1,2) in part (b) of the figure of Example 1, the fundamental cut
$[S, \bar{S}]$ formed by deleting edge (1,2) has $S = \{1, 3\}$ and $\bar{S} = \{2, 4, 5, 6\}$; see part (d) of
the figure. This cut contains two nontree edges, (2, 3) and (3, 5). Since each such nontree 
edge has cost greater than or equal to that of the tree edge (1,2), the cut optimality 
condition is satisfied for edge (1, 2). Similarly, it can be verified that the other tree edges, 
namely (1,3), (2,4), (2,5), and (4,6), satisfy the cut optimality conditions, establishing 
by Fact 6 that T* is a minimum spanning tree. 

3. The undirected network of part (a) of the following figure has 4 vertices and 5 edges,
with the edge cost shown beside each edge. This network contains 8 spanning trees,
which are listed in the table in part (b) of the figure. The spanning tree $T_5$ achieves the
minimum cost 7 among all the spanning trees and so is a minimum spanning tree. In
fact, $T_5$ is the unique minimum spanning tree, even though the edge costs are not all
distinct. See Fact 8.




(a) [Figure: the network G, with the cost shown beside each edge.]

(b)
Spanning tree    Edges                  Total cost
$T_1$            (1,2), (1,3), (2,4)    10
$T_2$            (1,2), (1,3), (3,4)    10
$T_3$            (1,2), (2,3), (2,4)    8
$T_4$            (1,2), (2,3), (3,4)    8
$T_5$            (1,2), (2,4), (3,4)    7
$T_6$            (1,3), (2,3), (2,4)    10
$T_7$            (1,3), (2,3), (3,4)    10
$T_8$            (1,3), (2,4), (3,4)    9

4. The set of edges F = {(2,3), (2,4), (3,4)} is a disconnecting set in the network G of
Example 3, since removal of these edges disconnects G. However, there is no partition
of the vertex set of G into nonempty sets $S$ and $\bar{S}$ for which $F = [S, \bar{S}]$.
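
The enumeration in Example 3 can be reproduced by brute force. The Python sketch below is illustrative only. Because the figure is not reproduced here, the individual edge costs are an assumption: they were chosen so that all eight totals agree with the table of Example 3 (the table alone does not determine them uniquely). The helper name `is_spanning_tree` is likewise hypothetical.

```python
from itertools import combinations

# Assumed edge costs, chosen to be consistent with the totals in Example 3.
cost = {(1, 2): 3, (1, 3): 5, (2, 3): 3, (2, 4): 2, (3, 4): 2}
vertices = {1, 2, 3, 4}

def is_spanning_tree(edges, vertices):
    """n - 1 edges form a spanning tree exactly when they contain no cycle."""
    component = {v: v for v in vertices}

    def find(v):
        while component[v] != v:
            v = component[v]
        return v

    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri == rj:          # edge would close a cycle
            return False
        component[ri] = rj
    return len(edges) == len(vertices) - 1

# Try every set of n - 1 = 3 edges, in lexicographic order.
spanning_trees = [t for t in combinations(sorted(cost), len(vertices) - 1)
                  if is_spanning_tree(t, vertices)]
totals = [sum(cost[e] for e in t) for t in spanning_trees]

for tree, total in zip(spanning_trees, totals):
    print(tree, total)
```

Under these assumed costs the lexicographic order of the edge sets coincides with the order $T_1, \ldots, T_8$ of the table, and exactly one tree attains the minimum total.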


10.1.2 ALGORITHMS FOR MINIMUM SPANNING TREES 


There are several greedy algorithms for constructing minimum spanning trees, based on 
the optimality conditions in §10.1.1, Facts 5 and 6. Each of these algorithms myopically 
(greedily) adds an edge to the current configuration based on only local information; 
nonetheless, these procedures are guaranteed to produce a minimum spanning tree. 

Definitions: 

The nearest neighbor operation takes as input a tree T* having vertex set $S$ and
produces a minimum cost edge (i, j) in the cut $[S, \bar{S}]$. That is, $c_{ij} = \min\{c_{ab} \mid a \in S, b \notin S\}$.

The merge operation takes as input an edge (i, j) whose two endpoints i and j belong
to disjoint trees $T_i$ and $T_j$ and combines the trees into $T_i \cup T_j \cup \{(i, j)\}$.

The graph G = (V, E) is assumed connected and has n vertices and m edges. 

Facts: 

1. Kruskal’s algorithm : This greedy algorithm (Algorithm 1) is based on the path 
optimality conditions (§10.1.1) and builds a minimum spanning tree by examining edges 
of E one by one in nondecreasing order of their costs. The edge being examined is added 
to the current forest if its addition does not create a cycle. (J. B. Kruskal, born 1928) 

2. Kruskal’s algorithm can be terminated once $n - 1$ edges have been added to T*.

3. Computer code (in Fortran) that implements Kruskal’s algorithm can be found at
the site:

http://www.mat.uc.pt/~eqvm/cientificos/fortran/codigos.html

Algorithm 1: Kruskal’s algorithm.

input: connected undirected network G
output: minimum spanning tree T*

order the edges $(i_1, j_1), (i_2, j_2), \ldots, (i_m, j_m)$ so that $c_{i_1 j_1} \le c_{i_2 j_2} \le \cdots \le c_{i_m j_m}$
$T^* := \emptyset$
for k := 1 to m do
    if $T^* \cup \{(i_k, j_k)\}$ does not contain a cycle then $T^* := T^* \cup \{(i_k, j_k)\}$
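
Algorithm 1 can be sketched in Python as follows. This is an illustration under stated assumptions, not the implementation of any reference cited in this section: the cycle test uses a simple union-find structure with path halving, and the function name `kruskal` is hypothetical. The sample edge costs are also an assumption; the figure for the examples of this section is not reproduced here, so costs were chosen to agree with the edge ordering and the total cost 80 reported in Example 1 below.

```python
def kruskal(n, edges):
    """Kruskal's algorithm on vertices 1..n; edges is a list of (cost, i, j)."""
    parent = list(range(n + 1))

    def find(v):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for cost, i, j in sorted(edges):      # nondecreasing order of cost
        ri, rj = find(i), find(j)
        if ri != rj:                      # adding (i, j) creates no cycle
            parent[ri] = rj
            tree.append((i, j))
            if len(tree) == n - 1:        # early termination (Fact 2)
                break
    return tree

# Assumed costs, consistent with Example 1 of this section.
edges = [(35, 1, 2), (40, 1, 3), (25, 2, 3), (10, 2, 4),
         (20, 3, 4), (15, 3, 5), (30, 4, 5)]
tree = kruskal(5, edges)
print(tree)
```

With these assumed costs the edges are accepted in the order (2,4), (3,5), (3,4), (1,2), and (2,3) and (4,5) are discarded, as described in Example 1.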


Algorithm 2: Prim’s algorithm.

input: connected undirected network G, vertex $i_0$
output: minimum spanning tree T*

T* := the tree consisting of vertex $i_0$
while $|T^*| < n - 1$ do
    (i, j) := nearest-neighbor(T*)
    $T^* := T^* \cup \{(i, j)\}$
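
A minimal Python sketch of Algorithm 2 follows. It is one possible realization, assuming a binary heap (Python's `heapq`) with lazy deletion of stale edges in the role of the nearest-neighbor operation; it is not necessarily the heap implementation analyzed in the references. The function name `prim` and the edge costs are assumptions, the costs being chosen to agree with the examples of this section.

```python
import heapq
from collections import defaultdict

def prim(edges, start):
    """Prim's algorithm; edges is a list of (cost, i, j) of a connected network."""
    adjacency = defaultdict(list)
    for cost, i, j in edges:
        adjacency[i].append((cost, i, j))
        adjacency[j].append((cost, j, i))

    in_tree = {start}
    heap = list(adjacency[start])         # edges leaving the current tree
    heapq.heapify(heap)
    tree = []
    while heap and len(in_tree) < len(adjacency):
        cost, i, j = heapq.heappop(heap)  # cheapest candidate edge
        if j in in_tree:
            continue                      # stale: no longer crosses the cut
        in_tree.add(j)
        tree.append((i, j))
        for edge in adjacency[j]:
            if edge[2] not in in_tree:
                heapq.heappush(heap, edge)
    return tree

# Assumed costs, consistent with the examples of this section.
edges = [(35, 1, 2), (40, 1, 3), (25, 2, 3), (10, 2, 4),
         (20, 3, 4), (15, 3, 5), (30, 4, 5)]
tree = prim(edges, start=3)
print(tree)
```

Each edge is reported as (tree endpoint, new vertex), so starting from vertex 3 the edges appear in the order in which the tree grows.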


4. Kruskal’s algorithm can be implemented using several data structures yielding
different time bounds:

• [AhMaOr93] describes an implementation that runs in $O(m + n \log n)$ time plus
the time needed for sorting the m edges;

• [Ta83] describes an improved implementation that runs in $O(m\,\alpha(n, m))$ time plus
the time needed for sorting the m edges; here $\alpha(n, m)$ is the inverse Ackermann
function, which for all practical purposes is less than 5.

5. Algorithm 1 was independently discovered by Kruskal (1956) and by H. Loberman 
and A. Weinberger (1957). 

6. Prim’s algorithm: This algorithm (Algorithm 2) is based on the cut optimality
conditions (§10.1.1). It maintains a single tree T*, which initially consists of an arbitrary
vertex $i_0$. At each iteration, the algorithm adds the least cost edge emanating from T*
until a spanning tree is obtained. (R. C. Prim, born 1921)

7. Algorithm 2 was first proposed in 1930 by V. Jarnik. Later it was independently 
discovered by Prim (1957) and by E. W. Dijkstra (1959). 

8. Running times of several implementations of Prim’s algorithm are shown in the 
following table. See [AhMaOr93] for a discussion of these implementations. 


data structure     running time
binary heap        $O(m \log n)$
d-heap             $O(m \log_d n)$, with $d = \max\{2, \lceil m/n \rceil\}$
Fibonacci heap     $O(m + n \log n)$


9. A modification of Prim’s algorithm has running time $O(m \log \beta(m, n))$, where the
function $\beta(m, n)$ grows very slowly (for all practical purposes is less than 5) [GaEtal86].
This is currently the theoretically fastest algorithm for solving the minimum spanning 
tree problem. 

Algorithm 3: Sollin’s algorithm.

input: connected undirected network G
output: minimum spanning tree T*

T* := forest of all vertices of G, but no edges
while $|T^*| < n - 1$
    let $T_1, T_2, \ldots, T_p$ be the trees in the forest T*
    for k := 1 to p
        $(i_k, j_k)$ := nearest-neighbor($T_k$)
    for k := 1 to p
        if $i_k$ and $j_k$ belong to different trees then
            merge($i_k$, $j_k$)
            $T^* := T^* \cup \{(i_k, j_k)\}$
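
Algorithm 3 can be sketched in Python as below. This is a sequential illustration under assumptions: trees of the forest are tracked with a union-find structure, ties between equal-cost edges are broken by comparing the whole tuple (cost, i, j) so that no cycle can be created, and the function name `sollin` and the sample costs (chosen to agree with the examples of this section) are hypothetical.

```python
def sollin(n, edges):
    """Sollin's algorithm on vertices 1..n; edges is a list of (cost, i, j).
    Returns the tree edges and the number of iterations performed."""
    component = list(range(n + 1))

    def find(v):
        while component[v] != v:
            component[v] = component[component[v]]
            v = component[v]
        return v

    tree, iterations = [], 0
    while len(tree) < n - 1:
        iterations += 1
        # Nearest-neighbor step: cheapest edge leaving each current tree.
        best = {}
        for edge in edges:
            _, i, j = edge
            ri, rj = find(i), find(j)
            if ri == rj:
                continue
            for r in (ri, rj):
                if r not in best or edge < best[r]:
                    best[r] = edge
        if not best:
            raise ValueError("network is not connected")
        # Merge step: add each selected edge that still joins two trees.
        for _, i, j in best.values():
            ri, rj = find(i), find(j)
            if ri != rj:
                component[ri] = rj
                tree.append((i, j))
    return tree, iterations

# Assumed costs, consistent with the examples of this section.
edges = [(35, 1, 2), (40, 1, 3), (25, 2, 3), (10, 2, 4),
         (20, 3, 4), (15, 3, 5), (30, 4, 5)]
tree, iterations = sollin(5, edges)
print(tree, iterations)
```

On this assumed network the first iteration leaves two trees and the second iteration joins them with edge (3, 4).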


10. Computer codes (in Fortran) implementing Prim’s algorithm can be found at the
following three sites:

http://www.netlib.org/toms/479
http://www.netlib.org/toms/613
http://www.mat.uc.pt/~eqvm/cientificos/fortran/codigos.html

11. Sollin’s algorithm: This greedy algorithm (Algorithm 3) is also based on the cut
optimality conditions (§10.1.1). It starts with a forest of n trees, each consisting of a 
single vertex, and builds a minimum spanning tree by repeatedly adding edges to the 
current forest. At each iteration a least cost edge emanating from each tree is added, 
leading to the merging of certain trees. 

12. Each iteration of Algorithm 3 reduces the number of trees in the forest T* by at 
least half. 

13. Sollin’s algorithm performs $O(\log n)$ iterations and can be implemented to run in
$O(m \log n)$ time; see [AhMaOr93].

14. A variation of Sollin’s algorithm that runs in time $O(m \log \log n)$ can be found in
[Ya75].

15. The origins of Algorithm 3 can be traced to O. Borůvka (1926), who first formulated
the minimum spanning tree problem in the context of electric power networks. This 
algorithm was independently proposed in 1938 by G. Choquet for points in a metric 
space and by G. Sollin in 1961 for arbitrary networks. 

16. Sollin’s algorithm lends itself to a parallel implementation (see §10.1.3), though 
care must be taken when edge costs are not distinct in order to ensure that no cycles 
are produced. 

17. Computational studies have found that the Prim and Sollin algorithms consistently 
outperform Kruskal’s algorithm. Prim’s algorithm is faster when the network is dense, 
whereas Sollin’s algorithm is faster when the network is sparse. 

18. An excellent discussion of the history of the minimum spanning tree problem is 
provided in [GrHe85]. 

19. An important variant of the minimum spanning tree problem places constraints 
on the number of edges incident with a vertex in a candidate spanning tree. Such 
degree-constrained minimum spanning trees are investigated in [GlKl75] and [Vo89].

20. Another variant is the capacitated minimum spanning tree problem, which arises
in the design of local access telecommunication networks. In this problem, a feasible 
spanning tree is one rooted at a specified central vertex such that the total traffic 
(number of calls) generated by each subtree connected to the central vertex does not 
exceed a known capacity. A feasible spanning tree having minimum total cost is then 
sought. (See §10.6.1 for further details.) 

Examples: 

1. For the network shown in part (a) of the following figure, ordering edges by nonde- 
creasing cost produces the following sequence of edges: (2, 4), (3, 5), (3, 4), (2, 3), (4, 5), 
(1,2), (1,3). Kruskal’s algorithm adds the edges (2,4), (3,5), (3,4) to T*; discards the 
edges (2,3) and (4,5); then adds the edge (1,2) to T* and terminates. Part (b) of the 
figure shows the resulting minimum spanning tree, having total cost 80. 




[Figure: (a) the network; (b) its minimum spanning tree.]

2. Prim’s algorithm (Algorithm 2) is applied to the network of part (a) of the figure
of Example 1, starting with the initial vertex $i_0 = 3$. The minimum cost edge out of
vertex 3 is (3,5), so T* = {(3,5)}. Next, the minimum cost edge emanating from T*
is (3,4), giving T* = {(3,5), (3,4)}. Subsequent iterations add the edges (2,4) and (1,2),
producing the minimum spanning tree T* = {(3,5), (3,4), (2,4), (1,2)}. Starting from
any other initial vertex $i_0$ would give the same result.

3. To apply Sollin’s algorithm (Algorithm 3) to the network of part (a) of the figure of 
Example 1, begin with a forest containing five trees, each consisting of a single vertex. 
Part (a) of the following figure shows the least cost edge emanating from each of these 
trees. One iteration of Algorithm 3 produces the two trees shown in part (b) of the 
following figure. The least cost edge emanating from either of these two trees is (3,4). 
Adding this edge completes the minimum spanning tree shown in part (c) of this figure. 


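[Figure: (a) the least cost edge emanating from each single-vertex tree; (b) the two trees after one iteration; (c) the resulting minimum spanning tree.]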



10.1.3 PARALLEL ALGORITHMS

Sollin’s algorithm (§10.1.2) can be easily parallelized on an EREW (exclusive-read,
exclusive-write) PRAM (parallel random-access machine). (See §16.1.4.) This algorithm
assigns a processor to each edge and each vertex of the network [KiLe88].

Facts:

1. Sollin’s algorithm performs $O(\log n)$ iterations.

2. In each iteration, every component finds a least cost edge emanating from it in
$O(\log n)$ time. To do this, each vertex finds a least cost edge emanating from it to a
vertex in a different component. Next, a minimization is done over all vertices of the
given component.

3. In each iteration, components that are connected by the newly found edges are
merged using a procedure called recursive doubling. This operation can also be done
in $O(\log n)$ time.

4. Overall the running time of the resulting algorithm is $O(\log^2 n)$ using $O(m)$
processors.

5. The most work-efficient parallel algorithm currently known for solving the minimum 
spanning tree problem is given in [CoKeTa94]. 


10.1.4 APPLICATIONS 

Minimum spanning tree problems arise both directly and indirectly. For direct appli- 
cations, the points in a given set are to be connected using the least cost collection of 
edges. For indirect applications, creative modeling of the original problem recasts it as 
a minimum spanning tree problem. 

Applications: 

1 . Designing physical systems: A minimum cost network is to be designed to connect 
geographically dispersed system components. Each component is represented by a ver- 
tex, with potential network connections between vertices represented by edges. A cost 
is associated with each edge. 

2. Examples of Application 1 occur in the following: 

• Connect terminals in cabling the panels of electrical equipment in order to use
the least total cost of wire.

• Construct a pipeline network to connect a number of towns using the smallest
possible total cost of pipeline.

• Link isolated villages in a remote region, which are connected by roads but not
yet by telephone service. The problem is to determine along which stretches of
roads to place telephone lines to link every pair of villages, using the minimum
total miles of installed lines.

• Construct a digital computer system, composed of high-frequency circuitry, when
it is important to minimize the length of wires between different components
to reduce both capacitance and delay line effects.

• Connect a number of computer sites by high-speed lines. Each line is available for
leasing at a certain monthly cost, and a configuration is required that connects
all the sites at minimum overall cost.

• Design a backbone network of high-capacity links that connect switching devices
to support internet traffic. A minimum cost backbone network that maintains
acceptable throughput is required.

3. Clustering: Objects having k measurable characteristics are to be clustered into
groups of “similar” objects. First, construct an undirected network, where each object
is represented by a vertex and every two distinct vertices are joined by an edge. The
cost of edge (i, j) is the distance (in k-dimensional space) between the vectors for
objects i and j. Applying Kruskal’s algorithm to this network then yields a hierarchy of 
partitions of the vertex set; each partition is defined by the trees comprising the forest 
obtained at each iteration of Kruskal’s algorithm. This hierarchy is then used to define 
clusters of the original objects. 

4 . Computation of minimum spanning trees sometimes arises as a subproblem in a 
larger optimization problem. For example, one heuristic approach to the traveling 
salesman problem (§10.7.1) involves the calculation of minimum spanning trees. 

5. Optimal message passing: An intelligence service has agents operating in a
nonfriendly country. Each agent knows some of the other agents and has in place procedures
for arranging a rendezvous with someone he knows. For each such possible rendezvous,
say between agent i and agent j, any message passed between these agents will fall
into hostile hands with a certain probability $p_{ij}$. The group leader wants to transmit
a confidential message among all the agents while maximizing the probability that no
message is intercepted.

If the agents are represented by vertices and each possible rendezvous by an edge,
then in the resulting graph G a spanning tree T is required that maximizes the
probability that no message is intercepted, given by $\prod_{(i,j) \in T} (1 - p_{ij})$. Such a tree can be
found by defining the cost of each edge (i, j) as $\log(1 - p_{ij})$ and solving a maximum
spanning tree problem.
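
The logarithmic transformation in Application 5 can be sketched in Python. The interception probabilities below are hypothetical, as is the function name; the maximum spanning tree is found here by scanning edges in nonincreasing order of cost in the manner of Kruskal's algorithm, which is one of several possible choices.

```python
import math

def maximum_spanning_tree(n, weighted_edges):
    """Kruskal-style scan in nonincreasing order; edges are (weight, i, j)."""
    parent = list(range(n + 1))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for _, i, j in sorted(weighted_edges, reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            tree.append((i, j))
    return tree

# Hypothetical interception probabilities p_ij for each possible rendezvous.
p = {(1, 2): 0.1, (1, 3): 0.3, (2, 3): 0.2, (2, 4): 0.4, (3, 4): 0.1}

# Edge cost log(1 - p_ij): maximizing the sum maximizes the product of (1 - p_ij).
edges = [(math.log(1 - p_ij), i, j) for (i, j), p_ij in p.items()]
tree = maximum_spanning_tree(4, edges)
success = math.prod(1 - p[e] for e in tree)
print(tree, success)
```

Because the logarithm is increasing, the tree with the largest total of $\log(1 - p_{ij})$ is also the tree with the largest product of the factors $1 - p_{ij}$.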

6. All-pairs minimax path problem: In this variant of the shortest path problem 
(see §10.3.1), the value of a path P is the maximum cost edge in P. The all-pairs 
minimax path problem is to determine a minimum value path between every pair of 
vertices in a network G. It can be shown that if T* is a minimum spanning tree of G, 
then the unique path in T* between any pair of vertices is also a minimax path between 
that pair of vertices. 

7. Examples of Application 6 arise in the following contexts:

• Determine the trajectory of a spacecraft that keeps the maximum temperature
of the surface as small as possible.

• When traveling through a desert, select a route that minimizes the length of the
longest stretch between rest areas.

• A person traveling in a wheelchair desires a route that minimizes the maximum
ascent along the path segments of the route.
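
The property stated in Application 6 suggests a simple computation: once a minimum spanning tree is known, the minimax value between two vertices is the largest edge cost on the tree path joining them. The Python sketch below illustrates this; the function name is hypothetical, and the network costs and the tree are assumptions chosen to agree with the examples of §10.1.2.

```python
from collections import defaultdict

def minimax_values_from(tree_edges, cost, source):
    """Largest edge cost on the tree path from source to each vertex."""
    adjacency = defaultdict(list)
    for i, j in tree_edges:
        adjacency[i].append(j)
        adjacency[j].append(i)
    value = {source: 0}
    stack = [source]
    while stack:  # depth-first traversal of the tree
        u = stack.pop()
        for v in adjacency[u]:
            if v not in value:
                value[v] = max(value[u], cost[frozenset((u, v))])
                stack.append(v)
    return value

# Assumed costs and minimum spanning tree (consistent with §10.1.2, Example 1).
cost = {frozenset(e): c for e, c in
        {(1, 2): 35, (1, 3): 40, (2, 3): 25, (2, 4): 10,
         (3, 4): 20, (3, 5): 15, (4, 5): 30}.items()}
mst = [(2, 4), (3, 5), (3, 4), (1, 2)]

print(minimax_values_from(mst, cost, source=5))
```

For instance, in this assumed network the tree path from vertex 5 to vertex 2 is 5, 3, 4, 2 with largest edge cost 20, which is smaller than the direct alternatives through edge (4, 5) or edge (2, 3).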

8. Measuring homogeneity of bimetallic objects: In this application minimum spanning 
trees are used to determine the degree to which a bimetallic object is homogeneous
in composition. First, the composition of the bimetallic object is measured at a set 
of sample points. A network is then constructed with vertices corresponding to the 
sample points and with an edge connecting physically adjacent sample points. The cost 
of edge (i, j) is the product of the physical (Euclidean) distance between sample points i 
and j, and a homogeneity factor between 0 and 1. The homogeneity factor is 0 if the 
composition of the corresponding samples is identical, and is 1 if the composition is very 
different. This cost structure gives greater weight to two points if they have different 
compositions and are far apart. Then the cost of the minimum spanning tree provides 
an overall measure of the homogeneity of the object. 

9. Additional applications, with reference sources, are given in the following table.

application                        references
two-dimensional storage schemes    [AhMaOr93], [AhEtal95]
chemical physics                   [AhMaOr93]
manufacturing                      [EvMi92]
network design                     [AhMaOr93]
network reliability                [AhEtal95]
pattern classification             [GrHe85], [AhMaOr93]
picture processing                 [GrHe85], [AhMaOr93]
automatic speech recognition       [GrHe85]
numerical taxonomy                 [GrHe85]


10.2 MATCHINGS 

In an undirected network, the maximum matching problem is to find a set of nonadjacent 
edges that has the largest total size or weight. This discrete optimization problem arises 
in a number of applications, often involving the optimal pairing of a set of objects. 


10.2.1 BASIC CONCEPTS 
Definitions: 

Let G = (V, E) be an undirected network with vertex set V and edge set E (see §10.1.1).
Assume that G contains neither loops nor multiple edges. Each edge $e = (i, j) \in E$ has
an associated weight $w_e = w_{ij}$. Let $n = |V|$ and $m = |E|$.

The degree of vertex $v \in V$ in G is the number of edges in G that are incident with v,
written deg(v). (See §8.1.1.)

A matching in G = (V, E) is a set $M \subseteq E$ of pairwise nonadjacent edges (§8.1.1).

A perfect matching in G = (V, E) is a matching M in which each vertex of V is 
incident with exactly one edge of M. 

The size (cardinality) of a matching M is the number of edges in M, written $|M|$.
The weight of a matching M is $wt(M) = \sum_{e \in M} w_e$.

A maximum size matching of G is a matching M having the largest size $|M|$.

A maximum weight matching of G is a matching M having the largest weight wt(M). 

Relative to a matching M in G = (V, E), edges $e \in M$ are matched edges, while edges
$e \in E - M$ are free edges. Vertex v is matched if it is incident with a matched edge;
otherwise vertex v is free. 

Every matched vertex v has a mate, the other endpoint of the matched edge incident 
with v. 

With respect to a matching M, the weight wt(P) of path P is the sum of the weights 
of the free edges in P minus the sum of the weights of the matched edges in P. 

An alternating path has edges that are alternately free and matched. An augmenting 
path is an alternating path that starts and ends at a free vertex. 

Facts:


1. Matchings are useful in a wide variety of applications, such as assigning personnel 
to jobs, target tracking, crew scheduling, snowplowing streets, scheduling on parallel 
machines, among others (see §10.2.4). 

2. In a matching M, each vertex of G has degree 0 or 1 relative to the edges in M. In 
a perfect matching M, each vertex of G has degree 1 relative to the edges in M. 

3. If M is any matching in G, then $|M| \le \lfloor n/2 \rfloor$.

4. Every augmenting path has an odd number of edges. 

5. If M is a matching and P is an augmenting path with respect to M, then the
symmetric difference (§1.2.2) $M \Delta P$ is a matching of size $|M| + 1$.

6. If M is a matching and P is an augmenting path with respect to M, then
$wt(M \Delta P) = wt(M) + wt(P)$.

7. Augmenting path theorem: M is a maximum size matching if and only if there is 
no augmenting path with respect to M . 

8. Fact 7 was obtained independently by C. Berge (1957) and by R. Z. Norman and 
M. O. Rabin (1959). This result was also recognized in an 1891 paper of J. Petersen. 

9. Suppose M is a matching having maximum weight among all matchings of a fixed
size k. If P is an augmenting path of maximum weight, then $M \Delta P$ is a matching having
maximum weight among all matchings of size k + 1.

10. Suppose paths $P_1, P_2, \ldots, P_k$ are obtained as in Fact 9 by augmenting along a
maximum weight path. Then $wt(P_1) \ge wt(P_2) \ge \cdots \ge wt(P_k)$.

11. The number of perfect matchings of the complete graph (§8.1.3) $K_{2n}$ on 2n vertices
is $(2n - 1)!! = 1 \cdot 3 \cdot 5 \cdots (2n - 1)$.

12. A historical perspective on the theory of matchings is found in [Pl92].
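
Facts 5, 6, and 11 can be checked on small instances with a few lines of Python. The sketch below is illustrative: the matching, the augmenting path, and the edge weights follow the computation $wt(P) = 1 + 4 + 3 - 2 - 5$ of Example 1 in this section, but the assignment of individual weights to edges is an assumption (the figure is not reproduced here), and all function names are hypothetical.

```python
def edge_set(pairs):
    """Represent undirected edges as frozensets so that (i, j) == (j, i)."""
    return {frozenset(e) for e in pairs}

# Assumed weights for the edges used below (see the lead-in).
weight = {frozenset(e): w for e, w in
          {(1, 4): 1, (2, 3): 4, (5, 6): 3, (1, 2): 2, (3, 5): 5}.items()}

def wt_matching(M):
    return sum(weight[e] for e in M)

def wt_path(P, M):
    """Weights of free edges of P minus weights of matched edges of P."""
    return sum(-weight[e] if e in M else weight[e] for e in P)

def is_matching(edges):
    endpoints = [v for e in edges for v in e]
    return len(endpoints) == len(set(endpoints))

def count_perfect_matchings(vertices):
    """Perfect matchings of the complete graph on the given vertex list."""
    if not vertices:
        return 1
    rest = vertices[1:]  # pair the first vertex with each remaining vertex
    return sum(count_perfect_matchings(rest[:k] + rest[k + 1:])
               for k in range(len(rest)))

M1 = edge_set([(1, 2), (3, 5)])
P = edge_set([(1, 4), (1, 2), (2, 3), (3, 5), (5, 6)])
M2 = M1 ^ P  # symmetric difference

print(sorted(tuple(sorted(e)) for e in M2))
print(wt_matching(M1), wt_path(P, M1), wt_matching(M2))
print(count_perfect_matchings(list(range(6))))  # complete graph on 6 vertices
```

The symmetric difference removes the matched edges of P from the matching and adds the free edges of P, which is why the size grows by one and the weight changes by wt(P).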

Examples: 

1. Part (a) of the following figure displays a network G with the weight $w_e$ shown next
to each edge e.

[Figure: (a) the network G with matching $M_1$ highlighted; (b) the matching $M_2$.]

The matching M1 = {(1, 2), (3, 5)} of size 2 is also shown, with the matched edges
highlighted. The mate of vertex 1 is vertex 2, and the mate of vertex 5 is vertex 3.
The weight of M1 is wt(M1) = 7. Relative to the matching M1, vertices 4
and 6 are free vertices, and an augmenting path P from 4 to 6 is given by the set
of edges P = {(1, 4), (1, 2), (2, 3), (3, 5), (5, 6)}. Here wt(P) = 1 + 4 + 3 − 2 − 5 = 1
and (as guaranteed by Fact 4) path P has an odd number of edges. The matching
M2 = M1 Δ P = {(1, 2), (3, 5)} Δ {(1, 4), (1, 2), (2, 3), (3, 5), (5, 6)} = {(1, 4), (2, 3), (5, 6)}
is a perfect matching and is highlighted in part (b) of the figure. There are no free
vertices relative to matching M2 and no augmenting paths, so M2 is a maximum size
matching of G. There are other maximum size matchings, such as {(1, 4), (2, 5), (3, 6)}
and {(1, 2), (4, 5), (3, 6)}.

2. Part (a) of the following figure shows a matching M1 of size 1, with wt(M1) = 7.
Since edge (2, 5) has maximum weight among all edges, M1 is a maximum weight
matching of size 1. Relative to M1 the augmenting path P1 = {(1, 5), (2, 5), (2, 3)}
has weight 6 + 4 − 7 = 3, whereas the augmenting path {(3, 6)} has weight 1. It can
be verified that P1 is a maximum weight augmenting path relative to M1. By Fact 9,
M2 = M1 Δ P1 = {(1, 5), (2, 3)} is a maximum weight matching of size 2 in the network,
with wt(M2) = 10; see part (b) of the figure.



(a) (b) 

Relative to M2 there are several augmenting paths between the free vertices 4 and 6:

Q1 = {(1, 4), (1, 5), (5, 6)}, wt(Q1) = 1 + 3 − 6 = −2,

Q2 = {(1, 4), (1, 5), (2, 5), (2, 3), (3, 6)}, wt(Q2) = 1 + 7 + 1 − 6 − 4 = −1,

Q3 = {(4, 5), (1, 5), (1, 2), (2, 3), (3, 6)}, wt(Q3) = 5 + 2 + 1 − 6 − 4 = −2.

The maximum weight augmenting path is Q2 and so (by Fact 9) M3 = M2 Δ Q2 =
{(1, 4), (2, 5), (3, 6)} is a maximum weight matching of size 3 in the network with
wt(M3) = 9. Overall, the maximum weight matching in G is M2, as expected since
all augmenting paths relative to M2 have negative weight (see Fact 10).
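
The augmentation step of Facts 5 and 6 amounts to a set symmetric difference. The sketch below replays the last step of Example 2 under stated assumptions: the edge weights are those that can be read off the arithmetic in Examples 1 and 2 (the figure may contain further edges), and the helper names are hypothetical.

```python
# Edge weights inferred from the worked examples above (an assumption: only
# edges mentioned in the text are listed; the figure may contain more).
WEIGHT = {
    frozenset({1, 2}): 2, frozenset({1, 4}): 1, frozenset({1, 5}): 6,
    frozenset({2, 3}): 4, frozenset({2, 5}): 7, frozenset({3, 5}): 5,
    frozenset({3, 6}): 1, frozenset({4, 5}): 5, frozenset({5, 6}): 3,
}


def edges(*pairs):
    """Build a set of undirected edges from vertex pairs."""
    return {frozenset(p) for p in pairs}


def wt_matching(matching):
    """Weight of a matching: the sum of its edge weights."""
    return sum(WEIGHT[e] for e in matching)


def wt_path(path, matching):
    """Weight of an alternating path: free edge weights minus matched edge weights."""
    free = sum(WEIGHT[e] for e in path if e not in matching)
    matched = sum(WEIGHT[e] for e in path if e in matching)
    return free - matched


def augment(matching, path):
    """Return the symmetric difference of the matching and the path."""
    return matching ^ path


M2 = edges((1, 5), (2, 3))
Q2 = edges((1, 4), (1, 5), (2, 5), (2, 3), (3, 6))
M3 = augment(M2, Q2)
print(sorted(tuple(sorted(e)) for e in M3), wt_matching(M3), wt_path(Q2, M2))
```

Representing each edge as a `frozenset` makes (1, 5) and (5, 1) the same edge, so the `^` operator on Python sets gives the symmetric difference directly.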


10.2.2 MATCHINGS IN BIPARTITE NETWORKS 

In this section, algorithms are described for finding maximum size and maximum weight 
matchings in bipartite networks (§8.1.3). Bipartite networks arise in a number of appli- 
cations, such as in assigning personnel to jobs or tracking objects over time. Moreover, 
the algorithms developed for the case of bipartite networks are considerably simpler 
than those needed for the case of general networks (§10.2.3). 

Definitions: 

Let G = (X ∪ Y, E) be a bipartite network with n vertices and m edges, and edge
weights w_xy.

If S ⊆ X then Γ(S) = { y ∈ Y | (x, y) ∈ E for some x ∈ S } is the set of vertices in Y
adjacent to some vertex of S.

A complete matching from X to Y in G = (X ∪ Y, E) is a matching M in which each
vertex of X is incident with an edge of M.

The directed two-terminal flow network G' associated with G = (X ∪ Y, E) is
defined by adding new vertices s and t, as well as arcs (s, x) for each x ∈ X and
arcs (y, t) for each y ∈ Y. All other arcs (x, y) of G' correspond to edges (x, y) of G
where x ∈ X and y ∈ Y. Every arc of G' has capacity 1.

Algorithm 1: Bipartite matching algorithm.

input: undirected bipartite network G = (X ∪ Y, E)
output: maximum size matching M
M := ∅
while true do
    let S1 consist of all free vertices of X
    mark all vertices of S1 as seen
    while there are unseen vertices of G do
        S2 := { y | (x, y) ∈ E, x ∈ S1, y unseen }
        if some y ∈ S2 is free then
            an augmenting path to y has been found
            mark all remaining vertices of G as seen
        else mark all vertices of S2 as seen
            S1 := { x | (y, x) ∈ M, y ∈ S2, x unseen }
            mark all vertices of S1 as seen
    if an augmenting path P has been found then M := M Δ P
    else terminate with matching M
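
The augmenting path idea behind Algorithm 1 can be sketched in a few lines. This is a simplified variant, not a transcription of Algorithm 1: it searches for each augmenting path depth-first from one free vertex of X at a time, rather than level by level from all free vertices at once. The function name and the sample graph are hypothetical.

```python
def maximum_bipartite_matching(adj):
    """Maximum size matching; adj maps each vertex of X to its neighbours in Y."""
    mate_of_y = {}  # y -> the vertex x currently matched to y

    def try_augment(x, seen):
        # Depth-first search for an augmenting path that starts at x.
        for y in adj[x]:
            if y in seen:
                continue
            seen.add(y)
            # Either y is free, or the current mate of y can be re-matched elsewhere.
            if y not in mate_of_y or try_augment(mate_of_y[y], seen):
                mate_of_y[y] = x
                return True
        return False

    for x in adj:
        try_augment(x, set())
    return {(x, y) for y, x in mate_of_y.items()}


# Hypothetical graph in which vertices 1, 2, 4 compete for the two vertices a, c.
graph = {1: ["a", "c"], 2: ["a"], 3: ["b", "d"], 4: ["c"]}
print(sorted(maximum_bipartite_matching(graph)))
```

Each call to `try_augment` scans every edge at most once, so this sketch performs O(mn) work overall.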


Facts: 

1. Hall’s theorem: G = (X ∪ Y, E) has a complete matching from X to Y if and only
if |Γ(S)| ≥ |S| holds for every S ⊆ X. In words, a complete matching exists precisely
when every set of vertices in X is adjacent to at least an equal number of vertices in Y.
(Philip Hall, 1904-1982.)

2. Sufficient condition for a complete matching: Suppose there exists some k such
that deg(x) ≥ k ≥ deg(y) holds in G = (X ∪ Y, E) for all x ∈ X and y ∈ Y. Then G
has a complete matching from X to Y.

3. There is a one-to-one correspondence between matchings of size k in G and integral
flows (§10.4.1) of value k in the associated two-terminal flow network G'.

4. A maximum flow in G', and thereby a maximum size matching of G, can be found
in O(m√n) time.

5. Suppose that costs are added to the two-terminal flow network G', using c_ij = 0 if
i = s or j = t, and c_ij = −w_ij otherwise. By starting with the flow (§10.5.1) x = 0,
the successive shortest path algorithm (§10.5.2) can be repeatedly applied to G' until a
shortest augmenting path has nonnegative cost. The resulting minimum cost flow will yield
(via Fact 3) a matching with maximum weight.

6. Bipartite matching algorithm: This method (Algorithm 1), based on §10.2.1 Fact 7,
produces a maximum size matching of the bipartite network G = (X ∪ Y, E). Each
iteration involves a modified breadth first search of G, starting with the free vertices in
the set X. All vertices of G are structured into levels that alternate between free and
matched edges.

7. Algorithm 1 can be implemented to run in O(mn) time.

8. Bipartite weighted matching algorithm: This method (Algorithm 2), based on
§10.2.1 Facts 9 and 10, produces a maximum weight matching of G = (X ∪ Y, E). Each
iteration develops a longest path tree in G, rooted at the set of free vertices in X. The
tentative largest weight of a path from a free vertex in X to vertex j is maintained in
the label d(j).

9. Algorithm 2 can be implemented to run in O(mn) time.

Algorithm 2: Bipartite weighted matching algorithm.

input: undirected bipartite network G = (X ∪ Y, E), weights w_e
output: maximum weight matching M
M := ∅
while true do
    let S1 consist of all free vertices of X
    d(j) := 0 for j ∈ S1, d(j) := −∞ otherwise
    while S1 ≠ ∅ do
        S2 := ∅
        for (x, y) ∈ E − M with x ∈ S1 do
            if d(x) + w_xy > d(y) then
                d(y) := d(x) + w_xy, S2 := S2 ∪ {y}
        S1 := ∅
        for (y, x) ∈ M with y ∈ S2 do
            if d(y) − w_yx > d(x) then
                d(x) := d(y) − w_yx, S1 := S1 ∪ {x}
    y := a free vertex with maximum label d(y)
    P := the associated augmenting path
    if d(y) > 0 then M := M Δ P
    else terminate with matching M
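
The labeling scheme of Algorithm 2 can be sketched as follows. This is an illustrative reading of the pseudocode rather than a reference implementation: the function name is hypothetical, predecessor bookkeeping (`pred`) is added so that the augmenting path can be traced back, and the sample weights are an assumption, chosen to be consistent with the labels quoted in Example 5 of this subsection.

```python
def max_weight_bipartite_matching(X, Y, w):
    """w maps (x, y), with x in X and y in Y, to the edge weight w_xy."""
    mate = {}           # vertex -> its mate, stored in both directions
    path_weights = []   # weight of each augmenting path that was used
    while True:
        d = {v: float("-inf") for v in list(X) + list(Y)}
        pred = {}       # pred[y] = the vertex x from which y received its label
        S1 = {x for x in X if x not in mate}
        for x in S1:
            d[x] = 0
        while S1:
            S2 = set()
            for (x, y), wxy in w.items():        # free edges leaving S1
                if x in S1 and mate.get(x) != y and d[x] + wxy > d[y]:
                    d[y], pred[y] = d[x] + wxy, x
                    S2.add(y)
            S1 = set()
            for y in S2:                          # matched edges leaving S2
                x = mate.get(y)
                if x is not None and d[y] - w[(x, y)] > d[x]:
                    d[x] = d[y] - w[(x, y)]
                    S1.add(x)
        free_y = [y for y in Y if y not in mate]
        if not free_y:
            break
        y = max(free_y, key=lambda v: d[v])
        if d[y] <= 0:
            break                                 # no augmenting path of positive weight
        path_weights.append(d[y])
        while y is not None:                      # flip the edges along the path
            x = pred[y]
            previous_mate = mate.get(x)
            mate[x], mate[y] = y, x
            y = previous_mate
    return {(x, mate[x]) for x in X if x in mate}, path_weights


# Weights assumed for illustration, consistent with the labels in Example 5.
w = {(1, "a"): 4, (1, "b"): 1, (2, "b"): 4, (2, "c"): 5, (3, "a"): 6, (3, "c"): 5}
matching, weights = max_weight_bipartite_matching([1, 2, 3], ["a", "b", "c"], w)
print(sorted(matching), weights)
```

The list of path weights returned alongside the matching makes the nonincreasing behavior stated in §10.2.1 Fact 10 directly visible.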


10. Stable marriage problem: A variation of the bipartite matching problem is the
stable marriage problem, defined for a set X of n men and a set Y of n women. Each person has
a strict ranking of the n people of the opposite sex. A perfect matching is stable if it
is impossible to find a man and a woman who are not matched to each other, yet each
of whom prefers the other to his or her current mate. For every set of rankings, a
stable matching exists and can be found using a greedy algorithm [AhMaOr93].
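
One well-known greedy method for the stable marriage problem is the deferred acceptance procedure of Gale and Shapley; the text above does not say which greedy algorithm [AhMaOr93] presents, so the sketch below should be read as one possible instance. The function names and the sample rankings are hypothetical.

```python
def stable_marriage(men_prefs, women_prefs):
    """Deferred acceptance with men proposing; rankings are lists, best first."""
    rank = {w: {m: r for r, m in enumerate(prefs)} for w, prefs in women_prefs.items()}
    next_choice = {m: 0 for m in men_prefs}   # index of the next woman m proposes to
    fiance = {}                               # woman -> man she currently holds
    free_men = list(men_prefs)
    while free_men:
        m = free_men.pop()
        w = men_prefs[m][next_choice[m]]
        next_choice[m] += 1
        if w not in fiance:
            fiance[w] = m
        elif rank[w][m] < rank[w][fiance[w]]:
            free_men.append(fiance[w])        # w trades up; her old partner is free again
            fiance[w] = m
        else:
            free_men.append(m)                # w rejects m; he proposes again later
    return {m: w for w, m in fiance.items()}


def is_stable(matching, men_prefs, women_prefs):
    """Check that no man and woman prefer each other to their assigned mates."""
    husband = {w: m for m, w in matching.items()}
    for m, prefs in men_prefs.items():
        for w in prefs:
            if w == matching[m]:
                break                         # the remaining women rank below m's mate
            if women_prefs[w].index(m) < women_prefs[w].index(husband[w]):
                return False                  # (m, w) is a blocking pair
    return True


men = {"A": ["x", "y", "z"], "B": ["y", "x", "z"], "C": ["x", "y", "z"]}
women = {"x": ["B", "A", "C"], "y": ["A", "B", "C"], "z": ["A", "B", "C"]}
result = stable_marriage(men, women)
print(result, is_stable(result, men, women))
```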


Examples: 


1. Drug testing: A drug company is testing n antibiotics on n volunteer patients 
in a hospital. Some of the patients have known allergic reactions to certain of these 
antibiotics. To determine whether there is a feasible assignment of the n different 
antibiotics to n different patients, construct the bipartite network G = (X ∪ Y, E),
where X is the set of antibiotics and Y is the set of patients. An edge (i, j) ∈ E exists
when patient j is not allergic to antibiotic i. A complete matching of G is then sought. 


2. Part (a) of the following figure shows a bipartite graph G with X = {1,2, 3, 4} and 
Y = {a, b, c, d}. 



(a) 



(b) 


Using Fact 1, there cannot be a complete matching from X to Y: if S = {1, 2, 4} then
Γ(S) = {a, c} and |Γ(S)| < |S|. There is, however, a (maximum) matching of size 3: for
example, {(1, c), (2, a), (3, d)}.

3. Part (b) of the figure of Example 2 shows a bipartite graph G with X = {1, 2, 3}
and Y = {a, b, c, d}. Since deg(x) ≥ 2 ≥ deg(y) holds for all x ∈ X and y ∈ Y, there
must exist a complete matching from X to Y. One such complete matching is given by
{(1, a), (2, c), (3, b)}.

4 . Algorithm 1 is used to find a maximum size matching in the bipartite graph of 
part (a) of the following figure. 



(a) (b) (c) (d) 

Relative to the initial empty matching, all vertices of X are free so S1 = {1, 2, 3, 4},
giving S2 = {a, b, c}. In particular, vertex a ∈ S2 is free and an augmenting path to a is
P = {(1, a)}. The resulting matching is M = {(1, a)}, shown in part (b) of the figure.

The second iteration of Algorithm 1 starts with S1 = {2, 3, 4}, giving S2 = {a, b, c}.
An augmenting path to the free vertex b is P = {(2, b)}, resulting in M = {(1, a), (2, b)};
see part (c) of the figure.

At the next iteration, S1 = {3, 4} and S2 = {a, b}. Since both vertices of S2 are
matched, the algorithm continues with S1 = {1, 2} and S2 = {c}. Since c ∈ S2 is
free, with augmenting path P = {(3, a), (a, 1), (1, c)}, the new matching produced is
M = {(1, c), (2, b), (3, a)}; see part (d) of the figure.

The fourth iteration produces S1 = {4}, S2 = {b}; S1 = {2}, S2 = {a, c}; and
finally S1 = {1, 3}, S2 = ∅. No further augmenting paths are found, and the algorithm
terminates with the maximum size matching M = {(1, c), (2, b), (3, a)}.

5 . Algorithm 2 is used to find a maximum weight matching in the bipartite network of 
part (a) of the following figure. 



(a) (b) (c) (d) 

Relative to the initial empty matching, all vertices of X are free so S1 = {1, 2, 3}, with
d(1) = d(2) = d(3) = 0. The labels on vertices a, b, c are updated to d(a) = 6, d(b) = 4,
d(c) = 5, giving S2 = {a, b, c}. Since M = ∅ no further updates occur. The free
vertex a has maximum label, and the associated path P1 = {(3, a)} has wt(P1) = 6.
The resulting matching M = {(3, a)} is shown in part (b) of the figure; it
represents the largest weight matching of size 1, with wt(M) = 6.

The second iteration starts with S1 = {1, 2}. The labels on vertices a, b, c are
then updated to d(a) = 4, d(b) = 4, d(c) = 5, so S2 = {a, b, c}. Using the matched
edge (a, 3), vertex 3 has its label updated to d(3) = −2 and S1 = {3}. No further
updates occur, and free vertex c with maximum label d(c) = 5 is selected. This label
corresponds to the augmenting path P2 = {(2, c)}, with wt(P2) = 5. The new matching
is M = {(2, c), (3, a)}, with wt(M) = 11; see part (c).

At the third iteration, S1 = {1} and vertices a, b receive updated labels d(a) = 4,
d(b) = 1.

Subsequently, updates are made to produce d(3) = −2, d(c) = 3, d(2) = −2,
d(b) = 2.

Finally, the free vertex b is selected with d(b) = 2, corresponding to the augmenting
path P3 = {(1, a), (a, 3), (3, c), (c, 2), (2, b)} with wt(P3) = 2. This produces the maximum
weight matching M = {(1, a), (2, b), (3, c)}, with wt(M) = 13; see part (d). As
predicted by Fact 10 of §10.2.1, the weights of the augmenting paths are nonincreasing:
wt(P1) ≥ wt(P2) ≥ wt(P3).
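
Hall's condition (Fact 1), as used in Example 2, can be checked by brute force on small instances. The sketch below is only practical for small X, since it enumerates every subset; the function name is hypothetical, and the first sample graph is an assumption chosen so that S = {1, 2, 4} has Γ(S) = {a, c}, as in Example 2.

```python
from itertools import combinations


def hall_violation(adj):
    """Return a set S of X-vertices with |Γ(S)| < |S|, or None if there is none.

    adj maps each vertex of X to the set of its neighbours in Y.
    """
    xs = list(adj)
    for size in range(1, len(xs) + 1):
        for subset in combinations(xs, size):
            neighbours = set().union(*(adj[x] for x in subset))
            if len(neighbours) < len(subset):
                return set(subset)
    return None


# Hypothetical graph: vertices 1, 2, 4 are adjacent only to a and c.
blocked = {1: {"a", "c"}, 2: {"a"}, 3: {"b", "d"}, 4: {"c"}}
# Hypothetical graph that satisfies Hall's condition.
feasible = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d"}}
print(hall_violation(blocked), hall_violation(feasible))
```

When the function returns `None`, Hall's theorem guarantees a complete matching from X to Y; a returned set is a certificate that none exists.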


10.2.3 MATCHINGS IN NONBIPARTITE NETWORKS 

This section covers matchings in more general (nonbipartite) networks. Algorithms 
for constructing maximum size and maximum weight matchings are considerably more 
intricate than for bipartite networks. The important new concept is that of a “blossom” 
in a network. 

Definitions: 

Suppose P is an alternating path from a free vertex s in network G = (V, E). Then a
vertex v on P is even (outer) if the subpath P_sv of P joining s to v has even length.
Vertex v on P is odd (inner) if P_sv has odd length.

Suppose P is an alternating path from a free vertex s to an even vertex v and edge
(v, w) ∈ E joins v to another even vertex w on P. Then P ∪ {(v, w)} contains a unique
cycle, called a blossom.

A shrunken blossom results when a blossom B is collapsed into a single vertex b,
whereby every edge (x, y) with x ∉ B and y ∈ B is transformed into the edge (x, b).
The reverse of this process gives an expanded blossom.
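
The shrinking operation just defined can be sketched as a transformation on an edge list. This is an illustration only: the function name is hypothetical, and the sample edge list is an assumption containing just the edges named in Example 1 of this subsection (the figure may contain others).

```python
def shrink_blossom(edge_list, blossom, new_vertex="b"):
    """Collapse the vertex set `blossom` into the single vertex `new_vertex`.

    Edges with both ends inside the blossom disappear; every edge (x, y) with
    x outside and y inside the blossom becomes the edge (x, new_vertex).
    """
    shrunk = set()
    for u, v in edge_list:
        u2 = new_vertex if u in blossom else u
        v2 = new_vertex if v in blossom else v
        if u2 != v2:  # drop edges internal to the blossom
            shrunk.add(frozenset((u2, v2)))
    return shrunk


# Edges named in Example 1 (assumed to be the whole graph for this sketch).
G = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3), (4, 6)]
G_B = shrink_blossom(G, blossom={3, 4, 5})
print(sorted(tuple(sorted(e, key=str)) for e in G_B))
```

A full matching algorithm must also remember how to expand the blossom again, which this sketch does not attempt.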

Facts: 

1. Every blossom B has odd length 2k + 1 and contains k matched edges, for some
k ≥ 1.

2. A bipartite network contains no blossoms.

3. Edmonds’ theorem: Suppose network G_B is formed from G by collapsing blossom B.
Then G_B contains an augmenting path if and only if G does. (J. Edmonds, 1965.)

4. General matching algorithm: This method (Algorithm 3), based on Fact 7 of §10.2.1,
produces a maximum size matching of G. At each iteration, a forest of trees is grown,
rooted at the free vertices of G, to find an augmenting path. As encountered, blossoms B
are shrunk, with the search continued in the resulting network G_B.

Algorithm 3: General matching algorithm.

input: undirected network G = (V, E)
output: maximum size matching M
M := ∅
{Start iteration}
mark all free vertices as even
mark all matched vertices as unreached
mark all free edges as unexamined
while there are unexamined edges (v, w) and no augmenting path is found
    mark (v, w) as examined
    {Case 1}
    if v is even and w is unreached then
        mark w as odd and its mate z as even
        extend the forest by (v, w) and the matched edge (w, z)
    {Case 2}
    if v, w are even and they belong to different subtrees then
        an augmenting path has been found
    {Case 3}
    if v, w are even and they belong to the same subtree then
        a blossom B has been found
        shrink B to an even vertex b
if an augmenting path P has been found then
    M := M Δ P
    go to {Start iteration}
else terminate with matching M


5. Algorithm 3 was initially proposed by Edmonds [Ed65a] with a time bound of O(n⁴).

6. An improved implementation of Algorithm 3 runs in O(nm) time.

7. There are other algorithms for maximum size matchings in nonbipartite networks:

• an algorithm of Gabow [Ga76], which runs in O(n³) time;

• an algorithm of Micali and Vazirani [MiVa80], which runs in O(m√n) time.

Computer codes for these algorithms (in C, Pascal, and Fortran) can be found at these
sites:

ftp://dimacs.rutgers.edu/pub/netflow/matching/
ftp://ftp.zib.de/pub/Packages/mathprog/matching/

8. General weighted matching algorithms: More complicated algorithms are required
for solving weighted matching problems. The first such algorithm, also involving blossoms,
was developed by Edmonds [Ed65b] and has a time bound of O(n⁴).

9. Improved algorithms exist for the weighted matching problem, with running times
O(n³) and O(nm log n) respectively. Code (in C) for the first of these algorithms can
be found at these sites:

ftp://dimacs.rutgers.edu/pub/netflow/matching/
ftp://ftp.zib.de/pub/Packages/mathprog/matching/

Examples:

1. In part (a) of the following figure, P = {(1,2), (2, 3), (3, 4), (4, 5)} is an alternating 
but not augmenting path, relative to the matching M = {(2, 3), (4, 5)}. 



Relative to this path, vertices 1, 3, 5 are even while vertices 2, 4 are odd. Since (3, 5)
is an edge joining two even vertices on P, the blossom B = {(3, 4), (4, 5), (5, 3)} is
formed. On the other hand, Q = {(1, 2), (2, 3), (3, 5), (5, 4), (4, 6)} is an augmenting
path relative to M, so that M Δ Q = {(1, 2), (3, 5), (4, 6)} is a matching of larger size,
in fact a matching of maximum size. Notice that relative to path Q, vertices 1, 3, 4 are
even while vertices 2, 5, 6 are odd.


2. Shrinking the blossom B relative to path P in part (a) of the figure of Example 1
produces the network G_B shown in part (b) of that figure. The path P_B =
{(1, 2), (2, b), (b, 6)} is now augmenting in G_B. By expanding P_B so that (2, 3) remains
matched and (4, 6) remains free, the augmenting path Q = {(1, 2), (2, 3), (3, 5), (5, 4),
(4, 6)} in G is obtained.

3 . Algorithm 3 is applied to the nonbipartite network shown in part (a) of the following 
figure. Suppose the matching M = {(3,4), (6,8)} of size 2 is already available. 





(a) (b) (c) (d)


Iteration 1 : The free vertices 1, 2, 5, 7 are marked as even, and the matched vertices 3, 4, 
6, 8 are marked as unreached. The initial forest consists of the isolated vertices 1, 2, 5, 7. 
• If the free edge (2, 3) is examined, then Case 1 applies, so vertex 3 is marked odd 
and vertex 4 even; the free edge (2,3) and the matched edge (3,4) are added 
to the forest. 


© 2000 by CRC Press LLC 


• If the free edge (4, 7) is examined, then Case 2 applies, and the augmenting
path P = {(2, 3), (3, 4), (4, 7)} is found. Using P the new matching M =
{(2, 3), (4, 7), (6, 8)} of size 3 is obtained; see part (b) of the figure.

Iteration 2: The forest is initialized with the free (even) vertices 1, 5.

• If the free edge (1, 2) is examined, then Case 1 applies, so vertex 2 is marked odd
and vertex 3 even; edges (1, 2) and (2, 3) are added to the forest.

• Examining in turn the free edges (3, 4) and (7, 6) makes 4, 6 odd vertices and 7, 8
even. Edges (3, 4), (4, 7), (7, 6), (6, 8) are then added to the subtree rooted at 1.

• If edge (7, 8) is examined, Case 3 applies and the blossom B = {(7, 6), (6, 8), (8, 7)}
is detected and shrunk; part (c) of the figure shows the resulting G_B. The
current subtree rooted at 1 now becomes {(1, 2), (2, 3), (3, 4), (4, b)}.

• If the free edge (b, 5) is examined, then Case 2 applies and the augmenting path
{(1, 2), (2, 3), (3, 4), (4, b), (b, 5)} is found in G_B. The corresponding augmenting
path in G is P = {(1, 2), (2, 3), (3, 4), (4, 7), (7, 8), (8, 6), (6, 5)}. Forming
M Δ P produces the new matching {(1, 2), (3, 4), (7, 8), (5, 6)}, a maximum size
matching; see part (d) of the figure.


10.2.4 APPLICATIONS 


Matching problems, in both bipartite and nonbipartite networks, are useful models in a 
number of applied areas. This subsection presents some representative applications of 
matchings. 

Applications: 

1. Linear assignment problem: There are n applicants to be assigned to n jobs, with
each job being filled with exactly one applicant. The weight w_ij measures the suitability
of applicant i for job j. Finding a valid assignment with the best overall weight is a
weighted matching problem on the bipartite network G = (X ∪ Y, E), where X is the
set of applicants and Y is the set of jobs.

2. Personnel assignment: Pairs of pilots are to be assigned to a fleet of aircraft serving 
international routes. Pilots i and j are considered compatible if they are fluent in a 
common language and have comparable flight training. Form the network G whose 
vertices represent the pilots and with edges between compatible pairs of pilots. The 
problem of flying the largest number of aircraft with compatible pilots can then be 
solved as a maximum size matching problem on G. 

3 . Other examples of Application 2 occur in assigning police officers sharing beats, 
matching pairs of compatible roommates, and assigning pairs of employees with com- 
plementary skills to specific projects. 

4. Pruned chessboards: Several squares (2k in all) are removed from an n × n chessboard,
yielding the pruned chessboard V. Is it then possible to cover the squares of V
using nonoverlapping dominoes, with no squares left uncovered? This can be formulated
as a matching problem on the bipartite network G = (R ∪ B, E), where R is the set
of red squares and B is the set of black squares in V. An edge joins r ∈ R to b ∈ B
if squares r and b share a common side. Each set of nonoverlapping dominoes on V
corresponds to a matching in G.

All squares of V can be covered using nonoverlapping dominoes if and only if the
maximum size matching in G has size n²/2 − k. More generally, the maximum size
matching in G explicitly provides a way to cover the maximum number of squares of V
using nonoverlapping dominoes.

5. Target tracking: The movements of n objects (such as submarines or missiles) are to
be followed over time. The locations of the set of objects are known at two distinct times,
though without identification of the individual objects. Suppose X = {x_1, x_2, ..., x_n}
and Y = {y_1, y_2, ..., y_n} represent the spatial coordinates of the objects detected at
times t and t + Δt. If Δt is sufficiently small, then the Euclidean distance between a
given object’s position at these two times should be relatively small. To aid in identifying
the objects (as well as their velocities and directions of travel), a pairing between set X
and set Y is desired that minimizes the sum of Euclidean distances.

This can be formulated as a maximum weight matching problem on the complete
bipartite network G = (X ∪ Y, E), where the edge (i, j) indicates pairing position x_i with
position y_j. The weight of this edge is the negative of the Euclidean distance between x_i
and y_j. A maximum weight matching of size n in G then provides an optimal (minimum
distance) pairing of observations at the two times t and t + Δt.

6. Crew scheduling: Bus drivers are hired to work two four-hour shifts each day. 
Union rules require a certain minimum amount of time between the shifts that a driver 
can work. There are also costs associated with getting the driver between the ending 
location of the first shift and the starting location of the second shift. 

The problem of optimally combining pairs of shifts that satisfy union regulations 
and incur minimum total cost can be formulated as a maximum weight matching prob- 
lem. Namely, define the network G with vertices representing each shift that must be 
covered and edges between pairs of compatible shifts (satisfying union regulations). The 
weight of edge (i, j) is the negative of the cost of assigning a single driver to shifts i
and j. It is convenient also to add edges (i, i) to G to represent the possibility of needing
a part-time driver to cover a single shift; edge (i, i) is given a sufficiently large negative 
weight to discourage single-shift assignments unless absolutely necessary. 

A maximum weight matching in the network G then provides a minimum cost 
pairing of shifts for the bus drivers. 

7. Snowplowing streets: The streets of an area of a city are to be plowed by a single 
snowplow. Let G be the network representing the street system of the city, with vertices 
representing street intersections and edges representing streets. Associated with each 
street (i, j) is its length d_ij.

If all vertices of G have even degree, then G is an Eulerian graph (§8.4.3) and a 
circuit that traverses each edge (street) exactly once can be found using the algorithms 
in §8.4.3. 

Otherwise, a closed walk of G that covers each street at least once is needed, and
one with minimum total length Σ d_ij is desired. Let N be the set of vertices of G
having odd degree; by Fact 4 of §8.1.1, |N| is an even integer 2k. Form the complete
network H = (N, E) in which the weight of edge (i, j) is the negative of the shortest
path distance (§10.3.1) between vertices i and j in G. Determine a maximum weight
(perfect) matching M of size k in H. For each (i, j) in M, add the edges of the shortest
path between i and j to the network G, forming the network G'. Every vertex of G'
now has even degree, and an Euler circuit of G' provides the required minimum cost
traversal of the city streets.

This problem is known as the (undirected) Chinese postman problem. A directed
version of the problem is discussed in §10.5.3, Application 4.

8. Additional applications, with reference sources, are given in the following table.

application                          references
medical residents assignment         [AhMaOr93]
school bus driver assignment         [AhMaOr93]
oil well drilling                    [LoPu86], [AhMaOr93], [Ge95]
chemical bonds                       [AhMaOr93]
inventory depletion                  [AhMaOr93]
scheduling on machines               [LoPu86], [AhMaOr93]
ranks of matrices                    [AhMaOr93]
doubly stochastic matrices           [LoPu86]
nonnegative matrices                 [LoPu86]
basketball conference scheduling     [EvMi92]
major league umpire scheduling       [EvMi92]
project scheduling                   [Ge95]
plotting street maps                 [Ge95]
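
The pruned chessboard formulation of Application 4 can be sketched as a small program. This is an illustration under stated assumptions: the function names are hypothetical, squares are addressed as (row, column) pairs starting at 0, and the matching is found by a simple depth-first augmenting path search rather than by any particular algorithm from §10.2.2.

```python
def max_domino_cover(n, removed):
    """Maximum number of nonoverlapping dominoes on an n x n board from which
    the squares in `removed` have been deleted."""
    squares = {(r, c) for r in range(n) for c in range(n)} - set(removed)
    red = [s for s in squares if (s[0] + s[1]) % 2 == 0]   # one colour class
    mate = {}                                              # black square -> red square

    def neighbours(square):
        r, c = square
        for cand in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if cand in squares:
                yield cand

    def try_augment(square, seen):
        # Depth-first search for an augmenting path from a red square.
        for nb in neighbours(square):
            if nb in seen:
                continue
            seen.add(nb)
            if nb not in mate or try_augment(mate[nb], seen):
                mate[nb] = square
                return True
        return False

    return sum(try_augment(square, set()) for square in red)


def can_tile(n, removed):
    """True if every remaining square can be covered by dominoes."""
    return 2 * max_domino_cover(n, removed) == n * n - len(removed)


print(can_tile(4, [(0, 0), (0, 1)]), can_tile(4, [(0, 0), (3, 3)]))
```

Removing two opposite corners deletes two squares of the same colour, so the two colour classes become unequal in size and no complete cover is possible, which the second call illustrates.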


10.3 SHORTEST PATHS 

The shortest path problem requires finding paths of minimum cost (or length) from 
a specified source vertex to every other vertex in a directed network. Shortest path 
problems lie at the heart of network flows (§10.4-10.5). They are important both to 
researchers and to practitioners because: 

• they arise frequently in application settings where material is to be sent between
specified points as quickly, as cheaply, or as reliably as possible;

• they arise as subproblems when solving many combinatorial and network optimization
problems;

• they can be solved very efficiently.


10.3.1 BASIC CONCEPTS 


Definitions: 

A directed network is a weighted graph G = (V, E), where V is the set of vertices
and E is the set of arcs (directed edges). Each arc (i, j) ∈ E has an associated cost (or
weight, length) c_ij. It is possible that certain of the c_ij are negative. Let n = |V| and
m = |E|.

The adjacency set A(i) for vertex i is the set of all arcs incident from i, written
A(i) = { (i, j) | (i, j) ∈ E }.

A directed path (§8.3.2) P has length Σ_{(i,j)∈P} c_ij.

A directed cycle (§8.3.2) W for which Σ_{(i,j)∈W} c_ij < 0 is called a negative cycle.

A shortest path from vertex s to vertex j is a directed path from s to j having
minimum length.

A directed out-tree is a tree rooted at vertex s in which all arcs are directed away 
from the root s. 

A shortest path tree is an out-tree T* rooted at vertex s with the property that the 
directed path in T* from s to any other vertex j is a shortest s-j path. 

A vector d(·) is called a vector of distance labels if for every vertex j ∈ V, d(j) is the
length of some directed path from the source vertex s to vertex j, with d(s) = 0. If these
labels are the lengths of shortest s-j paths, they are called shortest path distances.

The directed path P = [i_0, i_1, ..., i_r] from vertex i_0 to vertex i_r can be represented
using predecessor indices: pred(i_1) = i_0, pred(i_2) = i_1, ..., pred(i_r) = i_{r−1}.

Facts: 

1. Shortest paths are useful in a wide variety of applications, such as in efficient rout- 
ing of messages and distribution of goods, developing optimal investment strategies, 
scheduling personnel, and approximating piecewise linear functions (see §10.3.5). 

2. If P = [s, i_1, ..., i_r] is a shortest path from s to i_r then Q = [s, i_1, ..., i_k] is a shortest
path from s to i_k for each 1 ≤ k ≤ r.

3. Shortest path optimality conditions: The vector d(·) of distance labels represents
shortest path distances if and only if d(j) ≤ d(i) + c_ij for all (i, j) ∈ E.

4. If the network contains a negative cycle accessible from vertex s, then distance labels 
satisfying the conditions in Fact 3 do not exist. 

5. If the network does not contain any negative cycle, then (unique) distance labels 
satisfying the conditions in Fact 3 always exist. Furthermore, there is a shortest path 
tree T* realizing these shortest path distances. 

Examples: 

1. In the directed network of the following figure, arc costs are shown along each arc. 
Part (b) lists the nine paths from vertex 1 to vertex 6, together with their lengths. 
Path P4, with length 10, is the (unique) shortest path joining these two vertices. This 
path can be represented using the predecessor indices: pred(6) = 4, pred(4) = 3,
pred(3) = 2, pred(2) = 1. By Fact 2, the subpath Q = [1, 2, 3, 4] of P4 is a shortest path
from vertex 1 to vertex 4.

      Path                  Length
P1    [1, 2, 4, 6]          11
P2    [1, 2, 5, 6]          14
P3    [1, 2, 5, 4, 6]       13
P4    [1, 2, 3, 4, 6]       10
P5    [1, 2, 3, 5, 6]       13
P6    [1, 2, 3, 5, 4, 6]    12
P7    [1, 3, 4, 6]          12
P8    [1, 3, 5, 4, 6]       14
P9    [1, 3, 5, 6]          15

(b)

2. In the directed network of the following figure, arc costs are shown along each arc
and a set of distance labels is shown at each vertex. Part (b) gives paths from vertex
s = 1 whose lengths equal the corresponding distance labels. These distance labels do
not satisfy the optimality conditions of Fact 3 because for the arc (3, 5), d(5) > d(3) + c_35.
The out-tree T in this figure defined by predecessor indices pred(2) = 5, pred(3) = 1,
pred(4) = 2 and pred(5) = 3 has distance labels d = (0, 5, 5, 25, 0). It is a shortest path
tree rooted at vertex 1 since the optimality conditions of Fact 3 are satisfied: namely

5 ≤ 0 + 10 for arc (1, 2), 5 ≤ 5 + 10 for arc (2, 3), 0 ≤ 25 + 15 for arc (4, 5).

(a)

i    d(i)    path
2    10      [1, 2]
3    5       [1, 3]
4    30      [1, 2, 4]
5    45      [1, 2, 4, 5]

(b)
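
The optimality conditions of Fact 3 are easy to test mechanically. The sketch below checks both label vectors of Example 2; the arc costs are an assumption, reconstructed from the path lengths and inequalities quoted in that example (the figure itself may contain further arcs), and the function name is hypothetical.

```python
# Arc costs reconstructed from the numbers quoted in Example 2 (an assumption:
# the figure is not reproduced here, so this arc list may be incomplete).
COST = {(1, 2): 10, (1, 3): 5, (2, 3): 10, (2, 4): 20,
        (3, 5): -5, (4, 5): 15, (5, 2): 5}


def violated_arcs(d, cost):
    """Return the arcs (i, j) that violate d(j) <= d(i) + c_ij."""
    return [(i, j) for (i, j), c in cost.items() if d[j] > d[i] + c]


labels_from_table = {1: 0, 2: 10, 3: 5, 4: 30, 5: 45}   # part (b) of the figure
labels_from_tree = {1: 0, 2: 5, 3: 5, 4: 25, 5: 0}      # the out-tree T
print(violated_arcs(labels_from_table, COST))
print(violated_arcs(labels_from_tree, COST))
```

An empty list means every arc satisfies the condition, so the labels are shortest path distances, provided each label is the length of some path from the source.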


10.3.2 ALGORITHMS FOR SINGLE-SOURCE SHORTEST PATHS

This subsection discusses algorithms for finding shortest path trees from a given source 
vertex s in a directed network G with n vertices and m arcs. 

Facts: 

1. Label-correcting algorithm: A general label-correcting algorithm (Algorithm 1) is 
based on the shortest path optimality conditions (§10.3.1 Fact 3) and is a very popular 
algorithm to solve shortest path problems with arbitrary arc costs (L. R. Ford, 1956 and 
R. E. Bellman, 1958). 

2. Algorithm 1 maintains a list, LIST, of vertices with the property that if an arc (i, j)
violates the optimality condition, then LIST must contain vertex i. If LIST is empty,
then the current distance labels are optimal. Otherwise some vertex i is removed
from LIST and the arcs of A(i) are scanned. If an arc (i, j) ∈ A(i) violates the
optimality condition, then d(j) is updated appropriately.

3. When Algorithm 1 terminates, the nonzero predecessor indices define a shortest
path tree T* rooted at the source vertex: namely, T* = { (pred(i), i) | i ∈ V − {s} }.

4. Convergence: In Algorithm 1, vertices in LIST can be selected in any order and
the algorithm still converges finitely. If all arc costs are integers whose magnitudes are
bounded by a constant C, then the algorithm performs O(n²C) iterations and can be
implemented to run in O(nmC) time, regardless of the order in which vertices from
LIST are selected.

5. Queue implementation: Suppose in Algorithm 1 that LIST is maintained as a queue
(§17.1.2); that is, vertices in LIST are examined in a first-in-first-out (FIFO) order. This
specific implementation examines no vertex more than n − 1 times and runs in O(nm)
time. This is the best strongly polynomial-time algorithm to solve the shortest path
problem with arbitrary arc costs.

Algorithm 1: Label-correcting algorithm.

input: directed network G, source vertex s
output: shortest path tree T* rooted at s
d(s) := 0
pred(s) := 0
d(j) := ∞ for all j ∈ V − {s}
LIST := {s}
while LIST ≠ ∅
    remove a vertex i from LIST
    for each (i, j) ∈ A(i)
        if d(j) > d(i) + c_ij then
            d(j) := d(i) + c_ij
            pred(j) := i
            if j ∉ LIST then add j to LIST
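
A queue-based version of Algorithm 1 can be sketched as follows. The sketch makes several assumptions of its own: the function name and the adjacency format are hypothetical, `None` stands in for the predecessor index 0 of the source, an examination counter is added so that a vertex examined more than n − 1 times is reported as evidence of a negative cycle, and the sample arc costs are those reconstructed from Example 2 of §10.3.1.

```python
from collections import deque


def label_correcting(arcs, source):
    """FIFO label-correcting algorithm; arcs maps vertex i to a list of (j, c_ij)."""
    vertices = set(arcs) | {j for out in arcs.values() for j, _ in out}
    n = len(vertices)
    d = {v: float("inf") for v in vertices}
    pred = {v: None for v in vertices}
    examined = {v: 0 for v in vertices}
    d[source] = 0
    queue, in_queue = deque([source]), {source}
    while queue:
        i = queue.popleft()                 # first-in-first-out selection
        in_queue.discard(i)
        examined[i] += 1
        if examined[i] > max(n - 1, 1):     # too many examinations of one vertex
            raise ValueError("negative cycle reachable from the source")
        for j, c in arcs.get(i, []):
            if d[j] > d[i] + c:             # arc (i, j) violates the optimality condition
                d[j] = d[i] + c
                pred[j] = i
                if j not in in_queue:
                    queue.append(j)
                    in_queue.add(j)
    return d, pred


# Arc costs assumed from Example 2 of §10.3.1 (reconstructed, possibly incomplete).
network = {1: [(2, 10), (3, 5)], 2: [(3, 10), (4, 20)], 3: [(5, -5)],
           4: [(5, 15)], 5: [(2, 5)]}
dist, pred = label_correcting(network, 1)
print(dist, pred)
```

On this network the labels and predecessor indices agree with the out-tree T described in that example.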


6. Dequeue implementation: Suppose in Algorithm 1 that LIST is maintained as a 
dequeue (§17.1.2). Specifically, vertices are removed from the front of the dequeue, but 
vertices are added either at the front or at the rear. If the vertex has been in LIST 
earlier, the algorithm adds it to the front; otherwise, it adds the vertex to the rear. 
Empirical studies have found that the dequeue implementation is one of the most effi- 
cient algorithms to solve the shortest path problem in practice even though it is not a 
polynomial-time algorithm. 

7. Negative cycle detection: The queue implementation (Fact 5) of the label-correcting
algorithm can be used to detect the presence of a negative cycle. To do so, record the
number of times that the algorithm examines each vertex. If the algorithm examines
a vertex more than $n - 1$ times, there must exist a negative cycle. In this case, the
subgraph formed by the arcs $(\mathrm{pred}(i), i)$ will contain a negative cycle.
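
The queue implementation of Fact 5 and the examination count of Fact 7 can be sketched together in Python. This is an illustrative sketch, not code from the Handbook: the function name `fifo_label_correcting`, the arc-list input format, and the sample arc costs are assumptions. The threshold is taken as max(n − 1, 1) so that a single-vertex network is not misreported.

```python
from collections import deque
from math import inf


def fifo_label_correcting(n, arcs, s):
    """FIFO label-correcting algorithm on vertices 1..n.

    arcs is a list of (i, j, cost) triples; costs may be negative.
    Returns (d, pred, has_negative_cycle); pred value 0 means "none".
    """
    adj = {i: [] for i in range(1, n + 1)}
    for i, j, c in arcs:
        adj[i].append((j, c))

    d = {i: inf for i in range(1, n + 1)}
    pred = {i: 0 for i in range(1, n + 1)}
    examined = {i: 0 for i in range(1, n + 1)}
    d[s] = 0

    limit = max(n - 1, 1)        # Fact 7 threshold (assumed >= 1)
    queue = deque([s])           # LIST kept in first-in-first-out order
    in_list = {s}
    while queue:
        i = queue.popleft()
        in_list.discard(i)
        examined[i] += 1
        if examined[i] > limit:
            return d, pred, True     # a negative cycle must exist
        for j, c in adj[i]:
            if d[j] > d[i] + c:      # optimality condition violated
                d[j] = d[i] + c
                pred[j] = i
                if j not in in_list:
                    queue.append(j)
                    in_list.add(j)
    return d, pred, False


# Hypothetical data: a small network with one negative arc.
arcs = [(1, 2, 10), (1, 3, 5), (2, 3, 1), (2, 4, 20), (3, 5, -5)]
d, pred, cycle = fifo_label_correcting(5, arcs, 1)
```

A membership set is kept beside the queue so that the test "j not in LIST" takes constant time.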

8. A variety of computer codes (in Fortran) that implement the label-correcting algo- 
rithm for shortest paths can be found at the following sites: 

http://www.netlib.org/toms/562
ftp://ftp.zib.de/pub/Packages/mathprog/netopt-bertsekas/
http://www.mat.uc.pt/~eqvm/cientificos/fortran/codigos.html
http://www.neci.nj.nec.com/homepages/avg/soft/soft.html

9. Dijkstra’s algorithm (1959): Dijkstra’s algorithm (Algorithm 2) is a popular
algorithm for solving shortest path problems with nonnegative arc costs (E. W. Dijkstra,
born 1930).

10. Algorithm 2 performs two steps repeatedly: vertex selection and distance update.
The vertex selection step chooses a vertex $i$ with smallest distance label in LIST for
examination. The distance update step scans each arc $(i,j) \in A(i)$ and updates the
distance label $d(j)$, if necessary, to restore the optimality condition for arc $(i,j)$.

11. Whenever a vertex is selected for examination in Algorithm 2, its distance label is
the shortest path distance from $s$; consequently, each vertex is examined only once.

12. Using a simple array or linked list representation of LIST, vertex selections take a
total of $O(n^2)$ time and distance updates take a total of $O(m)$ time. This implementation
of Algorithm 2 runs in $O(n^2)$ time.
Algorithm 2: Dijkstra’s algorithm.

input: directed network G with c_ij ≥ 0, source vertex s
output: shortest path tree T* rooted at s

d(s) := 0
pred(s) := 0
d(j) := ∞ for all j ∈ V − {s}
LIST := V
while LIST ≠ ∅
    {Vertex selection}
    let i ∈ LIST be a vertex for which d(i) = min{ d(j) | j ∈ LIST }
    remove vertex i from LIST
    {Distance update}
    for each (i,j) ∈ A(i)
        if d(j) > d(i) + c_ij then
            d(j) := d(i) + c_ij
            pred(j) := i


13. By using more sophisticated data structures, the efficiency of Dijkstra’s algorithm 
can be improved. Currently, two of the best implementations: 

• use Fibonacci heaps, giving $O(m + n \log n)$ running time [FrTa84];

• use radix heaps, giving $O(m + n (\log C)^{1/2})$ running time [AhEtal90].
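
As a rough illustration of how a priority queue replaces the linear scan in the vertex selection step, the following Python sketch uses the standard-library binary heap `heapq`. Neither Fibonacci heaps nor radix heaps are used here; with a binary heap and lazy deletion of stale entries the running time is $O(m \log n)$, not the bounds quoted above. The function name `dijkstra` and the input format are assumptions, and the sample arc costs are inferred from the update column of the table in Example 2 rather than read from its figure.

```python
import heapq
from math import inf


def dijkstra(n, arcs, s):
    """Dijkstra's algorithm with a binary heap and lazy deletion.

    Vertices are 1..n; arcs is a list of (i, j, cost) with cost >= 0.
    Returns (d, pred) as dictionaries; pred value 0 means "none".
    """
    adj = {i: [] for i in range(1, n + 1)}
    for i, j, c in arcs:
        if c < 0:
            raise ValueError("arc costs must be nonnegative")
        adj[i].append((j, c))

    d = {i: inf for i in range(1, n + 1)}
    pred = {i: 0 for i in range(1, n + 1)}
    d[s] = 0
    done = set()                 # vertices already examined
    heap = [(0, s)]
    while heap:
        dist, i = heapq.heappop(heap)    # vertex selection
        if i in done:
            continue                     # stale heap entry, skip it
        done.add(i)
        for j, c in adj[i]:              # distance update
            if d[j] > dist + c:
                d[j] = dist + c
                pred[j] = i
                heapq.heappush(heap, (d[j], j))
    return d, pred


# Arc costs inferred from the table of Example 2 (an assumption).
arcs = [(1, 2, 7), (1, 3, 4), (3, 2, 2), (3, 5, 5),
        (2, 4, 3), (2, 5, 1), (5, 3, 2), (4, 5, 4)]
d, pred = dijkstra(5, arcs, 1)
```

With these inferred costs the distance labels agree with the lengths (0, 6, 4, 9, 7) reported in Example 2.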

14. Empirically, the fastest implementation of Dijkstra’s algorithm is due to R. Dial
[Di69], and it runs in $O(m + nC)$ time.

15. A comprehensive discussion of several implementations of Dijkstra’s algorithm and 
the label-correcting algorithm is presented in [AhMaOr93]. 

16. A variety of computer codes (in C, Pascal, and Fortran) that implement Dijkstra’s 
algorithm for shortest paths can be found at the following sites: 

ftp://ftp.zib.de/pub/Packages/mathprog/netopt-bertsekas/
http://www.mat.uc.pt/~eqvm/cientificos/fortran/codigos.html
http://orly1.snu.ac.kr/software/
http://www.neci.nj.nec.com/homepages/avg/soft/soft.html

17. A useful extension of the shortest path problem involves finding the $k$ shortest
paths in a network. The case $k = 1$ corresponds to a shortest path. More generally,
the $k$th shortest path is one having the $k$th smallest length among all paths from $s$ to $t$.
Several algorithms for solving the problem of finding the $k$ shortest paths are discussed
in [EvMi92].


Examples: 

1. The following figure illustrates three iterations of the label-correcting algorithm 
applied to Example 2 of §10.3.1. LIST is maintained as a queue. 

In the first iteration, the source vertex s = 1 is examined, and the distance labels 
of vertices 2 and 3 are decreased to 10 and 5, respectively. At this point, LIST = [2, 3]. 
In the second iteration, vertex 2 is removed from LIST and examined. The distance
label of vertex 4 decreases to 30, while the distance label of vertex 3 remains unchanged,
giving LIST = [3,4].

Next, vertex 3 is removed from LIST and examined, triggering a reduction of the
distance label of vertex 5 to 0. At this point the current out-tree, defined by the
predecessor indices pred(·), consists of arcs (1,2), (2,4), (1,3), and (3,5).


2. Part (b) of the following figure shows the application of Dijkstra’s algorithm to the 
directed network drawn in part (a) with nonnegative arc costs and s = 1. Shown at each 
iteration are the current distance labels, the vertex selected, and the resulting distance 
updates. Upon termination, the shortest path lengths d = (0,6,4, 9, 7) are realized by 
the optimal tree T* having arcs (1,3), (3,2), (2,4), and (2,5). 



iteration   labels          select   updates
1           (0,∞,∞,∞,∞)     1        d(2) = 7, d(3) = 4
2           (0,7,4,∞,∞)     3        d(2) = min{7,6} = 6, d(5) = 9
3           (0,6,4,∞,9)     2        d(4) = 9, d(5) = min{9,7} = 7
4           (0,6,4,9,7)     5        d(3) = min{4,9} = 4
5           (0,6,4,9,7)     4        d(5) = min{7,13} = 7

(b)
10.3.3 ALGORITHMS FOR ALL-PAIRS SHORTEST PATHS


This section discusses algorithms for finding shortest path distances between every pair 
of vertices in a directed network with n vertices and m arcs. 

Definitions: 

Suppose $G = (V, E)$ is a directed network with vertex set $V$ and arc set $E$, and let $c_{ij}$
be the cost of arc $(i,j) \in E$.

The $n \times n$ arc length matrix $U = (u_{ij})$ is defined as follows:

$$u_{ij} = \begin{cases} 0 & \text{if } i = j \\ c_{ij} & \text{if } (i,j) \in E \\ \infty & \text{if } i \neq j \text{ and } (i,j) \notin E. \end{cases}$$

Let $d_{ij}$ be the length of a shortest path from vertex $i$ to vertex $j$, with $d_{ii} = 0$.

Define the $n \times n$ matrix $D^k = (d^k_{ij})$, where $d^k_{ij}$ is the length of a shortest path from
vertex $i$ to vertex $j$ subject to the condition that the path contains no more than $k$ arcs.

Define minsum matrix multiplication $C = A \otimes B$ by $c_{ij} = \min_{1 \le p \le n} \{ a_{ip} + b_{pj} \}$.
Also, define $A^{\otimes k} = A \otimes A \otimes \cdots \otimes A$ ($k$ times).

In the directed path $[i_0, i_1, \ldots, i_r]$ from $i_0$ to $i_r$, the vertices $i_1, i_2, \ldots, i_{r-1}$ are called
internal vertices. Let $d^k[i,j]$ be the length of a shortest path from vertex $i$ to vertex $j$
subject to the condition that this path uses only $1, 2, \ldots, k-1$ as internal vertices.
The $n \times n$ matrix $D^{[k]}$ contains the entries $d^k[i,j]$.

Facts: 

1. The length of a shortest path containing at most $k$ arcs can be expressed in terms
of shortest path lengths involving at most $k - 1$ arcs. Namely, for all vertices $i$ and $j$

• $d^1_{ij} = u_{ij}$;

• $d^k_{ij} = \min_{1 \le p \le n} \{ d^{k-1}_{ip} + u_{pj} \}$ for $2 \le k \le n - 1$;

• if there is no negative cycle, then $d^{n-1}_{ij} = d_{ij}$.

2. $D^k = U^{\otimes k}$ for all $1 \le k \le n - 1$.

3. For any pair of vertices $i$ and $j$, the following conditions hold:

• $d^1[i,j] = u_{ij}$;

• $d^{k+1}[i,j] = \min \{ d^k[i,j],\ d^k[i,k] + d^k[k,j] \}$, $1 \le k \le n$;

• if there is no negative cycle then $d^{n+1}[i,j] = d_{ij}$.

4. The all-pairs shortest path problem can be solved by applying n times either Algo- 
rithm 1 or Algorithm 2 of §10.3.2, considering each vertex once as a source. 

5. Specialized algorithms are available to solve the all-pairs shortest path problem: the
matrix multiplication algorithm (Fact 6) and the Floyd-Warshall algorithm (Fact 8).

6. Matrix multiplication algorithm: This algorithm (Algorithm 3), based on Facts 1 
and 2, computes the shortest path distances between all vertex pairs by multiplying two 
matrices repeatedly, using minsum matrix multiplication. 

7. If there is no negative cycle, then Algorithm 3 finds all shortest path distances using
$O(\log n)$ matrix multiplications, each of which takes $O(n^3)$ time. Hence this algorithm
runs in $O(n^3 \log n)$ time and requires $O(n^2)$ space.
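
The $O(\log n)$ count of multiplications comes from repeated squaring: $U^{\otimes 2}, U^{\otimes 4}, \ldots$ until the exponent reaches at least $n - 1$. The Python sketch below illustrates this under the assumption that there is no negative cycle, so that overshooting $n - 1$ leaves the matrix unchanged. The function names and the list-of-lists matrix format are assumptions; the sample matrix is the arc length matrix of the network used in the Examples.

```python
from math import inf


def minsum_product(a, b):
    """Minsum product: c[i][j] = min over p of a[i][p] + b[p][j]."""
    n = len(a)
    return [[min(a[i][p] + b[p][j] for p in range(n)) for j in range(n)]
            for i in range(n)]


def all_pairs_by_squaring(u):
    """All shortest path distances from the arc length matrix u.

    Assumes u has a zero diagonal and the network has no negative cycle.
    """
    n = len(u)
    d = [row[:] for row in u]
    arcs = 1                      # d holds distances using <= arcs arcs
    while arcs < n - 1:
        d = minsum_product(d, d)  # doubles the number of arcs allowed
        arcs *= 2
    return d


# Arc length matrix of the five-vertex network in the Examples.
U = [[0, 4, 5, inf, inf],
     [inf, 0, 6, 3, 10],
     [inf, inf, 0, 4, inf],
     [inf, inf, 3, 0, 6],
     [inf, inf, inf, 4, 0]]
D = all_pairs_by_squaring(U)
```

For $n = 5$ the loop performs two squarings and returns $U^{\otimes 4}$.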
Algorithm 3: Matrix multiplication algorithm.

input: directed network G on n vertices
output: shortest distance matrix D = (d_ij)

form the n × n arc length matrix U
compute D := U^{⊗(n−1)}


Algorithm 4: Floyd-Warshall algorithm.

input: directed network G on n vertices
output: shortest distance matrix D = (d[i,j])

for all (i,j) ∈ V × V   d[i,j] := ∞
for all i ∈ V   d[i,i] := 0
for all (i,j) ∈ E   d[i,j] := c_ij
for k := 1 to n
    for (i,j) ∈ V × V
        if d[i,j] > d[i,k] + d[k,j] then d[i,j] := d[i,k] + d[k,j]


8. Floyd-Warshall algorithm: This approach (Algorithm 4) to calculating all-pairs
shortest path distances in a directed network $G$ is based on computing conditional
shortest path lengths $d^k[i,j]$.

9. If there is no negative cycle, the Floyd-Warshall algorithm correctly computes the 
matrix of shortest path distances. A single n x n array D is used to implement the 
algorithm. 

10. Algorithm 4 can be used to detect (and identify) negative cycles by monitoring
whenever $d[i,i] < 0$ occurs for some vertex $i$.

11. Algorithm 4 runs in $O(n^3)$ time and requires $O(n^2)$ space.
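
A direct transcription of Algorithm 4 into Python, with the diagonal check of Fact 10 added after each value of $k$, might look as follows. This is a sketch; the function name `floyd_warshall` and the arc-list input are assumptions. Row and column 0 of the array are unused so that vertices keep the numbering 1, ..., n of the text.

```python
from math import inf


def floyd_warshall(n, arcs):
    """Floyd-Warshall on vertices 1..n; arcs is a list of (i, j, cost).

    Returns (d, negative_cycle), where d is an (n+1) x (n+1) array whose
    row 0 and column 0 are unused padding.
    """
    d = [[inf] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][i] = 0
    for i, j, c in arcs:
        d[i][j] = min(d[i][j], c)      # keep the cheapest parallel arc
    for k in range(1, n + 1):
        for i in range(1, n + 1):
            if d[i][k] == inf:
                continue               # no path from i through k
            for j in range(1, n + 1):
                if d[i][j] > d[i][k] + d[k][j]:
                    d[i][j] = d[i][k] + d[k][j]
        # Fact 10: a negative diagonal entry signals a negative cycle.
        if any(d[i][i] < 0 for i in range(1, n + 1)):
            return d, True
    return d, False


# Arcs of the five-vertex network used in the Examples.
arcs = [(1, 2, 4), (1, 3, 5), (2, 3, 6), (2, 4, 3), (2, 5, 10),
        (3, 4, 4), (4, 3, 3), (4, 5, 6), (5, 4, 4)]
d, negative = floyd_warshall(5, arcs)
```

Only one array is kept, as noted in Fact 9, so the space used is $O(n^2)$.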

12. If the underlying network is dense, that is, $m = \Omega(n^2)$, then the $O(n^3)$ time bound
for Algorithm 4 is as good as any other discussed in §10.3.2 or §10.3.3.

13. Algorithm 4 was first discovered by B. Roy in 1959 in the context of determining
the transitive closure of a graph; this same algorithm was independently discovered by
S. Warshall in 1962. The method was generalized to computing all shortest paths by
R. W. Floyd, also in 1962.

14. Computer codes (in C, Pascal, and Fortran) that implement the Floyd-Warshall 
algorithm can be found at the site: 

http://orly1.snu.ac.kr/software/

Examples: 

1. The matrix multiplication algorithm is applied to the directed network in the fol- 
lowing figure. 
By Facts 1 and 2, the matrix $D^4$ is the matrix of shortest path distances.

2. Algorithm 4 is illustrated with the network of the figure of Example 1.

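$$D^{[1]} = \begin{pmatrix} 0 & 4 & 5 & \infty & \infty \\ \infty & 0 & 6 & 3 & 10 \\ \infty & \infty & 0 & 4 & \infty \\ \infty & \infty & 3 & 0 & 6 \\ \infty & \infty & \infty & 4 & 0 \end{pmatrix}, \qquad D^{[3]} = \begin{pmatrix} 0 & 4 & 5 & 7 & 14 \\ \infty & 0 & 6 & 3 & 10 \\ \infty & \infty & 0 & 4 & \infty \\ \infty & \infty & 3 & 0 & 6 \\ \infty & \infty & \infty & 4 & 0 \end{pmatrix}$$

$$D^{[5]} = \begin{pmatrix} 0 & 4 & 5 & 7 & 13 \\ \infty & 0 & 6 & 3 & 9 \\ \infty & \infty & 0 & 4 & 10 \\ \infty & \infty & 3 & 0 & 6 \\ \infty & \infty & 7 & 4 & 0 \end{pmatrix}$$

It can be verified that $D^{[2]} = D^{[1]}$, $D^{[4]} = D^{[3]}$, and $D^{[6]} = D^{[5]}$. Consequently, the
matrix $D^{[5]}$ above gives all shortest path distances.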


10.3.4 PARALLEL ALGORITHMS 

Parallel implementations of certain shortest path algorithms are described
here relative to an EREW (exclusive-read, exclusive-write) PRAM (parallel random-
access machine). For details of EREW PRAM, see §16.2.

Facts: 

1. Label-correcting algorithm: The parallel implementation of the label-correcting
algorithm (§10.3.2) associates a processor with each arc and with each vertex of the
network. This algorithm maintains a distance label for each vertex, appropriately
initialized. Suppose the distance labels are $d(i)$ at the beginning of an iteration. During
the iteration, the processor attached to each arc $(i,j)$ computes a temporary label
$d'(i,j) = d(i) + c_{ij}$ in $O(1)$ time. Then the processor associated with vertex $j$ examines
incoming arcs at vertex $j$ and sets $d(j) := \min\{ d'(i,j) \mid 1 \le i \le n \}$.

2. Using a parallel prefix operation, the distance labels can be updated in $O(\log n)$
time. The label-correcting algorithm performs $O(n)$ iterations and so its running time
is $O(n \log n)$ using $O(m)$ processors.

3. Matrix multiplication algorithm: The matrix multiplication algorithm of §10.3.3
solves the all-pairs shortest path problem by performing $O(\log n)$ matrix multiplications.

4. Unlike a sequential computer, where matrix multiplication takes $O(n^3)$ time, a
parallel computer can perform matrix multiplication in $O(\log n)$ time using $O(n^3)$
processors [Le92]. Consequently, this all-pairs shortest path algorithm runs in $O(\log^2 n)$ time
using $O(n^3)$ processors.
10.3.5 APPLICATIONS

Shortest path problems arise in a variety of applications, both as stand-alone models 
and as subproblems in more complex problem settings. Shortest path problems also 
arise in surprising ways that on the surface might not appear to involve networks at all. 
This subsection presents several models based on determining shortest paths. 

Applications: 

1. Distribution: Material needs to be shipped by truck from a central warehouse 
to various retailers at minimum cost. The underlying network is an undirected road 
network, with edges representing the roads joining various cities (vertices). The cost of 
an edge is the per unit shipping cost. Solving the single-source shortest path problem 
provides a least-cost shipping pattern for the material. 

2. Telephone routing: A call is to be routed from a specified origin to a specified 
destination. Here the underlying network is the telephone system, with vertices repre- 
senting individual users (or switching centers). Since a direct connection between the 
origin vertex s and the destination vertex t may not be available, one practice is to route 
the call along a path having the minimum number of arcs (i.e., involving the smallest 
number of switching centers). This means finding a shortest path with unit lengths on 
all arcs. Alternatively, each arc can be provided with a measure of delay, and routing 
can take place along a timewise shortest path from s to t. 

3. Salesperson routing: A salesperson is to travel by air from city A to city B. The
commission obtained by visiting each city along the way can be estimated. An optimal
itinerary can be found by solving a shortest path problem on the underlying airline
network, represented as a directed network of nonstop routes (arcs) connecting cities
(vertices). Each arc $(i,j)$ is given the net cost $c_{ij} = f_{ij} - r_{ij}$, where $f_{ij}$ is the cost of
the flight from city $i$ to city $j$ and $r_{ij}$ is the commission obtained by visiting city $j$. A
shortest path from A to B identifies an optimal itinerary.

4. Investment strategy: An investor has a fixed amount to invest at the beginning of 
the year. A variety of different financial opportunities are available for investing during 
the year, with each such opportunity assumed to be available only at the start of each 
month. Construct the directed network having a vertex for each month as well as a 
final vertex t = 13. The arc (i,j) corresponds to an investment opportunity beginning 
in month i and maturing at the start of month j, with its weight Cij being the negative 
of the profit earned for the duration of the investment. An optimal investment strategy 
is identified by a shortest path from vertex 1 to vertex t. 

5. Equipment replacement : A job shop must periodically replace its capital equipment 
because of machine wear. As the machine ages, it breaks down more frequently and so 
becomes more expensive to operate. Also, as a machine ages its salvage value decreases. 
Let $c_{ij}$ denote the cost of buying a particularly important machine at the beginning of
period $i$, plus the cost of operating the machine over the periods $i, i+1, \ldots, j-1$, minus
the salvage value of the machine at the beginning of period $j$. The problem is to design a
replacement plan that minimizes the cost of buying, selling, and operating the machine 
over a planning horizon of n years, assuming that the job shop must have exactly one 
machine in service at all times. 

This problem can be formulated as a shortest path problem on a network $G$ with
vertices $i = 1, 2, \ldots, n + 1$; $G$ contains an arc $(i,j)$ with cost $c_{ij}$ for all $i < j$. There is a
one-to-one correspondence between directed paths in $G$ from vertex 1 to vertex $n + 1$ and
equipment replacement plans. The following figure gives a sample network with $n = 5$.
The path [1,3,6] corresponds to buying the equipment at the beginning of periods 1
and 3. A shortest path from vertex 1 to vertex $n + 1$ identifies an optimal replacement
plan.
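
Because every arc $(i,j)$ of this network has $i < j$, the vertices are already in topological order and the shortest path can be found in one forward pass. The Python sketch below illustrates this. The function names and all cost figures (purchase price, operating costs, salvage values) are hypothetical and serve only to make the example concrete.

```python
from math import inf

# Hypothetical data: purchase price, operating cost in each year of a
# machine's life, and salvage value after k years of service.
PURCHASE = 12
OPERATING = [2, 4, 7, 11, 16]
SALVAGE = {1: 6, 2: 4, 3: 2, 4: 1, 5: 0}


def example_cost(i, j):
    """c_ij: buy at start of period i, operate through j-1, sell at j."""
    k = j - i
    return PURCHASE + sum(OPERATING[:k]) - SALVAGE[k]


def replacement_plan(n, cost):
    """Shortest path from vertex 1 to vertex n+1 with arcs (i, j), i < j.

    Returns (total_cost, purchase_periods).
    """
    d = [inf] * (n + 2)
    pred = [0] * (n + 2)
    d[1] = 0
    for j in range(2, n + 2):          # vertices in topological order
        for i in range(1, j):
            if d[i] + cost(i, j) < d[j]:
                d[j] = d[i] + cost(i, j)
                pred[j] = i
    path = [n + 1]                     # walk predecessors back to vertex 1
    while path[-1] != 1:
        path.append(pred[path[-1]])
    path.reverse()
    return d[n + 1], path[:-1]         # drop the final vertex n+1


total, purchases = replacement_plan(5, example_cost)
```

With the hypothetical figures above, the optimal plans keep each machine for one or two periods; several plans tie, and the sketch reports the first one it finds.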



6. Paragraph problem: The document processing program TeX uses an optimization
procedure to decompose a paragraph into several lines so that when lines are left- and
right-justified, the appearance of the paragraph will be the most attractive. Suppose
that a paragraph consists of words $i = 1, 2, \ldots, n$. Let $c_{ij}$ denote the attractiveness of
a line if it begins with the word $i$ and ends with the word $j - 1$. The program TeX
uses formulas to compute the value of each $c_{ij}$. Given the $c_{ij}$, the decision problem is
to decompose the paragraph into several lines of text in order to maximize the total
attractiveness (of all lines). This problem can be formulated as a shortest path problem
in a manner similar to Application 5.

7. Tramp steamer problem: A ship travels from port to port carrying cargo and
passengers. A voyage of the steamer from port $i$ to port $j$ earns $p_{ij}$ units of profit and
requires $\tau_{ij} \ge 0$ units of time. Here it is assumed that $\sum_{(i,j) \in W} \tau_{ij} > 0$ for every directed
cycle $W$ in $G$. The captain of the ship would like to know whether there exists a tour
(directed cycle) $W$ for which the daily profit is greater than a specified threshold $\mu_0$; that
is, $\sum_{(i,j) \in W} p_{ij} / \sum_{(i,j) \in W} \tau_{ij} > \mu_0$. By writing this inequality as $\sum_{(i,j) \in W} (\mu_0 \tau_{ij} - p_{ij}) < 0$,
it is seen that there is a tour $W$ with mean daily profit exceeding $\mu_0$ if and only if $G$,
with arc costs $\mu_0 \tau_{ij} - p_{ij}$, contains a negative cost directed cycle $W$. The shortest path
label-correcting algorithm can be used to detect the presence (or absence) of negative
cycles (see §10.3.2, Fact 7).

8. System of difference constraints: In some linear programming applications (§15.1)
with constraints of the form $Ax \le b$, the $m \times n$ constraint matrix $A$ contains one $+1$
and one $-1$ in each row, with all other entries being zero. Suppose that the $k$th row
has a $+1$ entry in column $j_k$ and a $-1$ entry in column $i_k$; entries in the vector $b$ have
arbitrary signs. This linear program defines the following set of $m$ difference constraints
in $n$ variables $x = (x(1), x(2), \ldots, x(n))$: $x(j_k) - x(i_k) \le b(k)$ for each $k = 1, 2, \ldots, m$.
The problem is to determine whether this system of difference constraints has a feasible
solution, and if so, to obtain one.

Associate a graph $G$ with this system of difference constraints; $G$ has $n$ vertices
corresponding to the $n$ variables, and the arc $(i_k, j_k)$ of length $b(k)$ results from the
constraint $x(j_k) - x(i_k) \le b(k)$. These constraints are identical with the optimality
conditions for the shortest path problem in $G$, and they can be satisfied if and only if $G$
contains no negative cycle. In this case the shortest path distances give a solution $x$
satisfying the constraints.
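
One way to carry this out is sketched below in Python. The sketch makes an assumption not stated in the text: to guarantee that every vertex is reachable, it behaves as if an artificial source were joined to every vertex by a zero-length arc, which amounts to starting all labels at 0. It then applies label-correcting passes over the constraints and reports infeasibility if labels are still changing after $n$ passes. The function name and the triple format `(j, i, b)` are hypothetical.

```python
def solve_difference_constraints(n, constraints):
    """Solve x[j] - x[i] <= b for variables numbered 1..n.

    constraints is a list of (j, i, b) triples. Returns a list x with
    x[0] unused, or None if the system has no feasible solution.
    """
    # Labels start at 0: the distance from an (assumed) artificial
    # source joined to every vertex by a zero-length arc.
    x = [0] * (n + 1)
    for _ in range(n):
        changed = False
        for j, i, b in constraints:
            if x[j] > x[i] + b:        # arc (i, j) of length b violated
                x[j] = x[i] + b
                changed = True
        if not changed:
            return x                   # all optimality conditions hold
    return None                        # still changing: negative cycle


# x1 - x2 <= 3, x2 - x3 <= -2, x3 - x1 <= 1
feasible = solve_difference_constraints(3, [(1, 2, 3), (2, 3, -2), (3, 1, 1)])
# x1 - x2 <= -1 and x2 - x1 <= 0 contradict each other
infeasible = solve_difference_constraints(2, [(1, 2, -1), (2, 1, 0)])
```

The returned labels are shortest path distances from the artificial source, so they are all nonpositive; adding a constant to every $x(i)$ gives another feasible solution.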

9 . Examples of Application 8 occur in telephone operator scheduling, just-in-time 
scheduling, analyzing the consistency of measurements, and the scaling of data. 

10 . Maximin paths: In a network with capacities (that is, upper bounds on the amount 
of material that can be sent on each arc), the capacity of a path is the smallest capacity 
on any of its constituent arcs. A common problem in such networks is to find a path 
from vertex s to vertex t having the maximum capacity. This represents a path along 
which the maximum amount of material can flow. Such a maximin path can be found 
efficiently by adapting Dijkstra’s shortest path algorithm. 
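
The adaptation can be sketched as follows: the label of a vertex is the best path capacity found so far, the vertex with the largest label is selected, and an arc $(i,j)$ updates the label of $j$ to the smaller of the label of $i$ and the arc capacity. The Python sketch below uses a binary heap with negated keys; the function name, input format, and sample capacities are assumptions, and capacities are taken to be positive.

```python
import heapq
from math import inf


def maximin_path(n, arcs, s, t):
    """Maximum-capacity path from s to t on vertices 1..n.

    arcs is a list of (i, j, capacity) with capacity > 0.
    Returns (capacity, path); (0, []) if t cannot be reached.
    """
    adj = {i: [] for i in range(1, n + 1)}
    for i, j, u in arcs:
        adj[i].append((j, u))

    cap = {i: 0 for i in range(1, n + 1)}   # best path capacity so far
    pred = {i: 0 for i in range(1, n + 1)}
    cap[s] = inf
    done = set()
    heap = [(-inf, s)]                      # negated keys: largest first
    while heap:
        _, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == t:
            break
        for j, u in adj[i]:
            new_cap = min(cap[i], u)        # bottleneck of path via i
            if new_cap > cap[j]:
                cap[j] = new_cap
                pred[j] = i
                heapq.heappush(heap, (-new_cap, j))
    if cap[t] == 0:
        return 0, []
    path = [t]
    while path[-1] != s:
        path.append(pred[path[-1]])
    path.reverse()
    return cap[t], path


# Hypothetical capacities.
arcs = [(1, 2, 4), (1, 3, 7), (2, 4, 6), (3, 2, 5), (3, 4, 3)]
capacity, path = maximin_path(4, arcs, 1, 4)
```

In this sample the path through vertices 1, 3, 2, 4 has capacity 5, which beats the two shorter paths.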
11. Additional applications, with reference sources, are given in the following table.

application                                 references
approximating piecewise linear functions    [AhMaOr93], [AhEtal95]
DNA sequence alignment                      [AhEtal95]
molecular confirmation                      [AhMaOr93], [AhEtal95]
robot design                                [AhMaOr93], [AhEtal95]
scaling of matrices                         [AhMaOr93], [AhEtal95]
knapsack problems                           [EvMi92], [AhMaOr93], [AhEtal95]
compact book storage                        [AhMaOr93], [AhEtal95]
personnel planning                          [AhMaOr93]
routing snow removal vehicles               [EvMi92]
production lot sizing                       [EvMi92]
transportation planning                     [EvMi92]
single-crew scheduling                      [AhMaOr93]
dynamic facility location                   [AhMaOr93]


10.4 MAXIMUM FLOWS 


The maximum flow problem involves sending the maximum amount of material from a 
specified source vertex s to another specified sink vertex t, subject to capacity restric- 
tions on the amount of material that can flow along each arc. A closely related problem 
is the minimum cut problem, which is to find a set of arcs with smallest total capacity 
whose removal separates s and t. 


10.4.1 BASIC CONCEPTS 
Definitions: 

Let $G = (V, E)$ be a directed network with vertex set $V$ and arc set $E$ (see §10.3.1).
Each arc $(i,j) \in E$ has an associated capacity $u_{ij} \ge 0$. Such a network is called a
capacitated network. Let $n = |V|$ and $m = |E|$.

Suppose $s$ is a specified source vertex and $t$ is a specified sink vertex. A (feasible)
flow is a function $x = (x_{ij})$ defined on arcs $(i,j) \in E$ satisfying:

• mass balance constraints: $\sum_{\{j \mid (i,j) \in E\}} x_{ij} = \sum_{\{j \mid (j,i) \in E\}} x_{ji}$ for all $i \in V - \{s,t\}$;

• capacity constraints: $0 \le x_{ij} \le u_{ij}$ for all $(i,j) \in E$.

The arc $(i,j)$ is saturated in flow $x$ if $x_{ij} = u_{ij}$.

The value of flow $x$ is $v = \sum_{\{j \mid (s,j) \in E\}} x_{sj}$, the total flow leaving the source vertex.

A maximum flow is a flow having maximum value. 
A cut $[S, \overline{S}]$ partitions the vertex set $V$ into two subsets $S$ and $\overline{S} = V - S$, and consists
of all arcs with one endpoint in $S$ and the other in $\overline{S}$. Arcs directed from $S$ to $\overline{S}$ are
forward arcs, and the set of forward arcs is denoted by $(S, \overline{S})$. Arcs directed from $\overline{S}$
to $S$ are backward arcs, and the set of backward arcs is denoted by $(\overline{S}, S)$.

The cut $[S, \overline{S}]$ is an s-t cut if $s \in S$ and $t \in \overline{S}$. The capacity of the s-t cut $[S, \overline{S}]$ is

$$u[S, \overline{S}] = \sum_{(i,j) \in (S, \overline{S})} u_{ij}.$$

A minimum cut is an s-t cut having minimum capacity. 

Facts: 

1. The flow Xij on arc ( i,j ) can represent the number of cars (per hour) traveling 
along a highway segment, the rate at which oil is pumped through a section of pipe in 
a distribution system, or the number of messages per unit time that can be sent along 
a data link in a communication system. 

2. The mass balance constraints ensure that for all vertices i (other than the source or 
sink), the total flow out of i equals the total flow into i. 

3. The capacity constraints ensure that the flow on an arc does not exceed its stated 
capacity. 

4. Maximum flows arise in a variety of practical problems involving the flow of goods, 
vehicles, and messages in a network. Maximum flows can also be used to study the 
connectivity of graphs, the covering of chessboards, the selection of representatives, 
winning records in tournaments, matrix rounding, and staff scheduling (see §10.4.3). 

5. For any s-t flow $x$, the flow out of $s$ equals the flow into $t$: that is,

$$\sum_{\{j \mid (s,j) \in E\}} x_{sj} = v = \sum_{\{j \mid (j,t) \in E\}} x_{jt}.$$

6. Removal of the arcs in the s-t cut $Z = [S, \overline{S}]$ from $G$ separates vertex $s$ from vertex $t$:
namely, there is no s-t path in $G - Z$.

7. Let $[S, \overline{S}]$ be any s-t cut in the network. Then the value of the flow $x$ is given by

$$v = \sum_{(i,j) \in (S, \overline{S})} x_{ij} - \sum_{(j,i) \in (\overline{S}, S)} x_{ji}.$$

That is, the net flow across each s-t cut is the same and equals $v$.

8. Weak duality theorem : The value of every s-t flow is less than or equal to the 
capacity of every s-t cut in the network. 

9. If $x$ is some s-t flow whose value equals the capacity of some s-t cut $[S, \overline{S}]$, then $x$
is a maximum flow and $[S, \overline{S}]$ is a minimum cut.

10 . Max-flow min-cut theorem : The maximum value of the flow from vertex s to 
vertex t in a capacitated network equals the minimum capacity among all s-t cuts. 
(L. R. Ford and D. R. Fulkerson, 1956.) 

11. A systematic study [FoFu62] of flows in networks was first carried out by Ford and 
Fulkerson, motivated by a simplified model of railway traffic flow. 

Examples: 

1. Part (a) of the following figure shows a flow network with $s = 1$ and $t = 6$; capacities
are indicated along each arc. The function $x$ given in part (b) satisfies the mass balance
constraints and the capacity constraints, and hence is a feasible flow. Relative to this
flow, arc (1,2) is not saturated since $x_{12} = 6 < 7 = u_{12}$; on the other hand, arc (3,5)
is saturated since $x_{35} = 3 = u_{35}$. The flow has value $v = x_{12} + x_{13} = 6 + 2 = 8$.
Here the flow into vertex 6 is $x_{46} + x_{56} = 5 + 3 = 8 = v$, as guaranteed by Fact 5.
The flow value across the s-t cut $[S, \overline{S}]$ with $S = \{1,3\}$ is $x_{12} + x_{34} + x_{35} - x_{23} =
6 + 3 + 3 - 4 = 8$. Similarly, the flow value across the s-t cut $[S, \overline{S}]$ with $S = \{1,2,3,5\}$
is $x_{24} + x_{34} + x_{56} - x_{45} = 2 + 3 + 3 - 0 = 8$. (See Fact 7.) This flow is not, however, a
maximum flow.

2. In Example 1, the s-t cut $[S, \overline{S}]$ with $S = \{1,2,3,5\}$ has capacity $u_{24} + u_{34} + u_{56} =
5 + 3 + 9 = 17$. Thus the value of any flow in the network is bounded above (see Fact 8)
by 17. The s-t cut $[S, \overline{S}]$ with $S = \{1,3\}$ has capacity $u_{12} + u_{34} + u_{35} = 7 + 3 + 3 = 13$.
This cut capacity 13 provides an improved upper bound on the value of a flow. In
particular, the flow defined in part (b) has value $v = 8 < 13$.

3. The following figure shows another feasible flow $x'$ in the network of Example 1.
For $x'$, the flow value across the s-t cut $[S, \overline{S}]$ with $S = \{1,2,3\}$ is $v = x'_{24} + x'_{34} + x'_{35} =
5 + 3 + 3 = 11$, which equals the s-t cut capacity $u[S, \overline{S}] = u_{24} + u_{34} + u_{35} = 5 + 3 + 3 = 11$.
By Fact 9, $x'$ is a maximum flow and $S = \{1,2,3\}$ defines a minimum cut $[S, \overline{S}]$.


10.4.2 ALGORITHMS FOR MAXIMUM FLOWS


There are two main classes of maximum flow algorithms: augmenting path algorithms
and preflow-push algorithms. Both types of algorithms work on an auxiliary network
(called the residual network) associated with the current solution.

Definitions: 

Let G = (V,E) be a directed network with n = \V\ and m = \E\. Let s and t be the 
specified source and sink, and let U be the largest of the arc capacities Uij in G. 

Let $x = (x_{ij})$ be a function defined on the arcs $(i,j)$ of $G$. Relative to $x$, the outflow
from vertex $i$ and inflow to vertex $i$ are given, respectively, by

$$\mathrm{out}(i) = \sum_{\{j \mid (i,j) \in E\}} x_{ij}, \qquad \mathrm{in}(i) = \sum_{\{j \mid (j,i) \in E\}} x_{ji}.$$

The excess of vertex $i$ is $e(i) = \mathrm{in}(i) - \mathrm{out}(i)$.
Algorithm 1: Augmenting path algorithm.

input: directed network G, source vertex s, sink vertex t
output: maximum flow x

x := 0
while G(x) contains a directed path from s to t
    identify an augmenting path P in G(x)
    δ := min{ r_ij | (i,j) ∈ P }
    augment δ units of flow along P and update G(x)
recover an optimal flow x from the final residual network G(x)


A preflow is any $x = (x_{ij})$ satisfying

• relaxed mass balance constraints: $e(i) \ge 0$ for all $i \in V - \{s,t\}$;

• capacity constraints: $0 \le x_{ij} \le u_{ij}$ for all $(i,j) \in E$.

Vertex $i$ is active if $e(i) > 0$.

Given a flow (or a preflow) $x$, the residual capacity $r_{ij}$ of the arc $(i,j) \in E$ is the
maximum additional flow that can be sent from $i$ to $j$ using arcs $(i,j)$ and $(j,i)$.

The residual network $G(x)$ with respect to flow $x$ consists of those arcs of $G$ having
positive residual capacity.

An augmenting path is a directed path from vertex $s$ to vertex $t$ in $G(x)$.

The capacity of a directed path is the minimum arc capacity appearing on the path.

A set of distance labels with respect to a preflow (or flow) $x$ is a function $d: V \to
\{0, 1, 2, \ldots\}$ satisfying

• $d(t) = 0$;

• $d(i) \le d(j) + 1$ for every arc $(i,j)$ in the residual network $G(x)$.

An arc $(i,j)$ in the residual network $G(x)$ is admissible with respect to the distance
labels $d(\cdot)$ if $d(i) = d(j) + 1$.

Facts: 

1. The maximum flow problem on an undirected network can be converted to a
maximum flow problem on a directed network. Namely, replace every undirected edge $(i,j)$
of capacity $u_{ij}$ by two oppositely directed arcs $(i,j)$ and $(j,i)$, each with capacity $u_{ij}$.

2. The residual capacity $r_{ij} = (u_{ij} - x_{ij}) + x_{ji}$. The first term $u_{ij} - x_{ij}$ represents
the unused capacity of arc $(i,j)$; the second term $x_{ji}$ represents the amount of flow on
arc $(j,i)$ that can be canceled to increase flow from vertex $i$ to vertex $j$.

3. The capacity of an augmenting path is always positive. 

4. Augmenting path property: A flow $x$ is a maximum flow if and only if the residual
network $G(x)$ contains no augmenting path.

5. Augmenting path algorithm: A general augmenting path algorithm (Algorithm 1) 
is based on Fact 4. It identifies augmenting paths and sends flows on these paths until 
the residual network contains no such path. 

6. Integrality property: For networks with integer capacities, Algorithm 1 starts with 
the zero flow and augments by an integral flow at each iteration. Hence the maximum 
flow problem with integral capacities always has an optimal integer flow. 

7. An augmenting path in $G(x)$ can be identified by any search procedure that starts
at vertex $s$ and identifies all vertices reachable from $s$ by directed paths (§9.2.1).
Algorithm 2: Preflow-push algorithm.

input: directed network G, source vertex s, sink vertex t
output: maximum flow x

compute the shortest path lengths d(·) to vertex t
d(s) := n; x := 0; x_sj := u_sj for all arcs (s,j) ∈ E
while the network contains an active vertex
    select an active vertex i and push_relabel(i)
recover an optimal flow x from the final residual network G(x)

procedure push_relabel(i)
if the network contains an admissible arc (i,j) then
    push δ := min{ e(i), r_ij } units of flow from i to j
else d(i) := min{ d(j) + 1 | (i,j) ∈ E and r_ij > 0 }


8. Augmenting the flow along $P$ by $\delta$ decreases the residual capacities of arcs in $P$ by $\delta$
and increases the residual capacities of the reversals of arcs in $P$ by $\delta$.

9. At the last iteration of Algorithm 1, let $S$ be the set of vertices reachable from $s$.
Then $t \in \overline{S}$ and $[S, \overline{S}]$ is a minimum cut.
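
Facts 2, 7, 8, and 9 fit together in a short Python sketch of Algorithm 1. The search is breadth-first, so each augmentation uses a path with the fewest arcs; this simple matrix-based version does not attain the running times quoted for the more refined implementations. The function name, the residual matrix `r`, and the sample capacities are assumptions, and $s \ne t$ is assumed.

```python
from collections import deque


def max_flow_min_cut(n, arcs, s, t):
    """Augmenting path algorithm on vertices 1..n (s != t assumed).

    arcs is a list of (i, j, capacity). Returns (value, S), where S is
    the set of vertices reachable from s in the final residual network.
    """
    # With x = 0 the residual capacity r_ij equals u_ij (Fact 2).
    r = [[0] * (n + 1) for _ in range(n + 1)]
    for i, j, u in arcs:
        r[i][j] += u

    def search():
        """Breadth-first search from s over arcs with r_ij > 0."""
        pred = {s: 0}
        queue = deque([s])
        while queue:
            i = queue.popleft()
            for j in range(1, n + 1):
                if r[i][j] > 0 and j not in pred:
                    pred[j] = i
                    queue.append(j)
        return pred

    value = 0
    while True:
        pred = search()
        if t not in pred:
            return value, set(pred)    # Fact 9: S defines a minimum cut
        path = []
        j = t
        while j != s:                  # trace the augmenting path back
            path.append((pred[j], j))
            j = pred[j]
        delta = min(r[i][j] for i, j in path)
        for i, j in path:              # Fact 8: update residuals
            r[i][j] -= delta
            r[j][i] += delta
        value += delta


# Hypothetical capacities.
arcs = [(1, 2, 5), (1, 3, 2), (2, 4, 2), (3, 4, 3)]
value, S = max_flow_min_cut(4, arcs, 1, 4)
```

In this sample the forward arcs leaving the returned set are (1,3) and (2,4), whose capacities sum to the flow value.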

10. Upon termination of Algorithm 1, an optimal flow $x$ can be reconstructed from the
final $G(x)$ using Fact 2. Specifically, let $(i,j) \in E$. If $u_{ij} - r_{ij} > 0$ then set $x_{ij} = u_{ij} - r_{ij}$
and $x_{ji} = 0$; otherwise, set $x_{ji} = r_{ij} - u_{ij}$ and $x_{ij} = 0$.

11. Algorithm 1 was independently discovered by L. R. Ford and D. R. Fulkerson (1956) 
and by P. Elias, A. Feinstein, and C.E. Shannon (1956). 

12. The distance label d(i) is a lower bound on the length (number of arcs) of the 
shortest (directed) path from vertex i to vertex t in the residual network. 

13. If some vertex j satisfies d(j) > n, then vertex j is separated from the sink vertex 
in the residual network. 

14. Algorithm 1 runs in pseudopolynomial time $O(nmU)$ for networks with integer (or
rational) arc capacities. The algorithm may not terminate finitely for networks with
irrational capacities.

15. Two specific implementations of Algorithm 1 run in polynomial time: 

• by augmenting flow along a shortest path, the number of augmentations can be
reduced to $O(nm)$, and using very sophisticated data structures this algorithm
can be implemented to run in $O(nm \log n)$ time;

• by augmenting flow along a path with maximum residual capacity, the number
of augmentations is $O(m \log U)$ and this algorithm can be implemented to run
in $O(nm \log U)$ time.

16. Preflow-push algorithm: The preflow-push algorithm (Algorithm 2) maintains a 
preflow at every step and pushes flow on individual arcs instead of along augmenting 
paths. The basic operation is to select an active vertex and try to remove its excess by 
pushing flow to neighbors that are “closer” to the sink. 

17. The shortest path lengths calculated in Algorithm 2 represent the minimum num- 
ber of arcs in a path to vertex t and can be efficiently found by carrying out a breadth- 
first search relative to t (§9.2.1). 

18. In Algorithm 2, if the active vertex $i$ currently being examined has an admissible
arc $(i,j)$, then increasing the flow on $(i,j)$ by $\delta$ decreases $r_{ij}$ by $\delta$ and increases $r_{ji}$ by $\delta$.
Also, $e(i)$ is decreased by $\delta$ and $e(j)$ is increased by $\delta$.


19. In Algorithm 2, if the active vertex currently being examined has no admissible
arc, then after its distance label is increased, at least one admissible arc is created.

20. The preflow-push algorithm can be implemented to run in $O(n^2 m)$ time. Variations
of this algorithm with improved worst-case complexity are described in [AhOrTa89].

21. The highest-label preflow-push algorithm [GoTa86] is a specific implementation
of Algorithm 2 that always examines vertices with the largest distance label. This
$O(n^2 \sqrt{m})$ implementation is currently the fastest algorithm to solve the maximum flow
problem in practice.

22. Algorithm 2 can be implemented to run in $O(nm \log(n^2/m))$ time using a dynamic
tree data structure. This algorithm currently achieves the best strongly polynomial-time
bound to solve the maximum flow problem, but is not as efficient in practice as its more
straightforward implementation.

23. The books [AhMaOr93] and [CoLeRi90] discuss additional versions of augmenting
path and preflow-push algorithms, as well as specializations of these algorithms to unit
capacity networks, bipartite networks, and planar networks.

24. Preflow-push algorithms are more general, more powerful, and more flexible than
augmenting path algorithms for solving the maximum flow problem.

25. The best preflow-push algorithms currently outperform the best augmenting path
algorithms in theory as well as in practice.

26. Computer codes (in C, Pascal, and Fortran) for solving maximum flow and minimum
cut problems can be found at the sites:

ftp://dimacs.rutgers.edu/pub/netflow/maxflow/

ftp://ftp.zib.de/pub/Packages/mathprog/netopt-bertsekas/

http://www.neci.nj.nec.com/homepages/avg/soft/soft.html

http://orly1.snu.ac.kr/software/

ftp://ftp.zib.de/pub/Packages/mathprog/mincut/
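
The shortest augmenting path variant of Algorithm 1 (Fact 15) can be sketched in Python as
follows. This is a minimal illustration, not a tuned implementation: the function name
max_flow, the 0-based vertex labels, and the dictionary representation of residual
capacities are assumptions introduced here. The sketch finds an augmenting path with the
fewest arcs by breadth-first search, augments by the bottleneck residual capacity, and on
termination returns the flow value together with the set S of vertices still reachable
from s, which defines a minimum cut.

```python
from collections import deque


def max_flow(n, arcs, s, t):
    """Shortest augmenting path sketch of Algorithm 1.

    n    -- number of vertices, labeled 0..n-1 (assumed labeling)
    arcs -- list of (i, j, u_ij) triples with nonnegative capacities
    Returns (v, S): the maximum flow value and the source side of a minimum cut.
    """
    # r[i][j] holds the residual capacity of arc (i, j) in G(x).
    r = [dict() for _ in range(n)]
    for i, j, u in arcs:
        r[i][j] = r[i].get(j, 0) + u
        r[j].setdefault(i, 0)  # reverse arc, initially with no residual capacity

    value = 0
    while True:
        # Breadth-first search from s over arcs with positive residual capacity.
        pred = {s: None}
        queue = deque([s])
        while queue:
            i = queue.popleft()
            for j, cap in r[i].items():
                if cap > 0 and j not in pred:
                    pred[j] = i
                    queue.append(j)
        if t not in pred:
            # No augmenting path remains: the vertices reached from s form S.
            return value, set(pred)

        # Trace the path back from t and compute its bottleneck delta.
        path = []
        j = t
        while pred[j] is not None:
            path.append((pred[j], j))
            j = pred[j]
        delta = min(r[i][j] for i, j in path)

        # Augment: forward residual capacities drop, reverse ones grow.
        for i, j in path:
            r[i][j] -= delta
            r[j][i] += delta
        value += delta


# Hypothetical 4-vertex network given as (tail, head, capacity) triples.
example_arcs = [(0, 1, 2), (0, 2, 4), (1, 2, 3), (1, 3, 1), (2, 3, 5)]
flow_value, source_side = max_flow(4, example_arcs, 0, 3)
```

Because each search returns a path with the fewest arcs, this sketch corresponds to the
first implementation in Fact 15; it does not use the sophisticated data structures needed
for the $O(nm \log n)$ bound.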

Examples: 

1. Part (a) of the following figure illustrates a network G with capacities shown next 
to each arc. 
A feasible flow from vertex s = 1 to vertex t = 4 is displayed in part (b); this flow has
value $v = x_{12} + x_{13} = 3$. Every path in G from s to t contains a saturated arc: paths
[1,2,4] and [1,2,3,4] have the saturated arc (1,2), and path [1,3,4] has the saturated
arc (3,4). Consequently, no additional flow can be pushed in the “forward” direction
from s to t. Yet, the current flow x is not a maximum flow.

To find additional flow from s to t, the residual network G(x) is constructed; see
part (c) of the figure. An augmenting path in part (c) is P = [1,3,2,4] with (residual)
capacity $\delta = 1$. Adding the flow on P to that in part (b) produces the new flow x' in
part (d); notice that the flow on arc (2,3) in x has been canceled in this process. The
resulting flow x' has flow value v = 4. Since the s-t cut $[S, \bar{S}]$ with $S = \{1,2,3\}$ has
capacity $u_{24} + u_{34} = 4 = v$, the flow x' is a maximum flow and $S = \{1,2,3\}$ defines a
cut having minimum capacity.

2. The following figure illustrates three iterations of the augmenting path algorithm
(Algorithm 1).

[Figure: parts (a)-(d)]

Part (a) of the figure shows a network with capacities indicated on each arc. Here s = 1
and t = 6. Initially the flow x = 0, so the residual network is identical to the original
network with $r_{ij} = u_{ij}$ for every arc (i,j).

Suppose that the algorithm identifies path $P_1 = [1,2,4,6]$ as the augmenting path.
The algorithm augments $\delta = \min\{r_{12}, r_{24}, r_{46}\} = \min\{7,5,6\} = 5$ units of flow along $P_1$.
This augmentation changes the residual capacities only of arcs in $P_1$ (or their reverse
arcs), yielding the new residual network in part (b).

In the second iteration, suppose the algorithm identifies path $P_2 = [1,3,5,6]$ as the
next augmenting path. Then flow is increased by $\delta = \min\{8,3,9\} = 3$ units along $P_2$;
part (c) shows the residual network after the second augmentation.

A third augmentation with $\delta = 1$ occurs along path $P_3 = [1,3,4,6]$ in part (c),
giving the residual network shown in part (d).

3. The following figure illustrates three iterations of the preflow-push algorithm on the
flow network with capacities given in part (a). Here s = 1 and t = 4; in addition,
the pair (e(i), d(i)) is shown beside each vertex i. Part (b) of the figure gives G(x)
corresponding to the initial preflow with $x_{12} = 2$ and $x_{13} = 4$.

[Figure: parts (a)-(d)]

Suppose that the algorithm selects vertex 2 for the push/relabel operation. Then
arc (2,4) is the only admissible arc and the algorithm pushes $\delta = \min\{e(2), r_{24}\} =
\min\{2,1\} = 1$ unit along this arc; part (c) gives the residual network at this stage.

Suppose that the algorithm again selects vertex 2. Since no admissible arc emanates
from this vertex, the algorithm performs a relabel operation and gives vertex 2 a new
distance label: $d(2) = \min\{d(3) + 1, d(1) + 1\} = \min\{2,5\} = 2$. The new residual
network is the same as the one shown in part (c) except that d(2) = 2 instead of 1.

In the third iteration, suppose that vertex 3 is selected. Then $\delta = \min\{e(3), r_{34}\} =
\min\{4,5\} = 4$ units are pushed along the arc (3,4); part (d) gives the residual network
at the end of this iteration.
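
The push and relabel steps traced in Example 3 can be sketched in Python as follows. This
is a generic preflow-push loop under stated assumptions, not the highest-label or dynamic
tree implementation: the function name preflow_push, the 0-based vertex labels, the rule
of always examining the oldest active vertex, and the arc capacities of the sample network
are choices made here for illustration. Distance labels are initialized by a backward
breadth-first search from t, and d(s) is set to n.

```python
from collections import deque


def preflow_push(n, arcs, s, t):
    """Generic preflow-push sketch; returns the maximum flow value from s to t."""
    # r[i][j] is the residual capacity of arc (i, j).
    r = [dict() for _ in range(n)]
    for i, j, u in arcs:
        r[i][j] = r[i].get(j, 0) + u
        r[j].setdefault(i, 0)

    # Distance labels: number of arcs on a shortest path to t (backward BFS).
    d = [n] * n
    d[t] = 0
    queue = deque([t])
    while queue:
        j = queue.popleft()
        for i in r[j]:
            if i != t and d[i] == n and r[i][j] > 0:
                d[i] = d[j] + 1
                queue.append(i)
    d[s] = n

    # Initial preflow: saturate every arc leaving the source.
    e = [0] * n
    for j in list(r[s]):
        delta = r[s][j]
        if delta > 0:
            r[s][j] -= delta
            r[j][s] += delta
            e[j] += delta
            e[s] -= delta

    active = deque(i for i in range(n) if i not in (s, t) and e[i] > 0)
    while active:
        i = active[0]
        pushed = False
        for j, cap in r[i].items():
            if cap > 0 and d[i] == d[j] + 1:      # (i, j) is admissible
                delta = min(e[i], cap)
                r[i][j] -= delta
                r[j][i] += delta
                e[i] -= delta
                e[j] += delta
                if j not in (s, t) and e[j] == delta:
                    active.append(j)              # j has just become active
                pushed = True
                break
        if not pushed:
            # Relabel: smallest label that creates an admissible arc.
            d[i] = 1 + min(d[j] for j, cap in r[i].items() if cap > 0)
        if e[i] == 0:
            active.popleft()
    return e[t]


# Hypothetical network with s = 0 and t = 3; its source arcs carry 2 and 4 units.
sample_value = preflow_push(4, [(0, 1, 2), (0, 2, 4), (1, 2, 3), (1, 3, 1), (2, 3, 5)], 0, 3)
```

When the loop ends, no vertex other than s and t carries excess, so the preflow is a
flow and the excess accumulated at t is its value.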


10.4.3 APPLICATIONS 

A variety of applied problems can be modeled using maximum flows or minimum cuts. 
The max-flow min-cut theorem (§10.4.1, Fact 10) can also be used to deduce a number 
of min-max duality results in combinatorial theory. This section discusses a number of 
such applications. 

Applications: 

1. Distribution network: Oil needs to be shipped from a refinery to a storage facility 
using the pipelines of an underlying distribution network. Here the refinery corresponds 
to a particular vertex s in the distribution network, and the storage facility corresponds 
to another vertex t. The capacity of each arc is the maximum amount of oil per unit 
time that can flow along it. The maximum flow rate from the source vertex s to the 
sink vertex t is determined by the value of a maximum s-t flow. 
2. Other examples of Application 1 occur in transportation networks, electrical power
networks, and telecommunication networks.

3. System of distinct representatives: Given is a collection of sets $X_1, X_2, \ldots, X_m$
which are subsets of a given n-set X. A system of distinct representatives (§1.2.2) for
the collection is sought, if one exists.

To solve this problem, set up the bipartite network $(V_1 \cup V_2, E)$ in which there is a
vertex of $V_1$ for each set $X_i$ and a vertex of $V_2$ for each element of X. An arc (i,j) of
infinite capacity joins $i \in V_1$ to $j \in V_2$ if $j \in X_i$. Add a source vertex s joined by arcs of
unit capacity to each $i \in V_1$, and a sink vertex t with arcs of unit capacity joining each
$j \in V_2$ to t. Then a system of distinct representatives exists if and only if the maximum
flow in this constructed network has value m. In this case, those arcs with $i \in V_1$
and $j \in V_2$ having flow $x_{ij} = 1$ identify a system of distinct representatives selected
from the m sets.

4. Feasible flow problem: This problem involves finding a flow x in G = (V,E) so that
the net flow at each vertex is a specified value b(i), where $\sum_{i \in V} b(i) = 0$. That is, a
flow x on the arcs of network G is required, satisfying:

• mass balance constraints: $\sum_{\{j \mid (i,j) \in E\}} x_{ij} - \sum_{\{j \mid (j,i) \in E\}} x_{ji} = b(i)$ for all $i \in V$;

• capacity constraints: $0 \le x_{ij} \le u_{ij}$ for all $(i,j) \in E$.

This can be modeled as a maximum flow problem. Construct the augmented network
G' by adding a source vertex s and a sink vertex t to G. For each vertex i with
b(i) > 0, an arc (s,i) is added to E with capacity b(i); for each vertex i with b(i) < 0,
an arc (i,t) is added to E with capacity −b(i). Then solve a maximum flow problem
from vertex s to vertex t in G'. It can be proved that the feasible flow problem for G
has a solution if and only if the maximum flow in G' saturates all arcs emanating from
vertex s in G'.

5. Application 4 frequently arises in distribution problems. For example, a known 
amount of merchandise is available at certain ports and is required at other ports in 
known quantities. Also the maximum quantity of merchandise that can be shipped on 
a particular sea route is specified. Determining whether it is possible to satisfy all of 
the demands by using the available supplies is a feasible flow problem. 
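
The construction of Application 4 can be sketched in Python as follows. The port data, the
function name feasible_flow, and the 0-based vertex labels are hypothetical; the sketch
also assumes that the supplies sum to zero and that G has at most one arc between any pair
of vertices, so that the flow on arc (i,j) can be read off as $u_{ij} - r_{ij}$. The maximum
flow in G' is found by a simple breadth-first augmenting path search.

```python
from collections import deque


def feasible_flow(b, arcs):
    """Return {(i, j): x_ij} meeting supplies b and capacities, or None.

    b    -- list with b[i] > 0 for supply and b[i] < 0 for demand (sums to zero)
    arcs -- list of (i, j, u_ij) triples on vertices 0..len(b)-1
    """
    n = len(b)
    s, t = n, n + 1                                   # added source and sink
    augmented = list(arcs)
    augmented += [(s, i, b[i]) for i in range(n) if b[i] > 0]
    augmented += [(i, t, -b[i]) for i in range(n) if b[i] < 0]

    r = [dict() for _ in range(n + 2)]                # residual capacities
    for i, j, u in augmented:
        r[i][j] = r[i].get(j, 0) + u
        r[j].setdefault(i, 0)

    while True:
        pred = {s: None}
        queue = deque([s])
        while queue and t not in pred:
            i = queue.popleft()
            for j, cap in r[i].items():
                if cap > 0 and j not in pred:
                    pred[j] = i
                    queue.append(j)
        if t not in pred:
            break
        path = []
        j = t
        while pred[j] is not None:
            path.append((pred[j], j))
            j = pred[j]
        delta = min(r[i][j] for i, j in path)
        for i, j in path:
            r[i][j] -= delta
            r[j][i] += delta

    # Feasible exactly when every arc leaving s is saturated.
    if any(cap > 0 for cap in r[s].values()):
        return None
    return {(i, j): u - r[i][j] for i, j, u in arcs}


# Hypothetical shipping instance: ports 0 and 1 supply 5 and 3 units,
# ports 2 and 3 each require 4 units; capacities limit each sea route.
supplies = [5, 3, -4, -4]
routes = [(0, 2, 4), (0, 3, 2), (1, 2, 2), (1, 3, 3)]
shipment = feasible_flow(supplies, routes)
```

If the capacity of route (0,2) in this instance is lowered to 2, port 0 can ship at most
4 of its 5 units, some arc leaving s stays unsaturated, and the function returns None.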

6. Graph connectivity: In a directed graph G, the arc connectivity $\kappa'_{ij}$ of vertices i
and j is the minimum number of arcs whose removal from G leaves no directed path
from i to j. The arc connectivity $\kappa'(G)$ is the minimum number of arcs whose removal
from G separates some pair of vertices (see §8.4.2). The arc connectivity of a graph is
an important measure of the graph’s reliability or stability. Since
$\kappa'(G) = \min\{ \kappa'_{ij} \mid (i,j) \in V \times V,\ i \ne j \}$, the arc connectivity of a graph can be
computed by determining the arc connectivity of n(n−1) pairs of vertices. As a matter
of fact, the arc connectivity of G can be found by determining only n−1 arc connectivities.

The arc connectivity $\kappa'_{ij}$ can be found by applying the max-flow min-cut theorem
(§10.4.1) to the network obtained from G by setting the capacity of each arc (i,j) to 1. In
such a unit capacity network, the maximum i-j flow value equals the maximum number
of arc-disjoint paths from vertex i to vertex j, and the minimum i-j cut capacity equals
the minimum number of arcs required to separate vertex i and vertex j. This shows
that the maximum number of arc-disjoint paths from vertex i to vertex j equals the
minimum number of arcs whose removal disconnects all paths from vertex i to vertex j.
(This result is a variation of Menger’s theorem in §8.4.2; it was independently discovered
by Ford and Fulkerson and by Elias, Feinstein, and Shannon.) Consequently, $\kappa'_{ij}$ equals
the maximum i-j flow value in the network, and the arc connectivity $\kappa'(G)$ can be
determined by solving n−1 maximum flow problems in a unit capacity network.

7. Tournaments: Consider a round-robin tournament between n teams, assuming each
team plays against every other team c times and no game ends in a draw. It is claimed
that $\alpha_i$ for $1 \le i \le n$ is the number of victories accrued by the ith team at the end of
the tournament. Verifying whether the nonnegative integers $\alpha_1, \alpha_2, \ldots, \alpha_n$ are possible
winning records for the n teams can be modeled as a feasible flow problem.

Define a directed network G = (V,E) with vertex set $V = \{1,2,\ldots,n\}$ and arc set
$E = \{ (i,j) \in V \times V \mid i < j \}$. Let $x_{ij}$, i < j, represent the number of times team i
defeats team j. The total number of times team i defeats teams i+1, i+2, ..., n is
$\sum_{\{j \mid (i,j) \in E\}} x_{ij}$. Since the number of times team i defeats a team j < i is $c - x_{ji}$,
it follows that the total number of times that team i defeats teams 1, 2, ..., i−1 is
$(i-1)c - \sum_{\{j \mid (j,i) \in E\}} x_{ji}$. However, there are two constraints:

• the total number of wins $\alpha_i$ of team i must equal the total number of times it
defeats teams 1, 2, ..., n, giving

$\sum_{\{j \mid (i,j) \in E\}} x_{ij} - \sum_{\{j \mid (j,i) \in E\}} x_{ji} = \alpha_i - (i-1)c$ for all $i \in V$;

• a possible winning record must also satisfy

$0 \le x_{ij} \le c$ for all $(i,j) \in E$.

Consequently, $\{\alpha_i\}$ define a possible winning record if these two constraints have a
feasible solution x. Let $b(i) = \alpha_i - (i-1)c$. Since $\sum_{i \in V} \alpha_i$ and $\sum_{i \in V} (i-1)c$ are
both equal to $cn(n-1)/2$, the total number of games played, it follows that $\sum_{i \in V} b(i) = 0$.
The problem of finding a feasible solution to the two constraints is then a feasible flow
problem.

8. Matchings and covers: The max-flow min-cut theorem can also be used to prove
a min-max result concerning matchings and covers in a directed bipartite graph
$G = (V_1 \cup V_2, E)$. (See §8.1.3.) The subset $E' \subseteq E$ is a matching (§10.2.1) if no two arcs
in E' are incident with the same vertex. The subset $V' \subseteq V_1 \cup V_2$ is a vertex cover if
every arc in E is incident to at least one vertex in V'. Create the network G' from G
by adding vertices s and t, as well as arcs (s,i) with capacity 1 for all $i \in V_1$ and
arcs (j,t) with capacity 1 for all $j \in V_2$. All other arcs of G' correspond to arcs
of G and have infinite capacity. Then each matching of cardinality v defines a flow of
value v in G', and each s-t cut of capacity v induces a corresponding vertex cover with v
vertices. Application of the max-flow min-cut theorem establishes the desired result: In
a bipartite graph $G = (V_1 \cup V_2, E)$, the maximum cardinality of any matching equals
the minimum cardinality of any vertex cover of G.

9. 0-1 matrices: Suppose $A = (a_{ij})$ is a 0-1 matrix. Associate with A the directed
bipartite graph $G = (V_1 \cup V_2, E)$, where $V_1$ is the set of row indices and $V_2$ is the set
of column indices. There is an arc $(i,j) \in E$ whenever $a_{ij} = 1$. A matching in G now
corresponds to a set of “independent” 1s in the matrix A: i.e., no two of these 1s are
in the same row or the same column. Also, a vertex cover of G corresponds to a set of
rows and columns in A that collectively cover all the 1s in the matrix. Applying the
result in Application 8 shows that the maximum number of independent 1s in A equals
the minimum number of lines (rows and/or columns) needed to cover all the 1s in A.
This result is known as König’s theorem (§6.6.1).
10. Additional applications, with reference sources, are given in the following table.

application                                   references
--------------------------------------------  ----------------------
matrix rounding                               [AhMaOr93], [AhEtal95]
distributed computing                         [AhMaOr93], [AhEtal95]
network reliability                           [AhMaOr93], [AhEtal95]
open pit mining                               [AhMaOr93], [AhEtal95]
building evacuation                           [AhMaOr93]
covering sports events                        [AhEtal95]
nurse staff scheduling                        [AhMaOr93], [AhEtal95]
bus scheduling                                [AhEtal95]
machine scheduling                            [AhMaOr93], [AhEtal95]
tanker scheduling                             [AhMaOr93], [AhEtal95]
bottleneck assignment                         [FoFu62]
selecting freight-handling terminals          [AhEtal95]
site selection                                [EvMi92]
material-handling systems                     [EvMi92]
decompositions of partial orders              [FoFu62]
matrices with prescribed row/column sums      [FoFu62]
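
The min-max relation of Applications 8 and 9 can be checked on small matrices with the
following Python sketch. It is an illustration under stated assumptions: the function name
independent_ones_and_cover and the sample matrix are hypothetical, and the maximum matching
is found by repeated augmenting path searches, which is the unit capacity specialization of
the augmenting path method rather than a general maximum flow code. The cover is read off
from the vertices reachable by alternating paths from unmatched rows, mirroring the way a
minimum cut is read off from the final residual network.

```python
def independent_ones_and_cover(A):
    """Return (ones, cover_rows, cover_cols) for a 0-1 matrix A.

    ones       -- a maximum set of independent 1s, as (row, column) pairs
    cover_rows -- rows of a minimum line cover
    cover_cols -- columns of a minimum line cover
    """
    rows, cols = len(A), len(A[0])
    match_col = [-1] * cols          # match_col[j] = row matched to column j

    def augment(i, seen):
        # Search for an augmenting path that starts at row i.
        for j in range(cols):
            if A[i][j] == 1 and j not in seen:
                seen.add(j)
                if match_col[j] == -1 or augment(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(rows):
        augment(i, set())

    matched_rows = {i for i in match_col if i != -1}

    # Alternating search from unmatched rows: row -> column along any 1,
    # column -> row along the matching.
    reach_rows = {i for i in range(rows) if i not in matched_rows}
    reach_cols = set()
    stack = list(reach_rows)
    while stack:
        i = stack.pop()
        for j in range(cols):
            if A[i][j] == 1 and j not in reach_cols:
                reach_cols.add(j)
                k = match_col[j]
                if k != -1 and k not in reach_rows:
                    reach_rows.add(k)
                    stack.append(k)

    ones = [(match_col[j], j) for j in range(cols) if match_col[j] != -1]
    cover_rows = [i for i in range(rows) if i not in reach_rows]
    cover_cols = sorted(reach_cols)
    return ones, cover_rows, cover_cols


# Hypothetical 3 x 3 matrix: at most two independent 1s, covered by two lines.
sample = [[1, 1, 0],
          [1, 0, 0],
          [1, 0, 0]]
sample_ones, sample_rows, sample_cols = independent_ones_and_cover(sample)
```

For the sample matrix the sketch selects two independent 1s and a cover consisting of one
row and one column, so the two quantities in König’s theorem agree.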


10.5 MINIMUM COST FLOWS 


The minimum cost flow problem involves determining the least cost shipment of a 
commodity through a capacitated network in order to satisfy demands at certain vertices 
using supplies available at other vertices. This problem generalizes both the shortest 
path problem (§10.3) and the maximum flow problem (§10.4). 


10.5.1 BASIC CONCEPTS 
Definitions: 

Let G = (V,E) be a directed network with vertex set V and arc set E (see §10.3.1).
Each arc $(i,j) \in E$ has an associated cost $c_{ij}$ and a capacity $u_{ij} \ge 0$. Let $n = |V|$ and
$m = |E|$.

Each vertex $i \in V$ has an associated supply/demand b(i). If b(i) > 0, then vertex i is a
supply vertex; if b(i) < 0, then vertex i is a demand vertex.

A (feasible) flow is a function $x = (x_{ij})$ defined on arcs $(i,j) \in E$ satisfying:

• mass balance constraints: $\sum_{\{j \mid (i,j) \in E\}} x_{ij} - \sum_{\{j \mid (j,i) \in E\}} x_{ji} = b(i)$ for all $i \in V$,

• capacity constraints: $0 \le x_{ij} \le u_{ij}$ for all $(i,j) \in E$,

where $\sum_{i \in V} b(i) = 0$.

The cost of flow x is $\sum_{(i,j) \in E} c_{ij} x_{ij}$.

A minimum cost flow is a flow having minimum cost. 

A pseudoflow is a function x = ( Xij ) satisfying the arc capacity constraints; it may 
violate the mass balance constraints. 

The residual network G(x) corresponding to a flow (or pseudoflow) x is defined in
the following manner. Replace each arc $(i,j) \in E$ by two arcs (i,j) and (j,i). Arc (i,j)
has cost $c_{ij}$ and residual capacity $r_{ij} = u_{ij} - x_{ij}$, and arc (j,i) has cost $-c_{ij}$ and
residual capacity $r_{ji} = x_{ij}$. The residual network consists only of arcs with positive
residual capacity.

The potential of vertex i is a quantity $\pi(i)$ associated with the mass balance constraint
at vertex i. With respect to a given set of vertex potentials, the reduced cost of an
arc (i,j) in the residual network G(x) is $c^\pi_{ij} = c_{ij} - \pi(i) + \pi(j)$.

The cost of path P in G(x) is $c(P) = \sum_{(i,j) \in P} c_{ij}$; its reduced cost is
$c^\pi(P) = \sum_{(i,j) \in P} c^\pi_{ij}$.

A negative cycle is a directed cycle W in G(x) for which c(W) < 0. 


Facts: 

1. The mass balance constraints ensure that the net flow out of each vertex i is equal 
to b(i). Thus, if there is excess flow out of vertex i, then b(i) > 0 and i is a supply 
vertex. If b(i) < 0, then more flow enters i than leaves i, meaning that vertex i is a 
demand vertex. 

2. Minimum cost flows arise in practical problems involving the least cost routing of 
goods, vehicles, and messages in a network. Minimum cost flows can also be used 
in models of warehouse layout, production and inventory problems, scheduling of per- 
sonnel, automatic classification of chromosomes, and racial balancing of schools. (See 
§10.5.3.) 

3. Let $\{ \pi(i) \mid i \in V \}$ be any set of vertex potentials.

• If P is a path from i to j in G(x), then $c^\pi(P) = c(P) - \pi(i) + \pi(j)$.

• If W is a cycle in G(x), then $c^\pi(W) = c(W)$.

4. Negative cycle optimality conditions: A feasible flow x is a minimum cost flow if
and only if the residual network G(x) contains no negative cycle.

5. Reduced cost optimality conditions: A feasible flow x is a minimum cost flow if
and only if some set of vertex potentials $\pi$ satisfies $c^\pi_{ij} \ge 0$ for every arc (i,j) in G(x).

6. Complementary slackness optimality conditions: A feasible flow x is a minimum
cost flow if and only if there exist vertex potentials $\pi$ such that for every arc $(i,j) \in E$:

• if $c^\pi_{ij} > 0$, then $x_{ij} = 0$;

• if $c^\pi_{ij} < 0$, then $x_{ij} = u_{ij}$;

• if $0 < x_{ij} < u_{ij}$, then $c^\pi_{ij} = 0$.


Examples: 

1. In the flow network of part (a) of the following figure, b(i) is shown next to each
vertex i and $(c_{ij}, u_{ij})$ is shown next to each arc (i,j).

[Figure: parts (a)-(c)]

The function $x = (x_{ij})$ given in part (b) satisfies the mass balance constraints for each
vertex. For example, the flow out of vertex 2 is $x_{24} = 6$ and the flow into vertex 2 is
$x_{12} + x_{32} = 5$, so that flow out minus flow in equals 6 − 5 = 1 = b(2). Also the capacity
constraints for all arcs are satisfied: e.g., $x_{12} = 4 \le 5 = u_{12}$. Thus x is a feasible
flow, with cost 163. The residual network G(x) corresponding to the flow x is shown in
part (c). Selected arcs of G(x) are labeled with their cost and residual capacity. The
directed cycle W = [1,2,3,1] in G(x) has cost 11 − 9 − 10 = −8 and so W is a negative
cycle. By Fact 4, this flow x is not a minimum cost flow.

2. Part (a) of the following figure shows another feasible flow x' for the network in
Example 1, with cost 155.

[Figure: parts (a) and (b)]

The corresponding residual network G(x') is given in part (b), in which each arc
is labeled with its cost and its residual capacity. Using the vertex potentials
$\pi = (0, -14, -10, -22)$, the reduced cost of arc (2,1) in the residual network is
$c^\pi_{21} = -11 - (-14) + 0 = 3$; likewise $c^\pi_{32} = 9 - (-10) - 14 = 5$. The remaining reduced
costs are found to be zero, so $c^\pi_{ij} \ge 0$ for all arcs (i,j) in G(x'). By Fact 5, x' is a
minimum cost flow for the given network.

3. Alternatively, the optimality of the flow x' in part (a) of the figure of Example 2
can be verified using Fact 6. As in Example 2, let $\pi = (0, -14, -10, -22)$. Arc (3,2)
of the original network G in part (a) of the figure of Example 1 has positive reduced
cost $c^\pi_{32} = 9 - (-10) - 14 = 5$ and $x'_{32} = 0$. Arc (1,2) has $c^\pi_{12} = 11 - 0 - 14 = -3 < 0$
and $x'_{12} = u_{12}$. The remaining arcs (1,3), (2,4), and (3,4) have zero reduced cost.
Consequently, the complementary slackness optimality conditions are satisfied and the
flow x' achieves the minimum cost.
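
The checks carried out in Examples 1-3 can be repeated mechanically with the following
Python sketch. The arc data are assumptions: the figure is not available here, so the arc
costs and the two flows were inferred from the arithmetic quoted in the examples (the
reduced costs, the cycle cost −8, and the flow costs 163 and 155), the capacity
$u_{12} = 5$ is taken from Example 1, and the remaining capacities are set to a hypothetical
value of 10. The function names are likewise introduced only for this illustration.

```python
def reduced_cost(arcs, pi, i, j):
    """Reduced cost c^pi_ij = c_ij - pi(i) + pi(j) of arc (i, j) of G."""
    cost, _capacity = arcs[i, j]
    return cost - pi[i] + pi[j]


def flow_cost(arcs, x):
    """Total cost of flow x, the sum of c_ij * x_ij over all arcs."""
    return sum(cost * x[arc] for arc, (cost, _capacity) in arcs.items())


def satisfies_complementary_slackness(arcs, x, pi):
    """Test the conditions of Fact 6 for flow x and potentials pi.

    The third condition (0 < x_ij < u_ij implies zero reduced cost) follows
    from the first two, so only those two are tested.
    """
    for (i, j), (_cost, capacity) in arcs.items():
        rc = reduced_cost(arcs, pi, i, j)
        if rc > 0 and x[i, j] != 0:
            return False
        if rc < 0 and x[i, j] != capacity:
            return False
    return True


# Assumed data: arc -> (cost, capacity); only u_12 = 5 is stated in the text.
arcs = {(1, 2): (11, 5), (1, 3): (10, 10), (3, 2): (9, 10),
        (2, 4): (8, 10), (3, 4): (12, 10)}
pi = {1: 0, 2: -14, 3: -10, 4: -22}

x_example_1 = {(1, 2): 4, (1, 3): 5, (3, 2): 1, (2, 4): 6, (3, 4): 1}
x_example_2 = {(1, 2): 5, (1, 3): 4, (3, 2): 0, (2, 4): 6, (3, 4): 1}

example_1_ok = satisfies_complementary_slackness(arcs, x_example_1, pi)
example_2_ok = satisfies_complementary_slackness(arcs, x_example_2, pi)
```

With these assumed data the flow of Example 1 fails the test on arcs (1,2) and (3,2),
while the flow of Example 2 passes, in agreement with the conclusions of the examples.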
Algorithm 1: Cycle-canceling algorithm.

input: directed network G
output: minimum cost flow x

establish a feasible flow x in the network
while G(x) contains a negative cycle do
    identify a negative cycle W
    $\delta := \min\{ r_{ij} \mid (i,j) \in W \}$
    augment $\delta$ units of flow along W and update G(x)
recover an optimal flow x from the final residual network G(x)
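
Algorithm 1 can be sketched in Python as follows, under stated assumptions. The initial
feasible flow is supplied by the caller rather than computed, negative cycles are found by
a plain Bellman-Ford pass structure (a label-correcting method, but not the queue
implementation), and costs and capacities are assumed to be integers so that the loop
terminates. The function names, the 0-based vertex labels, and the sample data are
hypothetical; the sample reuses arc costs inferred from the examples of §10.5.1 with
assumed capacities.

```python
def find_negative_cycle(n, res):
    """Return indices into res forming a negative cycle, or None.

    res -- residual arcs as tuples (tail, head, cost, capacity, arc index, sign)
    """
    dist = [0] * n            # as if a zero-cost arc led to every vertex
    pred = [None] * n         # residual arc used to reach each vertex
    last = None
    for _ in range(n):
        last = None
        for a, arc in enumerate(res):
            i, j, cost = arc[0], arc[1], arc[2]
            if dist[i] + cost < dist[j]:
                dist[j] = dist[i] + cost
                pred[j] = a
                last = j
        if last is None:
            return None       # labels are stable: no negative cycle
    # A label changed in pass n, so walking back n arcs lands on a cycle.
    v = last
    for _ in range(n):
        v = res[pred[v]][0]
    cycle, w = [], v
    while True:
        a = pred[w]
        cycle.append(a)
        w = res[a][0]
        if w == v:
            return cycle


def cycle_canceling(n, arcs, x):
    """arcs: list of (i, j, c_ij, u_ij); x: feasible flow, one entry per arc."""
    x = list(x)
    while True:
        # Build the residual network G(x).
        res = []
        for k, (i, j, cost, cap) in enumerate(arcs):
            if x[k] < cap:
                res.append((i, j, cost, cap - x[k], k, +1))
            if x[k] > 0:
                res.append((j, i, -cost, x[k], k, -1))
        cycle = find_negative_cycle(n, res)
        if cycle is None:
            return x
        delta = min(res[a][3] for a in cycle)
        for a in cycle:
            x[res[a][4]] += res[a][5] * delta


# Assumed data (vertices 0..3 stand for 1..4); only capacity 5 of the first arc
# is quoted in the text, the other capacities are hypothetical.
sample_arcs = [(0, 1, 11, 5), (0, 2, 10, 10), (2, 1, 9, 10), (1, 3, 8, 10), (2, 3, 12, 10)]
improved = cycle_canceling(4, sample_arcs, [4, 5, 1, 6, 1])
```

Starting from the flow of cost 163, the sketch cancels the cycle of cost −8 once and stops
at a flow of cost 155.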


10.5.2 ALGORITHMS FOR MINIMUM COST FLOWS 

A variety of algorithms are available to solve the minimum cost flow problem. Three 
algorithms are described in this section: the cycle-canceling algorithm, the successive 
shortest path algorithm, and the network simplex algorithm. 

Definitions: 

Let G = (V,E) be a directed network with $n = |V|$ and $m = |E|$; let U denote the
largest arc capacity and C denote the largest arc cost (in absolute value) in G.

For a given pseudoflow $x = (x_{ij})$, the imbalance of vertex $i \in V$ is
$e(i) = b(i) + \sum_{\{j \mid (j,i) \in E\}} x_{ji} - \sum_{\{j \mid (i,j) \in E\}} x_{ij}$.

An excess vertex is one with a positive imbalance, and a deficit vertex is one with
a negative imbalance.

A spanning tree solution $x = (x_{ij})$ consists of a spanning tree T of G = (V,E) in
which each nontree arc (i,j) has either $x_{ij} = 0$ or $x_{ij} = u_{ij}$.

A spanning tree solution is feasible if the mass balance constraints and capacity con- 
straints are satisfied. 

Facts: 

1. Cycle-canceling algorithm: The cycle-canceling algorithm (Algorithm 1) is based 
on the negative cycle optimality conditions (§10.5.1, Fact 4). It starts with a feasible 
flow and successively augments flow along negative cycles in the residual network until 
there is no negative cycle. 

2. As shown in §10.4.3, an initial feasible flow can be found by solving a maximum 
flow problem. 

3. Integrality property: For problems with integer arc capacities and integer vertex 
supplies/demands, Algorithm 1 starts with an integer flow, at each iteration augments 
by an integral amount of flow, and thus produces an optimal flow that is integer. Thus 
any minimum cost flow problem with integer supplies, demands, and capacities always 
has an optimal solution that is integer. 

4. A negative cycle W in the residual network can be identified in $O(nm)$ time by using
a queue implementation of the label-correcting algorithm (§10.3.2, Fact 5).
Algorithm 2: Successive shortest path algorithm.

input: directed network G
output: minimum cost flow x

x := 0
e(i) := b(i) for all $i \in V$
initialize $V^+ := \{ i \mid e(i) > 0 \}$ and $V^- := \{ i \mid e(i) < 0 \}$
while $V^+ \ne \emptyset$ do
    select a vertex $k \in V^+$ and a vertex $l \in V^-$
    identify a shortest path P in G(x) from vertex k to vertex l
    $\delta := \min\{ e(k), -e(l), \min\{ r_{ij} \mid (i,j) \in P \} \}$
    augment $\delta$ units of flow along P
    update e, G(x), $V^+$, and $V^-$
recover an optimal flow x from the final residual network G(x)


5. Augmenting the flow along W by $\delta$ decreases the residual capacities of arcs in W
by $\delta$ and increases the residual capacities of the reversals of arcs in W by $\delta$.

6. Upon termination of Algorithm 1, an optimal flow x can be reconstructed from the
final G(x); see §10.4.2 Fact 10.

7. For problems with integer supplies, demands, and arc capacities, the cycle-canceling
algorithm runs in pseudopolynomial time $O(nm^2 CU)$.

8. If flow is augmented along a negative cycle W in G(x) that minimizes the ratio
$\frac{1}{|W|} \sum_{(i,j) \in W} c_{ij}$ among all directed cycles in G(x), then this implementation runs in
polynomial time [GoTa88].

9. Successive shortest path algorithm : The successive shortest path algorithm (Algo- 
rithm 2) starts with the pseudoflow x = 0. It proceeds by selecting an excess vertex k 
and a deficit vertex l, and augmenting flow along a minimum cost path from vertex k 
to vertex l in G(x). 

10. If in Algorithm 2 reduced costs $c^\pi_{ij}$ are used instead of arc costs $c_{ij}$, then Dijkstra’s
algorithm (§10.3.2, Algorithm 2) can be applied to determine a shortest path P in the
residual network.

11. Augmenting the flow along P by $\delta$ decreases the residual capacities of arcs in P
by $\delta$ and increases the residual capacities of the reversals of arcs in P by $\delta$. It also
decreases e(k) by $\delta$ and increases e(l) by $\delta$.

12. The solution maintained by the successive shortest path algorithm always satisfies 
the reduced cost optimality conditions (§10.5.1, Fact 5). The final solution is in addition 
feasible, and so is an optimal solution of the minimum cost flow problem. 

13. For problems with integer supplies, demands, and arc capacities, the successive
shortest path algorithm runs in pseudopolynomial time.

14. Several implementations of the successive shortest path algorithm run in polynomial
or even strongly polynomial time. [Or88] describes an implementation running in
$O(m \log n \,(m + n \log n))$ time, currently the fastest strongly polynomial-time algorithm
to solve the minimum cost flow problem.

15. If a minimum cost flow problem has an optimal solution, then it has an optimal 
spanning tree solution. 
Algorithm 3: Network simplex algorithm.

input: directed network G
output: minimum cost flow x

determine an initial spanning tree solution with associated tree T
let x be the flow and $\pi$ the corresponding vertex potentials
while some nontree arc violates the complementary slackness optimality conditions do
    select an entering arc (k,l) violating its optimality condition
    add arc (k,l) to T, augment the maximum possible flow in the cycle thus
        created, and determine the leaving arc (p,q)
    update the tree T, the flow x, and the vertex potentials $\pi$


16. Given a spanning tree solution x, with flows on nontree arcs (i,j) specified (at
either 0 or $u_{ij}$), the flows on the tree arcs are uniquely determined by the mass balance
constraints.

17. Given a spanning tree solution x, vertex potentials $\pi$ can be determined such that:

• $\pi(1) = 0$;

• $c^\pi_{ij} = 0$ for all tree arcs (i,j).

18. Complementary slackness optimality conditions: Suppose x is a feasible spanning
tree solution with vertex potentials determined as in Fact 17. Then x is a minimum
cost flow if:

• $c^\pi_{ij} \ge 0$ for all nontree arcs (i,j) with $x_{ij} = 0$;

• $c^\pi_{ij} \le 0$ for all nontree arcs (i,j) with $x_{ij} = u_{ij}$.

19. Network simplex algorithm: The network simplex algorithm (Algorithm 3) is a 
specialized version of the well-known linear programming simplex method (§15.1.3). It 
maintains a spanning tree solution and at each iteration transforms the current spanning 
tree solution into an improved spanning tree solution until optimality is reached. 

20. Using appropriate data structures, the network simplex algorithm can be imple- 
mented very efficiently. The network simplex algorithm is one of the fastest algorithms 
to solve the minimum cost flow problem in practice. 

21. The network simplex algorithm has an exponential worst-case time bound. [Or97]
provides the first polynomial-time implementations of the (generic) network simplex
algorithm.

22. Detailed descriptions of Algorithms 1-3, as well as several other algorithms for 
finding minimum cost flows, can be found in [AhMaOr93]. 

23. Computer codes (in C, Pascal, and Fortran) for solving the minimum cost flow 
problem can be found at the following sites: 

ftp://dimacs.rutgers.edu/pub/netflow/mincost/

http://www.zib.de/Optimization/Software/Mcf/

ftp://ftp.zib.de/pub/Packages/mathprog/netopt-bertsekas/

http://www.neci.nj.nec.com/homepages/avg/soft/soft.html

http://orly1.snu.ac.kr/software/
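
Algorithm 2 can be sketched in Python as follows. Several choices here are assumptions and
not part of the algorithm statement: shortest paths are found with a Bellman-Ford
label-correcting pass on the arc costs (instead of Dijkstra’s algorithm on reduced costs as
in Fact 10), arc costs are assumed nonnegative, the deficit vertex l is taken to be any
deficit vertex reachable from k, and the function name and 0-based labels are hypothetical.
The sample network uses arc data chosen to be consistent with the numbers quoted in
Examples 1 and 2 of this section; the figure itself is not available here.

```python
def successive_shortest_paths(b, arcs):
    """b: supplies/demands summing to zero; arcs: list of (i, j, c_ij, u_ij).

    Returns the flow as a list with one entry per arc.
    """
    n = len(b)
    x = [0] * len(arcs)
    e = list(b)                                   # imbalances e(i)
    while any(v > 0 for v in e):
        k = next(i for i in range(n) if e[i] > 0)  # an excess vertex

        # Residual network: (tail, head, cost, capacity, arc index, sign).
        res = []
        for a, (i, j, cost, cap) in enumerate(arcs):
            if x[a] < cap:
                res.append((i, j, cost, cap - x[a], a, +1))
            if x[a] > 0:
                res.append((j, i, -cost, x[a], a, -1))

        # Bellman-Ford shortest path distances from k in G(x).
        dist = [float("inf")] * n
        dist[k] = 0
        pred = [None] * n
        for _ in range(n - 1):
            changed = False
            for idx, (i, j, cost, _cap, _a, _sign) in enumerate(res):
                if dist[i] + cost < dist[j]:
                    dist[j] = dist[i] + cost
                    pred[j] = idx
                    changed = True
            if not changed:
                break

        l = next((i for i in range(n) if e[i] < 0 and dist[i] < float("inf")), None)
        if l is None:
            raise ValueError("no feasible flow exists")

        # Recover P, compute delta, and augment.
        path, v = [], l
        while v != k:
            path.append(pred[v])
            v = res[pred[v]][0]
        delta = min(e[k], -e[l], min(res[idx][3] for idx in path))
        for idx in path:
            x[res[idx][4]] += res[idx][5] * delta
        e[k] -= delta
        e[l] += delta
    return x


# Assumed data (vertices 0..3 stand for 1..4): arcs as (tail, head, cost, capacity).
sample_b = [4, 0, 0, -4]
sample_arcs = [(0, 1, 2, 4), (0, 2, 2, 2), (1, 2, 1, 2), (1, 3, 3, 3), (2, 3, 1, 5)]
sample_flow = successive_shortest_paths(sample_b, sample_arcs)
```

On the sample data the sketch performs two augmentations of 2 units each, first along
[1,3,4] and then along [1,2,3,4] in the 1-based labels of the examples.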
Examples:

1. The following figure illustrates the cycle-canceling algorithm.

[Figure: parts (a)-(d)]

Part (a) of this figure depicts the given flow network, with b(i) shown for each vertex i
and $(c_{ij}, u_{ij})$ for each arc (i,j). Part (b) shows the residual network corresponding to
the flow $x_{12} = x_{24} = 3$ and $x_{13} = x_{34} = 1$.

In the first iteration, suppose the algorithm selects the negative cycle [2,3,4,2] with
cost −1. Then $\delta = \min\{r_{23}, r_{34}, r_{42}\} = \min\{2,4,3\} = 2$ units of flow are augmented
along this cycle. Part (c) shows the modified residual network.

In the next iteration, the algorithm selects the cycle [1,3,4,2,1] with cost −2 and
augments $\delta = 1$ unit of flow. Part (d) depicts the updated residual network which
contains no negative cycle, so the algorithm terminates. From part (d), an optimal flow
pattern is deduced: $x_{12} = x_{13} = x_{23} = 2$ and $x_{34} = 4$.

2. The successive shortest path algorithm is illustrated using the flow network in
part (a) of the figure for Example 1. The initial residual network G(x) for x = 0 is
the same as that of part (a). Initially, the imbalances are e = (4,0,0,−4), so that
$V^+ = \{1\}$ and $V^- = \{4\}$, giving k = 1 and l = 4. The shortest path from vertex 1 to 4
in G(x) is [1,3,4], and the algorithm augments $\delta = 2$ units of flow along this path.

The following figure shows the residual network after this augmentation, as well as
the updated imbalance at each vertex.

[Figure: parts (a) and (b)]

The sets $V^+$ and $V^-$ do not change, so again k = 1 and l = 4. The shortest path from
vertex 1 to vertex 4 is now [1,2,3,4], and the algorithm augments $\delta = 2$ units of flow
along this path.

Part (b) of that figure shows the resulting residual network. Now $V^+ = V^- = \emptyset$
and the algorithm terminates.

3. The following figure illustrates the network simplex algorithm.

[Figure: parts (a)-(d)]

Part (a) of this figure depicts the given flow network, with b(i) shown for each vertex i
and $(c_{ij}, u_{ij})$ for each arc (i,j).

A feasible spanning tree solution is shown in part (b) of the figure; each nontree
arc (dashed line) has flow at either its lower or upper bound. The unique flows on the
tree arcs (solid lines) are determined by the mass balance constraints. A set of vertex
potentials (obtained using Fact 17) is also shown in part (b). Relative to these
potentials $\pi$, the reduced costs for the nontree arcs are given by $c^\pi_{23} = 2 - (-3) - 2 = 3$,
$c^\pi_{35} = 4 - (-2) - 5 = 1$, $c^\pi_{54} = 5 - (-5) - 8 = 2$, and $c^\pi_{46} = 3 - (-8) - 9 = 2$. Since
arc (3,5), with flow at its upper bound, violates the optimality conditions of Fact 18, it
is added to the current tree producing the cycle [1,2,5,3,1]. The maximum flow that
can be sent along this cycle without violating the capacity constraints is 1 unit, which
forces the flow on arc (2,5) to its upper bound. Arc (2,5) is then removed from the
current tree and arc (3,5) is added to the current tree.

Part (c) of the figure gives the new flow as well as the new vertex potentials. Since
$c^\pi_{46} = 3 - (-8) - 10 = 1$, arc (4,6) is added to the spanning tree, forming the cycle
[1,3,5,6,4,2,1]. The maximum flow that can be sent along this cycle without violating
the capacity constraints is 1 unit, which forces arc (3,5) out of the tree.

Part (d) gives the new flow as well as the new vertex potentials. Since the
complementary slackness optimality conditions are satisfied, the current flow is optimal.
10.5.3 APPLICATIONS


Minimum cost flow problems arise in many industrial settings and scientific domains, 
often in the form of distribution or routing problems. The minimum cost flow problem 
also has less transparent applications, several of which are presented in this section. 

Applications: 

1. Distribution: A common application of the minimum cost flow problem involves 
the distribution at minimum cost of a product from manufacturing plants (with known 
supplies) to warehouses (with known demands). A similar scenario applies to the dis- 
tribution of goods from warehouses to retailers as well as the flow of raw materials and 
intermediate goods through various machining stations in a production line. 

2. Routing: The routing of cars through an urban street network and the routing
of calls through a telephone system can be modeled using minimum cost flows. In 
either case, the items (cars, calls) must be sent from certain specified origins to other 
specified destinations, with capacity constraints on the total flow on each arc (road, 
communication link). This is done to minimize total (or average) delay in the system. 

3. Directed Chinese postman problem: Leaving from the post office, a mail carrier
needs to visit all houses on a postal route, delivering and collecting letters, and then 
return to the post office. The carrier would like to cover this route by traveling the 
minimum possible distance. (See also §8.4.3.) In this variation, known as the directed 
Chinese postman problem, each street is assumed to be directed, so the problem is de- 
fined on a directed network G = (V,E) whose arcs (i,j) have an associated nonnegative 
length Cij. It is desired to find a directed walk (§8.3.2) of minimum length that starts 
at some vertex (the post office), visits each arc of the network at least once, and returns 
to the starting vertex. In an optimal walk, some arcs may be traversed more than once. 
If Xij represents the number of times arc (i,j) is traversed, then this problem can be 
formulated as: 

minimize: c ij x iji 

(iJ)CE 

subject to: Y] Xu — Y] x r , = 0 for all i £ V , 

{j\U,i)eE} 

x^ > 1 for all (i, j) € E. 

This problem is a minor variant of the minimum cost flow problem where each arc has 
a lower bound of one unit of flow. From an optimal flow x* for this problem, an optimal 
tour can be constructed in the following manner. First, replace each arc (i,j) with x* 3 
copies of the arc, each carrying a unit flow. Next, decompose the resulting network into 
a set of directed cycles. Finally, connect the directed cycles to form a closed walk. 
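
The three construction steps just described (copy arcs, decompose into directed cycles, connect the cycles) amount to tracing an Euler circuit in the multigraph of arc copies. The following Python sketch illustrates this under the assumption that an optimal flow is already available as a dictionary; the function name `closed_walk_from_flow` and the small four-arc network are hypothetical and serve only as an illustration.

```python
from collections import defaultdict

def closed_walk_from_flow(x, start):
    """Return a closed walk traversing arc (i, j) exactly x[(i, j)] times.

    Assumes x is a balanced flow (inflow equals outflow at every vertex)
    whose arcs with positive flow form a connected network.
    """
    # Step 1: replace each arc (i, j) by x[(i, j)] parallel unit-flow copies.
    out_arcs = defaultdict(list)
    for (i, j), copies in x.items():
        out_arcs[i].extend([j] * copies)

    # Steps 2 and 3: follow unused arcs until stuck (this closes a cycle),
    # then back up and splice in further cycles (Hierholzer's method).
    stack, walk = [start], []
    while stack:
        v = stack[-1]
        if out_arcs[v]:
            stack.append(out_arcs[v].pop())
        else:
            walk.append(stack.pop())
    walk.reverse()
    return walk

# Hypothetical balanced flow on arcs (1,2), (2,3), (3,1), (1,3);
# arc (3,1) must be traversed twice.
x_star = {(1, 2): 1, (2, 3): 1, (3, 1): 2, (1, 3): 1}
walk = closed_walk_from_flow(x_star, start=1)
print(walk)  # [1, 3, 1, 2, 3, 1]
```

The walk returned starts and ends at the chosen vertex and uses each arc copy once, so its length equals the objective value of the flow x*.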

4. Optimal loading of a hopping airplane: A small commuter airline uses a plane with
capacity of at most p passengers on a “hopping flight”, as shown in part (a) of the
following figure. The flight visits the cities 1, 2, 3, ..., n in a fixed sequence. The plane
can pick up passengers at any city and drop them off at any other city. Let $b_{ij}$ denote the
number of passengers available at city i who want to go to city j, and let $f_{ij}$ denote the
fare per passenger from city i to city j. The airline would like to determine the number
of passengers that the plane should carry between various origins and destinations in
order to maximize the total fare per trip while never exceeding the capacity of the plane.

Part (b) of the following figure shows a minimum cost flow formulation of this
hopping plane flight problem. The network displays data only for those arcs with
nonzero cost or finite capacity. Any arc without a displayed cost has zero cost; any
arc without a displayed capacity has infinite capacity. For example, three types of
passengers are available at vertex 1: those whose destination is vertex 2, vertex 3, or
vertex 4. These three types of passengers are represented by the vertices 1-2, 1-3, and
1-4 with supplies $b_{12}$, $b_{13}$, and $b_{14}$. A passenger available at any such vertex, say 1-3,
either boards the plane at its origin vertex by flowing through the arc (1-3, 1), and thus
incurring a cost of $-f_{13}$ units, or never boards the plane, represented by flowing through
the arc (1-3, 3).

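[Figure: (a) the hopping flight through the cities in sequence; (b) the minimum cost flow network, with supplies such as $b_{14}$, $b_{24}$, $b_{34}$ at the passenger vertices.]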

5. Leveling mountainous terrain: In building road networks through hilly or moun- 
tainous terrain, civil engineers must determine how to distribute earth from high points 
to low points of the terrain to produce a leveled roadbed. To model this, construct a 
terrain graph, an undirected graph G whose vertices represent locations with a demand
for earth (low points) or locations with a supply of earth (high points). An edge of G in- 
dicates an available route for distributing the earth, and the cost of this edge is the cost 
per truckload of moving earth between the corresponding two locations. The following 
figure shows a portion of a sample terrain graph. A leveling plan for a terrain graph 
is a flow (set of truckloads) that meets the demands at vertices (levels the low points) 
by the available supplies (earth obtained from high points) at minimum trucking cost. 
This can be solved as a minimum cost flow problem on the terrain graph. 
6. Additional applications, with reference sources, are given in the following table.

application                            references
-------------------------------------  ---------------------------------
medical tomography                     [AhMaOr93], [AhEtal95]
automatic chromosome classification    [AhMaOr93], [AhEtal95]
racial balancing of schools            [AhMaOr93], [AhEtal95]
controlled matrix rounding             [AhMaOr93], [AhEtal95]
building evacuation                    [AhMaOr93], [AhEtal95]
just-in-time scheduling                [AhMaOr93], [AhEtal95]
telephone operator scheduling          [AhMaOr93], [AhEtal95]
nurse staff scheduling                 [AhMaOr93]
machine scheduling                     [AhMaOr93]
production scheduling                  [EvMi92]
equipment replacement                  [AhMaOr93]
microdata file merging                 [EvMi92], [AhEtal95]
warehouse layout                       [AhMaOr93], [AhEtal95]
facility location                      [AhMaOr93], [AhEtal95]
determining service districts          [EvMi92], [AhMaOr93], [AhEtal95]
capacity expansion                     [AhMaOr93]
vehicle fleet planning                 [AhMaOr93]


10.6 COMMUNICATION NETWORKS 


Modern communication networks consist of two main components. Using high-capacity 
links, the backbone network interconnects switching centers and gateway vertices that 
carry and direct traffic through the communication system. Local access networks 
transfer traffic between the backbone network and the end users. This section presents 
several optimization models used in the design of communication networks. 


10.6.1 CAPACITATED MINIMUM SPANNING TREE PROBLEM 

The capacitated minimum spanning tree problem arises in the design of local access 
tree networks in which end users generate and retrieve data from other sources, always 
through a specified control center (e.g., a communication switch of the backbone net- 
work). In this problem, user sites are to be interconnected at minimum cost by means 
of subtrees, which are in turn connected to the control center. The total traffic in each 
subtree is limited by a capacity constraint. 

Definitions: 

Let N = {1, 2, ..., n} be a set of terminals and let 0 denote a specified control center.
The complete undirected graph G = (V, E) has vertex set V = N ∪ {0} and contains
all possible edges between distinct vertices of V (§8.1.3).

The cost of connecting distinct vertices i, j ∈ V is $c_{ij} = c_e$, where e = (i,j) ∈ E.

The demand $w_i$ at vertex i ∈ N is the amount of traffic to be transmitted to the
control center.


Relative to a spanning tree T (§9.2.1) of G, vertex j is a root vertex if it is adjacent to
vertex 0. Vertex i is assigned to root vertex j if j is on the unique path in T joining i
to the control center. The set of all vertices assigned to j defines the subtree $T_j$ of T.
This subtree has demand $D(T_j) = \sum_{i \in T_j} w_i$.

A capacitated minimum spanning tree (CMST) is a spanning tree T of G composed
of subtrees $T_{j_1}, T_{j_2}, \ldots, T_{j_r}$ such that:

• $\sum_{e \in T} c_e$ is minimum;

• the demand in each $T_j$ is at most Q, a specified capacity.

Let i, j ∈ N and define

$y_j = 1$ if vertex j is a root vertex, and $y_j = 0$ otherwise;

$x_{ij} = 1$ if vertex i is assigned to root vertex j, and $x_{ij} = 0$ otherwise;

and for i ≠ j

$z_{ij} = 1$ if (i,j) ∈ T, and $z_{ij} = 0$ otherwise.

Given a vector z, the subgraph $G(N, E_z)$ of G induced by z has vertex set N and
edges $e \in E_z$ if $z_e > 0$. Similarly, given a vector (z,y), the subgraph induced by (z,y),
written $G(V, E_{zy})$, has vertex set V; it contains every edge of $E_z$ plus each edge from j
to 0 where $y_j > 0$.

Relative to a given vector z, C(i,j) denotes the set of all i-j cuts $[S, \bar{S}]$ in the graph
$G(N, E_z)$. (See §10.4.1.)

If S ⊆ V, then $E(S) = \{ (i,j) \in E \mid i, j \in S \}$ contains all edges between vertices of S.

For I ⊆ N, let b(I) be the minimum number of subtrees needed to pack all terminals
in I. That is, b(I) is the optimal solution to the bin packing problem (§15.3.2) with
bins of capacity Q and items of size $w_i$ for every i ∈ I.

A set S ⊆ N is a cover if $\sum_{i \in S} w_i > Q$. If also $\sum_{i \in S} w_i - w_k \leq Q$ for all k ∈ S,
then S is a minimal cover.


Facts: 


Algorithm 1: Savings heuristic.

input: undirected network G, control center 0, capacity limit Q
output: an approximate capacitated minimum spanning tree T*

U := {1, 2, ..., n}
$T_u$ := {u} for u ∈ U
while true
    for u ∈ U
        compute $f_u$, the minimum cost of connecting the control center 0 to
        component $T_u$
    S := ∅
    for u, v ∈ U (u ≠ v)
        if $D(T_u \cup T_v) \leq Q$ then
            compute $s_{uv}$, the difference between $\max\{f_u, f_v\}$ and the minimum cost
            of connecting $T_u$ to $T_v$
            if $s_{uv} > 0$ then S := S ∪ {(u, v)}
    if S = ∅ then return
    else
        choose $u_0, v_0$ such that $s_{u_0 v_0} = \max\{ s_{uv} \mid (u,v) \in S \}$
        merge $T_{u_0}$ and $T_{v_0}$, creating a new subtree indexed by $\min\{u_0, v_0\}$, and
        update U appropriately


1. The CMST problem has the following 0-1 integer linear programming formulation
(§15.1.8):

$\min \left\{ \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} c_{ij} z_{ij} + \sum_{j=1}^{n} c_{0j} y_j \right\}$

subject to:

$\sum_{j=1}^{n} x_{ij} = 1$, for all i ∈ {1, 2, ..., n}

$\sum_{i=1}^{n} w_i x_{ij} \leq Q y_j$, for all j ∈ {1, 2, ..., n}

$x_{ij} \leq y_j$, for all i, j ∈ {1, 2, ..., n}

$x_{ij} \leq \sum_{e \in K} z_e$, for all i, j ∈ {1, 2, ..., n} (i ≠ j) and for all K ∈ C(i,j)

$\sum_{e \in E(N)} z_e + \sum_{j \in N} y_j = n$

$x_{ij} \in \{0, 1\}$, for all i, j ∈ {1, 2, ..., n}

$y_j \in \{0, 1\}$, for all j ∈ {1, 2, ..., n}

$z_{ij} \in \{0, 1\}$, for all i, j ∈ {1, 2, ..., n} (i ≠ j).


2. In Fact 1:

• the first set of constraints ensures that each vertex is assigned to a root vertex;

• the second set of constraints ensures that the flow through any root vertex is no
more than the capacity Q;

• the third set of constraints ensures that vertex i can be assigned to vertex j only if j
is a root vertex;

• the fourth set of constraints ensures that if vertex i is assigned to root vertex j,
then there must be a path between i and j;

• the fifth set of constraints guarantees that $G(V, E_{zy})$ is a tree.

3. Savings heuristic: This greedy heuristic (Algorithm 1) begins with n components, 
each a single vertex, and successively merges pairs of components to reduce the total 
cost by the largest amount. 

4. The quantity $s_{uv}$ computed in Algorithm 1 represents the savings in joining subtrees
$T_u$ and $T_v$ to one another, compared to joining both to vertex 0.

5. The savings heuristic, developed by Esau and Williams [EsWi66], was one of the 
first heuristics developed for the CMST problem. It is surprisingly effective in practice. 
Algorithm 2: Optimal tour partitioning heuristic.

input: undirected network G, control center 0, capacity limit Q
output: an approximate capacitated minimum spanning tree T*

find a traveling salesman tour on the vertex set V = N ∪ {0}
let $x^{(0)} = 0, x^{(1)}, x^{(2)}, \ldots, x^{(n)}$ be an ordering of the vertices on the tour
construct the directed graph H with vertex set V and arc costs $C_{jk}$:
    if $j < k$ and $\sum_{i=j+1}^{k} w_{x^{(i)}} \leq Q$ then $C_{jk} := c_{x^{(0)}, x^{(j+1)}} + \sum_{i=j+1}^{k-1} c_{x^{(i)}, x^{(i+1)}}$
    else $C_{jk} := \infty$
find a shortest path P from $x^{(0)}$ to $x^{(n)}$ in H
use $P = [x^{(0)}, x^{(u)}, x^{(v)}, \ldots, x^{(t)}, x^{(n)}]$ to define T* via the subtrees
    $\{x^{(0)}, x^{(1)}, \ldots, x^{(u)}\}$, $\{x^{(0)}, x^{(u+1)}, x^{(u+2)}, \ldots, x^{(v)}\}$, ...,
    $\{x^{(0)}, x^{(t+1)}, x^{(t+2)}, \ldots, x^{(n)}\}$


6. Optimal tour partitioning heuristic: This heuristic (Algorithm 2), developed by 
Altinkemer and Gavish [AlGa88], is based on finding a traveling salesman tour (§10.7.1) 
in a certain derived graph. 

7. In Algorithm 2, every path from $x^{(0)}$ to $x^{(n)}$ in the directed graph H generates a
collection of subtrees satisfying the capacity restriction.

8. The performance of Algorithm 2 depends on the initial traveling salesman tour 
chosen. If an optimal traveling salesman tour is used, then the worst-case relative error 
bound of the algorithm is 4 — That is, Z /Z* < 4 — where Z is the cost of the 
heuristic solution generated and Z* is the cost of the optimal design. 
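
The partitioning step of Algorithm 2 can be sketched as a shortest path computation on the acyclic graph H. The Python sketch below makes several assumptions that should be read as such: a tour is already given (the order used here is an arbitrary illustration, not an optimal traveling salesman tour), every demand is at most Q, and the arc cost $C_{jk}$ is taken to be the cost of joining $x^{(j+1)}$ to vertex 0 plus the tour edges from $x^{(j+1)}$ through $x^{(k)}$. The function name `partition_tour` is hypothetical; the cost and demand data are those of Example 1 of this section.

```python
import math

def partition_tour(order, w, c, Q):
    """Split a tour into capacity-feasible segments of least total cost.

    order = [0, x1, ..., xn] lists the tour vertices starting at the control
    center 0. Returns (cost, segments); each segment is a path hung off 0.
    """
    n = len(order) - 1
    dist = [math.inf] * (n + 1)   # dist[k]: shortest path length from x(0) to x(k) in H
    pred = [None] * (n + 1)
    dist[0] = 0
    for k in range(1, n + 1):     # arcs of H go from j to k with j < k
        for j in range(k):
            demand = sum(w[order[i]] for i in range(j + 1, k + 1))
            if demand > Q:
                continue          # arc (j, k) has infinite cost
            arc_cost = c[0][order[j + 1]] + sum(
                c[order[i]][order[i + 1]] for i in range(j + 1, k))
            if dist[j] + arc_cost < dist[k]:
                dist[k], pred[k] = dist[j] + arc_cost, j
    # Recover the segments from the predecessor labels.
    segments, k = [], n
    while k > 0:
        j = pred[k]
        segments.append(order[j + 1:k + 1])
        k = j
    segments.reverse()
    return dist[n], segments

# Symmetric costs c[i][j] (vertex 0 is the control center) and demands w[i].
c = [[0, 8, 5, 7, 6, 6],
     [8, 0, 2, 4, 3, 4],
     [5, 2, 0, 5, 2, 3],
     [7, 4, 5, 0, 6, 5],
     [6, 3, 2, 6, 0, 4],
     [6, 4, 3, 5, 4, 0]]
w = [0, 40, 35, 55, 50, 65]
cost, segments = partition_tour([0, 2, 4, 1, 3, 5], w, c, Q=150)
print(cost, segments)  # 22 [[2, 4, 1], [3, 5]]
```

Because every arc of H points forward along the tour, a single pass in increasing order of k suffices and no general shortest path routine is needed.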

9. Exact algorithms: A number of exact algorithms are based on mathematical pro- 
gramming approaches: 

• Gavish [Ga85] develops a Lagrangian relaxation based algorithm and uses it to
solve problems with homogeneous (unit) demands;

• Araque, Hall, and Magnanti [ArHaMa90] derive valid inequalities and facets for
the CMST problem;

• Hall [Ha96] and Bienstock, Deng, and Simchi-Levi [BiDeSi94] develop valid
inequalities and facets and use them in a branch-and-cut algorithm.

10. The CMST formulation given in Fact 1 can be improved by adding the various 
inequalities listed in Facts 11-14. 

11. Knapsack inequalities: Let S be a minimal cover. For every l ∈ N, the inequality

$\sum_{i \in S} x_{il} \leq (|S| - 1) y_l$

is valid for the CMST problem.

12. Subtour elimination inequalities: For any I ⊆ N, let $\mathcal{P} = \{S_1, S_2, \ldots, S_{|I|}\}$ be a
partition of N − I into |I| subsets, some of which may be empty. For every i ∈ I, let $S_i$
be the unique subset from $\mathcal{P}$ associated with it. Then

$\sum_{e \in E(I)} z_e + \sum_{j \in I} y_j + \sum_{i \in I} \sum_{j \in S_i} x_{ij} \leq \sum_{i \in I} \sum_{j \in N} x_{ij}$

is valid for the CMST problem.

13. Generalized subtour elimination inequalities: For any I ⊆ N, the inequality

$\sum_{e \in E(I)} z_e \leq |I| - b(I)$

is valid for the CMST problem.

14. Cluster inequalities: Consider p sets of vertices $S_1, S_2, \ldots, S_p \subseteq N$ with p ≥ 3. If
the conditions:

• $S_0 = \bigcap_{i=1}^{p} S_i \neq \emptyset$ and $S_i - S_0 \neq \emptyset$ for i = 1, 2, ..., p

• $\sum_{i \in S_k \cup S_l} w_i > Q$ for all $1 \leq k < l \leq p$

are satisfied, then

$\sum_{i=1}^{p} \sum_{e \in E(S_i)} z_e \leq \sum_{i=1}^{p} |S_i| - 2p + 1$

is valid for the CMST problem.


Examples: 

1. The following figure presents data for a problem involving n = 5 terminals and a
control center 0. Part (a) gives the cost $c_{ij}$ of constructing each edge (i,j) as well as
the demand $w_i$ at each vertex i. The objective is to construct a minimum cost set of
subtrees connected to vertex 0, in which the demand generated by any subtree is at most
Q = 150. Part (b) shows a feasible capacitated spanning tree T, which contains two
root vertices (at 2 and 3). The total demand in subtree $T_2$ is $w_2 + w_4 + w_5 = 150 \leq Q$
and the total demand in subtree $T_3$ is $w_1 + w_3 = 95 \leq Q$. The spanning tree T has
total cost 21, the sum of the displayed edge costs $c_{ij}$.

        c_ij for j =
  i     2    3    4    5     c_0i    w_i
  1     2    4    3    4      8      40
  2          5    2    3      5      35
  3               6    5      7      55
  4                    4      6      50
  5                           6      65

(a)

(b) [Figure: the feasible capacitated spanning tree T with root vertices 2 and 3.]

2. Algorithm 1 is applied to the problem data of the figure of Example 1. To begin, five
subtrees are selected, each a single vertex and each joined to the control center 0. Thus,
$f_1 = 8$, $f_2 = 5$, $f_3 = 7$, $f_4 = 6$, and $f_5 = 6$. Then $s_{12} = 8 - 2 = 6$, $s_{13} = 8 - 4 = 4$, ...,
$s_{35} = 7 - 5 = 2$, and $s_{45} = 6 - 4 = 2$. The largest savings occurs for (1, 2) so $T_1$
and $T_2$ are merged, giving the new tree $T_1$ with root vertex 2 and the single edge (1, 2).
Next, the $f_u$ and $s_{uv}$ are updated. For example, $f_1 = \min\{8, 5\} = 5$, $f_3 = 7$, and
$s_{13} = 7 - \min\{c_{13}, c_{23}\} = 7 - 4 = 3$. The largest savings is found to be $s_{14}$, so $T_1$ and $T_4$
are merged, giving the new tree $T_1$ with root vertex 2 and edges (1,2) and (2,4). At
the next stage $T_3$ and $T_5$ are merged, giving the new tree $T_3$ with root vertex 5 and
the single edge (3,5). Since no further merging can take place (without violating the
capacity constraint), the savings heuristic terminates with the spanning tree shown in
the following figure, having total cost 20.
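
As a check on this example, the savings heuristic (Algorithm 1) can be sketched in Python. The function name `savings_heuristic`, the data layout, and the tie-breaking rule (the first pair attaining the largest savings is merged) are assumptions of this sketch rather than part of the algorithm as stated; the costs and demands are those of part (a) of the figure for Example 1.

```python
def savings_heuristic(c, w, Q):
    """Savings heuristic for the CMST problem (a sketch of Algorithm 1).

    c[i][j]: symmetric edge costs, vertex 0 being the control center;
    w[i]: demand of terminal i (w[0] is unused); Q: subtree capacity.
    Returns (total cost, list of edges, components keyed by smallest vertex).
    """
    n = len(w) - 1
    comps = {u: [u] for u in range(1, n + 1)}   # T_u, indexed by min vertex
    edges = []                                  # terminal-to-terminal edges
    while True:
        # f_u: cheapest connection from vertex 0 to component T_u
        f = {u: min(c[0][i] for i in T) for u, T in comps.items()}
        best = None
        for u in comps:
            for v in comps:
                if u < v and sum(w[i] for i in comps[u] + comps[v]) <= Q:
                    # cheapest edge joining T_u to T_v
                    link = min((c[i][j], i, j)
                               for i in comps[u] for j in comps[v])
                    s = max(f[u], f[v]) - link[0]
                    if s > 0 and (best is None or s > best[0]):
                        best = (s, u, v, link)
        if best is None:
            break
        _, u, v, (_, i, j) = best
        edges.append((i, j))
        comps[u] = comps[u] + comps.pop(v)      # new subtree indexed by min{u, v}
    # Join each remaining component to vertex 0 through its cheapest terminal.
    for T in comps.values():
        root = min(T, key=lambda i: c[0][i])
        edges.append((0, root))
    return sum(c[i][j] for i, j in edges), edges, comps

c = [[0, 8, 5, 7, 6, 6],
     [8, 0, 2, 4, 3, 4],
     [5, 2, 0, 5, 2, 3],
     [7, 4, 5, 0, 6, 5],
     [6, 3, 2, 6, 0, 4],
     [6, 4, 3, 5, 4, 0]]
w = [0, 40, 35, 55, 50, 65]
cost, edges, comps = savings_heuristic(c, w, Q=150)
print(cost)   # 20
print(edges)  # [(1, 2), (2, 4), (3, 5), (0, 2), (0, 5)]
```

The sketch reproduces the merges described in the example: (1,2), then (2,4), then (3,5), with root vertices 2 and 5.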
10.6.2 CAPACITATED CONCENTRATOR LOCATION PROBLEM


The capacitated concentrator location problem is frequently used to locate concentrators 
in local access networks and switching centers in the backbone network. In either case, 
concentrators of fixed capacity are to be located at a subset of possible sites. Each 
given terminal of the network is to be connected to exactly one concentrator, so that 
the concentrator’s capacity is not exceeded. A feasible configuration having minimum 
total cost is then sought. 


Definitions: 


N = {1, 2, ..., n} is a specified set of terminals, where terminal i uses $w_i$ units of
capacity. M = {1, 2, ..., m} is a given set of possible sites for concentrators, each of
fixed capacity Q.

If a concentrator is located at site j, the set-up cost is $v_j$, for j ∈ M. The connection
cost of connecting terminal i to a concentrator at site j is $c_{ij}$, for i ∈ N and j ∈ M.

The capacitated concentrator location problem (CCLP) involves finding locations
for concentrators and an assignment of terminals to concentrators such that:

• the sum of set-up and connection costs is minimum;

• the total capacity required by the terminals assigned to each concentrator is at
most Q.


Define

$y_j = 1$ if a concentrator is located at site j, and $y_j = 0$ otherwise;

and

$x_{ij} = 1$ if terminal i is connected to a concentrator at site j, and $x_{ij} = 0$ otherwise.


Facts: 

1. The CCLP has the following 0-1 integer linear programming formulation (§15.1.8):

$\min \left\{ \sum_{i=1}^{n} \sum_{j=1}^{m} c_{ij} x_{ij} + \sum_{j=1}^{m} v_j y_j \right\}$

subject to

$\sum_{j=1}^{m} x_{ij} = 1$ for all i ∈ N

$\sum_{i=1}^{n} w_i x_{ij} \leq Q y_j$ for all j ∈ M

$x_{ij} \leq y_j$ for all i ∈ N, j ∈ M

$x_{ij} \in \{0, 1\}$ for all i ∈ N, j ∈ M

$y_j \in \{0, 1\}$ for all j ∈ M.


2. In Fact 1:

• the first set of constraints ensures that each terminal is connected to exactly one
concentrator;

• the second set of constraints ensures that the concentrator’s capacity is not
exceeded;

• the third set of constraints ensures that terminal i can be assigned to a concentrator
at site j only if a concentrator is located at site j.
3. A number of algorithms have been proposed for this problem, most of which are
based on a Lagrangian relaxation approach, while some are based on polyhedral analysis
[NeWo88].

Example: 

1. The following figure shows the data for a problem with four terminals i and three
possible sites j for locating concentrators. Let the capacity of any concentrator be
Q = 30. One feasible configuration is to connect terminals 2, 3, and 4 to a concentrator
at site 2, and to connect terminal 1 to a concentrator at site 3. The concentrator at
site 2 has total capacity $w_2 + w_3 + w_4 = 29 < 30$ and the concentrator at site 3 has
total capacity $w_1 = 11 < 30$. The connection cost is $c_{22} + c_{32} + c_{42} + c_{13} = 15$ and the
set-up cost is $v_2 + v_3 = 15$, giving a total cost of 30. Another feasible configuration is to
connect terminals 2 and 4 to a concentrator at site 1, and to connect terminals 1 and 3
to a concentrator at site 3. The connection cost is 12 and the set-up cost is 17, giving
the smaller total cost 29.
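
For very small instances, the CCLP formulation can be checked by enumerating every assignment of terminals to sites. The Python sketch below does this by brute force. All data in it (demands, connection costs, set-up costs) are hypothetical and are not the data of the figure; the function name `solve_cclp` is likewise an assumption, and terminals and sites are indexed from 0.

```python
from itertools import product

def solve_cclp(w, c, v, Q):
    """Brute-force CCLP: try every assignment of terminals to sites.

    w[i]: capacity used by terminal i; c[i][j]: connection cost;
    v[j]: set-up cost of site j; Q: concentrator capacity.
    Returns (minimum total cost, assignment), where assignment[i] is the
    site serving terminal i. Only practical for tiny instances.
    """
    n, m = len(w), len(v)
    best_cost, best_assign = None, None
    for assign in product(range(m), repeat=n):
        load = [0] * m
        for i, j in enumerate(assign):
            load[j] += w[i]
        if max(load) > Q:
            continue                      # a concentrator would be overloaded
        cost = (sum(c[i][j] for i, j in enumerate(assign))
                + sum(v[j] for j in set(assign)))   # pay set-up for used sites only
        if best_cost is None or cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_cost, best_assign

# Hypothetical instance: 4 terminals, 3 candidate sites, capacity 30.
w = [11, 9, 10, 10]
c = [[4, 6, 2],
     [3, 4, 5],
     [6, 3, 4],
     [2, 5, 6]]
v = [10, 7, 9]
print(solve_cclp(w, c, v, Q=30))  # (29, (0, 0, 1, 0))
```

Enumeration examines $m^n$ assignments, which is why practical methods rely on the Lagrangian relaxation and polyhedral approaches mentioned in Fact 3.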
10.6.3 CAPACITY ASSIGNMENT PROBLEM

Fiber-optic and opto-electronic cable technologies, together with traditional copper ca- 
bles, provide many possible choices for link capacities and offer economies of scale. In 
the capacity assignment problem, a point-to-point communication demand is given be- 
tween various pairs of vertices of the (typically, backbone) network. The objective is to 
install links of several types (capacities) to transfer all communication demand without 
violating link capacities and to do so at minimum total cost. The special case involving 
two types of transmission media is discussed here. 

Definitions: 

Let G = (V, E) be an undirected graph with vertex set V and edge set E. 

Each communication demand is represented by a commodity k ∈ K, where K is the
set of commodities. Commodity k ∈ K has a required flow in G of $d_k$ units between its
origin vertex O(k) and its destination vertex D(k).

Two types of cables can be installed: low capacity cables have capacity L, and high
capacity cables have capacity H. Let $a_e$ ($b_e$) be the installation cost for each low
capacity (high capacity) cable on edge e ∈ E.

The capacity assignment problem (CAP) involves finding a mix of low and high
capacity cables for each edge of G such that:

• the total installation cost is minimum;

• all communication demands $d_k$ are met;

• the flow on each edge does not exceed its installed capacity.
Let $x_e = x_{ij}$ ($y_e = y_{ij}$) be the number of low capacity (high capacity) cables installed
on edge e = (i,j).

Let $f^k_{ij}$ be the amount of commodity k that flows from i to j on edge (i,j).


Facts: 

1. The CAP has the following mixed-integer linear programming formulation (§15.1.8):

$\min \left\{ \sum_{e \in E} (a_e x_e + b_e y_e) \right\}$

subject to

$\sum_{j \in V} f^k_{j,O(k)} - \sum_{j \in V} f^k_{O(k),j} = -d_k$ for all k ∈ K

$\sum_{j \in V} f^k_{j,D(k)} - \sum_{j \in V} f^k_{D(k),j} = d_k$ for all k ∈ K

$\sum_{j \in V} f^k_{ji} - \sum_{j \in V} f^k_{ij} = 0$ for all k ∈ K and for all i ∈ V − {O(k), D(k)}

$\sum_{k \in K} (f^k_{ij} + f^k_{ji}) \leq L x_{ij} + H y_{ij}$ for all (i,j) ∈ E

$x_e, y_e \geq 0$ integer for all e ∈ E

$f^k_{ij}, f^k_{ji} \geq 0$ for all (i,j) ∈ E and for all k ∈ K.


2. In Fact 1:

• the first three sets of constraints are the standard mass balance constraints
(§10.4.1);

• the next set of constraints enforces the capacity constraint on the total flow
through edge e = (i,j).

3. Various models and algorithms for capacity assignment problems are discussed in 
[MaMiVa95] and [BiGu95]. 


Example: 

1. In the network G of the following figure, the costs ($a_e$, $b_e$) are shown for each edge e;
here L = 2 and H = 5. There are k = 3 communication demands (commodities):
$d_1 = 12$ between vertices 1 and 4, $d_2 = 10$ between vertices 2 and 5, and $d_3 = 9$ between
vertices 1 and 5. A feasible assignment of flows and capacities to edges is displayed in
part (b) of the following figure. For instance, edge (1, 3) carries 7 units of commodity 1
and 9 units of commodity 3, for a total flow of 16 units. There are 3 high capacity cables
and 1 low capacity cable installed on this edge giving a total capacity of 3H + L = 17,
at a cost of 3 · 5 + 1 · 3 = 18. The total installation cost for this assignment is 114.
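
Once the flows are fixed, the cable mix on each edge can be chosen independently: find nonnegative integers x and y minimizing $a_e x + b_e y$ subject to $Lx + Hy$ being at least the flow on the edge. The Python sketch below does this by trying each possible number of high capacity cables. It covers only this per-edge subproblem for an integer flow, not the full CAP (where routing and capacities interact); the function name `cheapest_cable_mix` is hypothetical, and the costs $a_e = 3$, $b_e = 5$ for edge (1,3) are read off the arithmetic 3 · 5 + 1 · 3 in the example.

```python
def cheapest_cable_mix(flow, L, H, a, b):
    """Cheapest (cost, x, y) with L*x + H*y >= flow, x and y nonnegative integers.

    x counts low capacity cables (capacity L, cost a each);
    y counts high capacity cables (capacity H, cost b each).
    """
    best = None
    max_high = -(-flow // H)              # ceil(flow / H) high cables always suffice
    for y in range(max_high + 1):
        remaining = max(0, flow - H * y)
        x = -(-remaining // L)            # fewest low cables covering the rest
        cost = a * x + b * y
        if best is None or cost < best[0]:
            best = (cost, x, y)
    return best

# Edge (1,3) of the example: flow 7 + 9 = 16, L = 2, H = 5, costs (3, 5).
print(cheapest_cable_mix(16, L=2, H=5, a=3, b=5))  # (18, 1, 3)
```

For the 16 units carried by edge (1,3), the cheapest mix is 1 low capacity and 3 high capacity cables at cost 18, which agrees with the installation shown in the example.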
10.6.4 MODELS FOR SURVIVABLE NETWORKS


The introduction of fiber-optic technology has provided high capacity links and makes 
it possible to design communication networks with low-cost sparse topologies. Unfortu- 
nately, sparse networks are very vulnerable; a failure in one edge or vertex can disconnect 
many users from the rest of the network. This is the prime motivation for studying the 
design of survivable networks. 

Definitions: 

Let G = (V, E) be an undirected graph with vertex set V and edge set E.

The cost of establishing edge e ∈ E is given by $c_e$. The cost of a subnetwork H = (V, F)
of G is $\sum_{e \in F} c_e$.

Associated with every vertex s ∈ V is a corresponding number $r_s$, indicating a desired
level of redundancy.

A spanning subnetwork H = (V, F) of G is said to satisfy the edge (vertex) connectivity
requirement if for every distinct pair s, t ∈ V there are at least $r_{st} = \min\{r_s, r_t\}$
edge-disjoint (vertex-disjoint) paths between s and t in H.

Define $x_e$, for e = (i,j), to be the number of edges connecting vertex i to vertex j.

Facts: 

1. The problem of designing a minimum cost subnetwork that satisfies all edge connectivity
requirements has the following integer linear programming formulation (§15.1.8):

$\min \left\{ \sum_{e \in E} c_e x_e \right\}$

subject to

$\sum_{e \in [S, \bar{S}]} x_e \geq \max_{(i,j) \in [S, \bar{S}]} r_{ij}$, for all $S \subset V$, $S \neq \emptyset$

$x_e \geq 0$ integer, for all e ∈ E.

2. The model in Fact 1, analyzed by Goemans and Bertsimas [GoBe93], allows multiple
edges connecting the same two vertices.

3. Grötschel, Monma, and Stoer [GrMoSt92] analyze a related survivability model in
which multiple edges are forbidden. In this case, $x_e$ is restricted to be 0 or 1 in the
formulation of Fact 1.

Examples: 

1. Part (a) of the following figure shows a network G having four vertices and six edges;
the cost $c_e$ of each edge e is also displayed.

Suppose that the specified redundancies are $r_1 = 1$ and $r_2 = r_3 = r_4 = 2$. The spanning
subnetwork H shown in part (b), with cost 22, satisfies the vertex connectivity requirement.
For example, there are $\min\{r_3, r_4\} = 2$ vertex-disjoint paths joining vertices 3
and 4 in H: namely, [3,2,4] and [3,1,4]. Also, there are $\min\{r_2, r_4\} = 2$ vertex-disjoint
paths joining vertices 2 and 4: [2,4] and [2,3,1,4].

2. Part (c) of the figure for Example 1 shows another spanning subnetwork H' of
cost 20 that satisfies the stated vertex connectivity requirement. For example, there are
$\min\{r_2, r_4\} = 2$ vertex-disjoint paths joining vertices 2 and 4 in H': [2,4] and [2,3,4].
Notice that there is $\min\{r_1, r_3\} = 1$ path joining 1 and 3 in H', but not two such
vertex-disjoint paths.
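
The edge connectivity requirement can be verified computationally, since the number of edge-disjoint paths between s and t equals a maximum flow with unit capacities. The Python sketch below counts such paths with breadth-first augmentation and then tests every pair of vertices. The function names are hypothetical, and the edge set used for H, namely {1,3}, {2,3}, {2,4}, {1,4}, is inferred from the paths listed in Example 1 rather than read from the figure; the check is for the edge-disjoint version of the requirement.

```python
from collections import deque

def edge_disjoint_paths(vertices, edges, s, t):
    """Number of edge-disjoint s-t paths in an undirected graph (unit max flow)."""
    cap = {}
    adj = {v: set() for v in vertices}
    for u, v in edges:                       # one unit of capacity each way
        cap[(u, v)] = cap.get((u, v), 0) + 1
        cap[(v, u)] = cap.get((v, u), 0) + 1
        adj[u].add(v)
        adj[v].add(u)
    flow = 0
    while True:
        pred = {s: None}                     # breadth-first search in the residual graph
        queue = deque([s])
        while queue and t not in pred:
            u = queue.popleft()
            for v in adj[u]:
                if v not in pred and cap[(u, v)] > 0:
                    pred[v] = u
                    queue.append(v)
        if t not in pred:
            return flow
        v = t
        while pred[v] is not None:           # push one unit along the path found
            u = pred[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1

def satisfies_edge_connectivity(vertices, edges, r):
    """True if every pair s, t has at least min(r[s], r[t]) edge-disjoint paths."""
    return all(edge_disjoint_paths(vertices, edges, s, t) >= min(r[s], r[t])
               for s in vertices for t in vertices if s < t)

V = [1, 2, 3, 4]
H = [(1, 3), (2, 3), (2, 4), (1, 4)]         # assumed edge set of H
r = {1: 1, 2: 2, 3: 2, 4: 2}
print(satisfies_edge_connectivity(V, H, r))  # True
```

Dropping any single edge from this H leaves a path on four vertices, so some pair with requirement 2 is then joined by only one path and the check fails.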


10.7 DIFFICULT ROUTING AND ASSIGNMENT PROBLEMS 


An exact algorithm for a combinatorial optimization problem is a procedure that pro- 
duces a verifiable optimal solution to every instance of this problem. A heuristic algo- 
rithm produces a feasible (although not necessarily optimal) solution to each problem 
instance. This section discusses exact and heuristic approaches to three classical com- 
binatorial optimization problems: the traveling salesman problem, the vehicle routing 
problem, and the quadratic assignment problem. These three problems have in com- 
mon the goal of minimizing the cost of movement or travel, generally of people or of 
materials. 


10.7.1 TRAVELING SALESMAN PROBLEM 

In the traveling salesman problem, a salesman starts out from a home city and is to 
visit in some order a specified set of cities, returning home at the end. This journey 
is to be designed to incur the minimum total cost (or distance). While the traveling 
salesman problem has attracted the attention of many mathematicians and computer 
scientists, it has resisted attempts to develop an efficient solution algorithm. 

Definitions: 

Let G = (V,E) be a complete graph (§8.1.3) with V = {1, 2, ..., n} the set of vertices
and E the set of all edges joining pairs of distinct vertices.

Each edge (i,j) ∈ E has an associated cost or distance $c_{ij}$.

The distance between set S ⊆ V and vertex j ∉ S is $d_S(j) = \min\{ c_{ij} \mid i \in S \}$.

A Hamilton cycle or tour in G is a cycle passing through each vertex i ∈ V exactly
once. (See §8.4.4.)

The cost of a cycle C is $\sum_{(i,j) \in C} c_{ij}$.

The traveling salesman problem (TSP) requires finding a Hamilton cycle in G of
minimum total cost.

The costs (distances) satisfy the triangle inequality if $c_{ij} \leq c_{ik} + c_{kj}$ holds for all
distinct i, j, k ∈ V.

In a Euclidean TSP, each vertex i corresponds to a point $x_i$ in $\mathbb{R}^2$ and $c_{ij}$ is the
distance between $x_i$ and $x_j$, relative to the standard real inner product (§6.1.4).

Tour construction procedures generate an approximately optimal TSP tour from the
costs $c_{ij}$.

Tour improvement procedures attempt to find a smaller cost tour given an initial 
(often random) tour. 

Composite ( hybrid ) procedures construct a starting tour from one of the tour con- 
struction procedures and then attempt to find a smaller cost tour using one or more of 
the tour improvement procedures. 

A k-change or k-opt exchange of a given tour is obtained by deleting k edges from
the tour and adding k other edges to form a new tour. A tour is k-optimal (k-opt) if
it is not possible to improve the tour via a k-change.

Metaheuristics are general-purpose procedures (such as tabu search, simulated an- 
nealing, genetic algorithms, or neural networks) for heuristically solving difficult opti- 
mization problems; these general methodologies for searching complex solution spaces 
can be specialized to handle specific types of optimization problems. 

Facts: 

1. The TSP is possibly the most well-known network optimization problem, and it 
serves as a prototype for difficult combinatorial optimization problems in the theory of 
algorithmic complexity (§16.5.2). 

2. The first use of the term “traveling salesman problem” in a mathematical context
appears to have occurred in 1931-1932.

3. There are numerous applications of the TSP: drilling of printed circuit boards, cluster
analysis, sequencing of jobs, x-ray crystallography, archaeology, cutting stock problems,
robotics, and order-picking in a warehouse (see Examples 7-11).

4. There are (n − 1)!/2 different Hamilton cycles in the complete graph G. This means
that brute force enumeration of all Hamilton cycles to solve the TSP is not practical.
(See Example 1.)

5. The TSP is an NP-hard optimization problem (§16.5.2). This remains true even
when the distances satisfy the triangle inequality or represent Euclidean distances.

6. If certain edges (i,j) of G are missing, then $c_{ij}$ can be assigned a sufficiently large
value M, for example, M greater than the sum of the n largest edge costs. The TSP
can then be solved on the complete graph G. If the (exact) solution obtained has any
edges with cost M, then there is no Hamilton cycle in the original graph.

7. Asymmetric traveling salesman problem: Certain applications require finding a
minimum cost directed Hamilton cycle in a directed network H; here it is not required
that $c_{ij} = c_{ji}$ holds for all arcs (i,j) of H. This asymmetric (directed) TSP can be
transformed into a TSP problem on an undirected network; see [JüReRi95].

8. A seminal paper of G. B. Dantzig, D. R. Fulkerson, and S. M. Johnson (1954) solved
a 49-city TSP to optimality by adding cutting planes (§15.1.8) to a linear programming
relaxation of the problem.

9. Although ingenious exact algorithms for the TSP have been proposed by numerous
authors, most encounter problems with storage and/or running time for cases with more
than five hundred vertices.
Algorithm 1: Nearest neighbor heuristic.

input: undirected network G = (V, E)
output: a traveling salesman tour

$i_0$ := any vertex of G {the starting vertex}
W := V − {$i_0$}
P := ∅
v := $i_0$
while W ≠ ∅
    let k ∈ W be such that $c_{vk} = \min\{ c_{vj} \mid j \in W \}$
    add (v, k) to P
    W := W − {k}
    v := k
add (k, $i_0$) to the path P to produce a tour
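
A direct Python rendering of the nearest neighbor heuristic is sketched below. The 5-vertex symmetric cost matrix is hypothetical, vertices are numbered from 0, and ties are broken by smallest vertex index; these choices, like the function names, are assumptions of the sketch.

```python
def nearest_neighbor_tour(c, start=0):
    """Nearest neighbor heuristic: repeatedly move to the closest unvisited vertex."""
    unvisited = set(range(len(c))) - {start}
    tour, v = [start], start
    while unvisited:
        # closest unvisited vertex to the current end of the path
        k = min(unvisited, key=lambda j: (c[v][j], j))
        tour.append(k)
        unvisited.remove(k)
        v = k
    tour.append(start)                     # close the path into a tour
    return tour

def tour_cost(c, tour):
    """Sum of the edge costs along a closed tour given as a vertex list."""
    return sum(c[i][j] for i, j in zip(tour, tour[1:]))

# Hypothetical symmetric costs on 5 vertices.
c = [[0, 3, 4, 2, 7],
     [3, 0, 4, 6, 3],
     [4, 4, 0, 5, 8],
     [2, 6, 5, 0, 6],
     [7, 3, 8, 6, 0]]
tour = nearest_neighbor_tour(c)
print(tour, tour_cost(c, tour))  # [0, 3, 2, 1, 4, 0] 21
```

The final edge back to the start (cost 7 here) is forced, which illustrates how the greedy choices can leave an expensive closing edge.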


10. Exact approaches to the TSP are computationally intensive, especially for large
networks. Thus a large number of heuristic approaches have been developed to produce
useful, but not necessarily optimal, solutions to the TSP.

11. The wealth of TSP heuristics can be categorized into four broad classes: tour
construction procedures, tour improvement procedures, composite procedures, and
metaheuristics.

12. Nearest neighbor heuristic: This construction method (Algorithm 1) builds up a
tour by successively adding new vertices that are closest to a growing path.

13. Using appropriate data structures, Algorithm 1 can be implemented to run in $O(n^2)$
time.

14. Suppose $z_{NN}$ is the cost of a tour constructed by the nearest neighbor heuristic
and $z_{OPT}$ is the cost of an optimal TSP tour. Then there are examples for which $z_{NN}/z_{OPT}$
is $\Theta(\log n)$. This means that the cost of the tour produced by Algorithm 1 cannot be
bounded above by a constant times the cost of an optimal TSP tour.

15. Nearest insertion heuristic: This construction method (Algorithm 2) builds up a
tour from smaller cycles by successively adding a vertex that is closest to the current
cycle C. The new vertex is inserted between two successive vertices in the cycle, in the
best possible way.

16. Using appropriate data structures, Algorithm 2 can be implemented to run in $O(n^2)$
time.

17. Suppose $z_{NI}$ is the cost of a tour constructed by the nearest insertion heuristic
and that $z_{OPT}$ is the cost of an optimal TSP tour. If the values $c_{ij}$ satisfy the triangle
inequality, then $z_{NI}/z_{OPT} \leq 2$ holds for all TSP instances.

18. Clarke and Wright savings heuristic: This construction method (Algorithm 3)
builds up a tour by successively adding an edge (i,j) having the largest savings $s_{ij}$, the
benefit from directly connecting vertices i and j compared with joining each directly to
a central vertex.

19. Using appropriate data structures, Algorithm 3 can be implemented to run in
$O(n^2 \log n)$ time.
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Algorithm 2: Nearest insertion heuristic. 

input: undirected network G = (V, E) 
output: a traveling salesman tour 

i := any vertex of G {the starting vertex} 
j := subscript such that Cjy = min{ Cj r | r € V — {*} } 

S ■= {i,j} 

C := 

while S V 

let k be such that ds(k) = min{ ds(r) \ r € V — S } 

S:=S U {k} 

find an edge (u, v) G C so c uk + c kv - c uv = min{ c xk + c ky - c xy \ (x, y) G C } 
add (u,k) and (k,v) to C, and remove (u,v) from C 


Algorithm 3: Clarke and Wright savings heuristic. 

input: undirected network G 
output: a traveling salesman tour 

select any vertex (for example, 1) as the starting vertex 
compute $s_{ij} = c_{1i} + c_{1j} - c_{ij}$ for distinct $i, j \in V - \{1\}$
order the savings $s_{i_1 j_1} \ge s_{i_2 j_2} \ge \cdots \ge s_{i_t j_t}$
$P := \emptyset$
k := 0
while $|P| < n - 2$
    k := k + 1
    if $P \cup \{(i_k, j_k)\}$ is a vertex-disjoint union of paths then add $(i_k, j_k)$ to P
connect the endpoints of P to vertex 1, forming a tour


Algorithm 4: Christofides’ heuristic. 

input: undirected network G 
output: a traveling salesman tour 

T := minimum spanning tree of G (see §10.1)
let S contain all odd-degree vertices in T
find a minimum cost perfect matching M (§10.2) relative to vertices S of G and
    using the costs $c_{ij}$
obtain a closed trail C by adding M to the edges of T
remove all edges but two incident with vertices of degree greater than 2 by
    exploiting the triangle inequality, transforming C into a tour


20. Christofides’ heuristic: This construction method (Algorithm 4) builds up a tour 
from a minimum spanning tree to which are added certain other small cost edges. It is 
assumed that the costs satisfy the triangle inequality. 

21. Using appropriate data structures, Algorithm 4 can be implemented to run in $O(n^3)$
time.

Algorithm 5: General edge-exchange heuristic.

input: undirected network G, initial tour
output: a traveling salesman tour

repeat improve the tour using an allowable edge exchange
until no additional improvement can be made


22. Suppose $z_C$ is the cost of a tour constructed by Christofides’ heuristic and $z_{OPT}$ is
the cost of an optimal TSP tour. If the $c_{ij}$ satisfy the triangle inequality, then
$z_C/z_{OPT} \le \frac{3}{2}$ holds for all TSP instances.

23. The following table [JüReRi95] compares several of the most popular tour
construction procedures on a set of 30 Euclidean TSPs from the literature with known
optimal solutions. These problems range in size from 105 to 2392 vertices. Surprisingly,
the savings heuristic is the best tour construction heuristic of those tested. These results
are consistent with those of other studies.


heuristic                    average percent above optimality
nearest neighbor             24.2
nearest insertion            20.0
Christofides                 19.5
modified nearest neighbor    18.6
cheapest insertion           16.8
random insertion             11.1
farthest insertion           9.9
savings                      9.8
modified savings             9.6


24. The best known tour improvement heuristics for the TSP involve edge exchanges 
(Algorithm 5). Often the initial tour is chosen randomly from the set of all possible 
tours. 

25. Specialized versions of Algorithm 5 typically use 2-opt exchanges, 3-opt exchanges,
and more complicated Lin-Kernighan [JüReRi95] edge exchanges. Such exchange
techniques have been used to generate excellent solutions to large-scale TSPs in a
reasonable amount of time.

26. Edge-exchange procedures are typically more expensive computationally than tour 
construction procedures. 

27. Tour improvement procedures typically require a “downhill move” (i.e., a strict
reduction in cost) in order for edge exchanges to be made. As a result, they terminate
with a local minimum solution.

28. Since the 2-opt exchange procedure is weaker than the 3-opt procedure, Algorithm 5
will generally terminate at an inferior local optimum using 2-opt exchanges
instead of 3-opt exchanges. The Lin-Kernighan procedure will generally terminate with
a better local optimum than will a 3-opt exchange procedure.

29. In practice, it often makes sense to apply a composite procedure. The strategy is
to get a good initial solution rapidly (by tour construction), which is then improved by
an edge-exchange procedure.

30. The following table [JüReRi95] compares several composite procedures on the same
sample problems described in Fact 23. In each case, the initial tour is constructed using
the nearest neighbor heuristic (Algorithm 1). The improvement procedures include
2-opt, 3-opt, two variants of Lin-Kernighan, and iterated Lin-Kernighan. Iterated
Lin-Kernighan is the most computationally burdensome of the edge-exchange procedures,
but it consistently obtains results that are within 1% of optimality.


heuristic                    average percent above optimality
2-opt                        8.3
3-opt                        3.8
Lin-Kernighan (variant 1)    1.9
Lin-Kernighan (variant 2)    1.5
Iterated Lin-Kernighan       0.6


31. Metaheuristics: Unlike Algorithm 5 (which permits only downhill moves),
metaheuristics [OsKe95] allow the possibility of nonimproving moves. For example, uphill
moves can be accepted either randomly (simulated annealing) or based upon
deterministic rules (threshold accepting). Memory can be incorporated in order to prevent
revisiting local minima already evaluated and to encourage discovering new ones (tabu
search).

Other metaheuristics such as evolutionary strategies, genetic algorithms, and neural 
networks have also been applied to the TSP. To date, neural networks and tabu search 
have been less successful than the other approaches. 
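
The two acceptance rules named in Fact 31 can be sketched as follows. This is only a minimal illustration: it assumes the usual Metropolis criterion $e^{-\Delta/T}$ for simulated annealing, and the function names and the parameters `temperature`, `threshold`, and `u` are hypothetical rather than taken from [OsKe95].

```python
import math
import random


def annealing_accepts(delta, temperature, u=None):
    """Simulated annealing: always accept a non-worsening move; accept an
    uphill move of size delta > 0 with probability exp(-delta / temperature)
    (Metropolis criterion, assumed here)."""
    if delta <= 0:
        return True
    if u is None:
        u = random.random()  # uniform draw in [0, 1)
    return u < math.exp(-delta / temperature)


def threshold_accepts(delta, threshold):
    """Threshold accepting: a deterministic rule that accepts any move whose
    cost increase stays below the current threshold."""
    return delta < threshold
```

In both schemes the control parameter (temperature or threshold) is normally lowered as the search proceeds, so that late in the run the procedure behaves like the downhill-only Algorithm 5.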

32. For a detailed history of the traveling salesman problem see the first chapter of
[LaEtal85].

33. Software, research papers, and other heuristic approaches for the traveling salesman
and related problems are described on the web pages:

http://www.ing.unlp.edu.ar/cetad/mos/TSPBIB_home.html
http://www.netlib.org/toms/750

34. A library of sample problems, with their best known solutions, is available at:

http://www.iwr.uni-heidelberg.de/iwr/comopt/soft/TSPLIB95/TSPLIB.html

Examples: 

1. Brute force enumeration: Suppose that a TSP solution is required for the complete
graph G on n = 25 cities. By Fact 4, there are $\frac{1}{2}\,24! \approx 3.1 \times 10^{23}$
Hamilton tours in the graph G. Even with a supercomputer that is capable of finding and
evaluating each such tour in one nanosecond ($10^{-9}$ seconds), it would take over
9.8 million years of uninterrupted computations to determine an optimal TSP tour.

This example illustrates how quickly brute force enumeration of Hamilton tours 
becomes impractical. 
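
The arithmetic in this example can be checked with a few lines of Python. The sketch assumes a year of 365.25 days; the variable names are illustrative only.

```python
import math

n = 25
tours = math.factorial(n - 1) // 2        # (n-1)!/2 distinct Hamilton tours of K_n
seconds = tours * 1e-9                    # one tour evaluated per nanosecond
years = seconds / (365.25 * 24 * 3600)    # assumed length of a year

print(f"{tours:.2e} tours, roughly {years / 1e6:.1f} million years")
```

Adding a single city multiplies the count by n, which is why enumeration stops being usable so abruptly.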

2. Part (a) of the following figure shows the costs $c_{ij}$ for a five city TSP. An initial tour
can be constructed using the nearest neighbor heuristic (Algorithm 1). Let the initial
vertex be $i_0 = 1$, so W = {2,3,4,5}. The closest vertex of W to 1 is 2, with $c_{12} = 1$,
so edge (1,2) is added to the current path. A closest vertex of W = {3,4,5} to 2 is 5,
so edge (2,5) is added to the path. Continuing in this way, edges (5,3) and (3,4)
are added, giving the path P = [1,2,5,3,4] and the tour [1,2,5,3,4,1] with total cost
1 + 3 + 2 + 3 + 5 = 14. This tour is displayed in part (b).
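
A minimal sketch of the nearest neighbor construction on this instance follows. The table `COST` is inferred from the cost values quoted in Examples 2 through 6, since the figure itself is not reproduced here; the function names are illustrative. Ties are broken toward the smallest vertex label (an assumption), so after vertex 2 the sketch visits vertex 3 rather than vertex 5, yet the resulting tour has the same cost.

```python
# Symmetric costs inferred from the values quoted in Examples 2-6.
COST = {
    (1, 2): 1, (1, 3): 2, (1, 4): 5, (1, 5): 4,
    (2, 3): 3, (2, 4): 4, (2, 5): 3,
    (3, 4): 3, (3, 5): 2, (4, 5): 3,
}


def c(i, j):
    """Cost of the undirected edge (i, j)."""
    return COST[(min(i, j), max(i, j))]


def tour_cost(tour):
    """Total cost of a closed tour given as [v0, v1, ..., v0]."""
    return sum(c(a, b) for a, b in zip(tour, tour[1:]))


def nearest_neighbor(vertices, start):
    """Grow a path from start, always moving to a closest unvisited vertex."""
    path = [start]
    unvisited = sorted(set(vertices) - {start})
    while unvisited:
        last = path[-1]
        nxt = min(unvisited, key=lambda r: c(last, r))  # first minimum wins ties
        path.append(nxt)
        unvisited.remove(nxt)
    return path + [start]


tour = nearest_neighbor(range(1, 6), 1)
print(tour, tour_cost(tour))   # [1, 2, 3, 5, 4, 1] 14
```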



3. Suppose that the nearest insertion heuristic (Algorithm 2) is applied to the problem
data in part (a) of the figure for Example 2, starting with the initial vertex i = 1. The
nearest vertex to i is j = 2, giving the initial cycle C = {(1,2), (2,1)}. The closest vertex
to this cycle is k = 3, producing the new cycle C = {(1,2), (2,3), (3,1)}. Relative to
S = {1,2,3}, $d_S(4) = 3$ and $d_S(5) = 2$, so vertex 5 will next be added to the cycle.
Since $c_{15} + c_{52} - c_{12} = 6$, $c_{25} + c_{53} - c_{23} = 2$, and $c_{15} + c_{53} - c_{13} = 4$,
vertex 5 is inserted between vertices 2 and 3 in the current cycle, giving
C = {(1,2), (2,5), (5,3), (3,1)}. Finally, vertex 4 is added between vertices 2 and 5,
producing the tour C = {(1,2), (2,4), (4,5), (5,3), (3,1)} with total cost 12.
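
The steps of this example can be traced with the following sketch of nearest insertion. The cost table is inferred from the values quoted in Examples 2 through 6 (the figure is not reproduced here), and the function names are hypothetical. Vertex 4 can be inserted equally cheaply between 2 and 5 or between 5 and 3; the sketch takes the first minimum it meets, so its tour differs from the one in the text while having the same cost 12.

```python
# Symmetric costs inferred from the values quoted in Examples 2-6.
COST = {
    (1, 2): 1, (1, 3): 2, (1, 4): 5, (1, 5): 4,
    (2, 3): 3, (2, 4): 4, (2, 5): 3,
    (3, 4): 3, (3, 5): 2, (4, 5): 3,
}


def c(i, j):
    return COST[(min(i, j), max(i, j))]


def tour_cost(tour):
    return sum(c(a, b) for a, b in zip(tour, tour[1:]))


def nearest_insertion(vertices, start):
    vertices = sorted(vertices)
    i = start
    j = min((r for r in vertices if r != i), key=lambda r: c(i, r))
    cycle = [i, j]                      # cyclic order: edges (i, j) and (j, i)
    while len(cycle) < len(vertices):
        outside = [r for r in vertices if r not in cycle]
        # d_S(r): cost from r to the closest vertex already on the cycle
        k = min(outside, key=lambda r: min(c(r, s) for s in cycle))

        def extra(pos):
            """Cost increase of inserting k between cycle[pos] and its successor."""
            u, v = cycle[pos], cycle[(pos + 1) % len(cycle)]
            return c(u, k) + c(k, v) - c(u, v)

        pos = min(range(len(cycle)), key=extra)
        cycle.insert(pos + 1, k)
    return cycle + [cycle[0]]


tour = nearest_insertion(range(1, 6), 1)
print(tour, tour_cost(tour))   # [1, 3, 4, 5, 2, 1] 12
```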

4. The savings heuristic (Algorithm 3) can alternatively be applied to the problem
specified in part (a) of the figure of Example 2. The savings $s_{23} = c_{12} + c_{13} - c_{23} =
1 + 2 - 3 = 0$. Similarly, $s_{24} = 2$, $s_{25} = 2$, $s_{34} = 4$, $s_{35} = 4$, and $s_{45} = 6$.
This produces the ordered list of edges [(4,5), (3,4), (3,5), (2,4), (2,5), (2,3)]. Considering
edges in turn from this list gives the path P = [3,4,5,2]. Adding edges from the endpoints of P
to vertex 1 produces the tour [1,3,4,5,2,1] with total cost 12.
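
The following sketch reproduces this computation. It keeps the partial solution P as a collection of vertex-disjoint paths and accepts an edge only when it joins the endpoints of two different paths, which is one way (assumed here) to implement the test in Algorithm 3. The cost table is inferred from the values quoted in Examples 2 through 6, and the function names are illustrative.

```python
# Symmetric costs inferred from the values quoted in Examples 2-6.
COST = {
    (1, 2): 1, (1, 3): 2, (1, 4): 5, (1, 5): 4,
    (2, 3): 3, (2, 4): 4, (2, 5): 3,
    (3, 4): 3, (3, 5): 2, (4, 5): 3,
}


def c(i, j):
    return COST[(min(i, j), max(i, j))]


def tour_cost(tour):
    return sum(c(a, b) for a, b in zip(tour, tour[1:]))


def savings_tour(vertices, depot=1):
    others = [v for v in sorted(vertices) if v != depot]
    savings = sorted(
        ((c(depot, i) + c(depot, j) - c(i, j), i, j)
         for a, i in enumerate(others) for j in others[a + 1:]),
        key=lambda t: -t[0])             # nonincreasing savings, stable on ties
    path_of = {v: [v] for v in others}   # vertex -> the path containing it
    merged = 0
    for _, i, j in savings:
        pi, pj = path_of[i], path_of[j]
        if pi is pj:
            continue                     # edge would close a cycle
        if i not in (pi[0], pi[-1]) or j not in (pj[0], pj[-1]):
            continue                     # an interior vertex would get degree 3
        if pi[-1] != i:
            pi.reverse()                 # make i the tail of its path
        if pj[0] != j:
            pj.reverse()                 # make j the head of its path
        pi.extend(pj)
        for v in pj:
            path_of[v] = pi
        merged += 1
        if merged == len(others) - 1:    # |P| = n - 2 edges: one spanning path
            break
    return [depot] + path_of[others[0]] + [depot]


tour = savings_tour(range(1, 6))
print(tour, tour_cost(tour))   # [1, 2, 5, 4, 3, 1] 12
```

The tour printed is the one from the text traversed in the opposite direction.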

5. Christofides’ heuristic (Algorithm 4) is now applied to the problem given in part (a)
of the figure of Example 2. A minimum spanning tree T consists of the following edges:
(1,2), (1,3), (3,5), (3,4); see part (a) of the following figure. Vertices 2, 3, 4, 5 have
odd degree and {(2,4), (3,5)} constitutes a minimum cost perfect matching on these
vertices. Adding these edges to those of T produces the multi-graph in part (b) of the
following figure. Replacing edges (4,3) and (3,5) having aggregate cost 5 by the single
edge (4,5) of cost 3 produces the tour in part (c), having total cost 12.



[Figure: (a) minimum spanning tree T; (b) multi-graph obtained by adding M to T; (c) resulting tour]

6. To illustrate edge exchanges, consider the tour of cost 14 in part (b) of the figure
of Example 2. Removal of edges (1,4) and (3,5) disconnects the cycle into two disjoint
paths. Join the endpoints of one path to the endpoints of the other with edges (1,3)
and (4,5) to create a new tour [1,2,5,4,3,1] of smaller cost 12. No further pairwise
exchanges reduce the cost of this tour, so this tour is a 2-opt local minimum solution.
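
A small sketch of the 2-opt exchange procedure, run on the tour of this example, is given below. The cost table is inferred from the values quoted in Examples 2 through 6; the function names and the scan order of candidate edge pairs are assumptions. Only strict improvements are accepted, as in Fact 27.

```python
# Symmetric costs inferred from the values quoted in Examples 2-6.
COST = {
    (1, 2): 1, (1, 3): 2, (1, 4): 5, (1, 5): 4,
    (2, 3): 3, (2, 4): 4, (2, 5): 3,
    (3, 4): 3, (3, 5): 2, (4, 5): 3,
}


def c(i, j):
    return COST[(min(i, j), max(i, j))]


def tour_cost(tour):
    return sum(c(a, b) for a, b in zip(tour, tour[1:]))


def two_opt(tour):
    """Repeat improving 2-exchanges on a closed tour [v0, ..., v0] until
    none remains; returns a 2-opt local minimum."""
    tour = tour[:]
    n = len(tour) - 1                     # number of edges in the tour
    improved = True
    while improved:
        improved = False
        for a in range(n - 1):
            for b in range(a + 2, n):
                if a == 0 and b == n - 1:
                    continue              # these two edges share vertex tour[0]
                # replace edges (a, a+1), (b, b+1) by (a, b), (a+1, b+1)
                delta = (c(tour[a], tour[b]) + c(tour[a + 1], tour[b + 1])
                         - c(tour[a], tour[a + 1]) - c(tour[b], tour[b + 1]))
                if delta < 0:             # strict downhill move only
                    tour[a + 1:b + 1] = tour[a + 1:b + 1][::-1]
                    improved = True
    return tour


result = two_opt([1, 2, 5, 3, 4, 1])
print(result, tour_cost(result))   # [1, 2, 5, 4, 3, 1] 12
```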

7. Delivery routes: A delivery truck must visit a set of customers in a city and then 
return to the central garage after completing the route. Determining an optimal (i.e., 
shortest time) delivery route can be modeled as a traveling salesman problem on a city 
street network. Here the vertices represent the customer locations and the cost of 
edge (i,j) is the driving time between locations i and j. 

8. Printed circuit boards: One application of the TSP occurs in fabricating printed 
circuit boards. Holes at a number of fixed locations have to be drilled through the 
board. The objective is to minimize the total time needed to move the drilling head 
from position to position. Here the vertices i correspond to the locations of the holes 
as well as the starting position of the drill. The cost $c_{ij}$ represents the time required to
move the drilling head from i and reposition it at j. A minimum cost traveling salesman 
tour gives an optimal way of sequencing the drilling of the holes. 

9. Order-picking: In a warehouse, a customer order requires a certain subset of the 
items stored there. A vehicle must be sent to pick up these items and then return to 
the central dispatch location. Here the vertices are the locations of the items as well 
as the central dispatch location. The costs are the times needed to move the vehicle 
from one location to the other. A minimum cost traveling salesman tour then gives an 
optimal order in which to retrieve items from the warehouse. 

10. Job sequencing: In a factory, materials must be processed by a series of operations 
on a machine. The set-up time between operations varies depending on the order in 
which the operations are scheduled. Determining an optimal ordering that minimizes 
the total set-up time can be formulated as a traveling salesman problem. 

11. Additional applications, with reference sources, are given in the following table.

application                    references
dating archaeological finds    [AhEtal95]
DNA mapping                    [AhEtal95]
x-ray crystallography          [JüReRi95]
engine design                  [AhEtal95]
robotics                       [JüReRi95]
clustering                     [LaEtal85], [AhEtal95]
cutting stock problems         [HoPa96]
aircraft route assignment      [HoPa96]
computer wiring                [LaEtal85], [EvMi92], [JüReRi95]


10.7.2 VEHICLE ROUTING PROBLEM 

Private firms and public organizations that distribute goods or provide services to
customer locations rely on a fleet of vehicles. Given demands for service at numerous
points in a transportation network, the vehicle routing problem requires determining
which customers are to be serviced by each vehicle and the order in which customers
on a route are to be visited.

Definitions:

Let G = (V, E) be a complete graph with V = {1, 2, ..., n} the set of vertices and E
the set of all edges joining pairs of distinct vertices (§8.1.3).

Vertex 1 is the central depot, whereas the other vertices represent customer
locations. Customer location i has a known demand $w_i$.

Each edge $(i,j) \in E$ has an associated distance or cost $c_{ij}$.

There are also available a number of vehicles, each having the same capacity Q.

A route is a sequence of customers visited by a vehicle that starts and ends at the central
depot.

The vehicle routing problem ( VRP ) requires partitioning the set of customers into 
a set of delivery routes such that: 

• the total distance traveled by all vehicles is minimum; 

• the total demand generated by the customers assigned to each route is at most Q.

In a construction heuristic for the VRP, subtours are joined as long as the resulting 
subtour does not violate the vehicle capacity. 

An improvement heuristic employs successive edge exchanges that reduce the total 
distance without violating any vehicle capacity constraint. 

A two-phase heuristic implements a cluster first-route second philosophy, in which
customers are first partitioned into groups $G_k$ with $\sum_{i \in G_k} w_i \le Q$, after which
a minimum distance sequencing of customers is found within each group.


Facts: 

1. The TSP (§10.7.1) is a special case of the VRP in which there is a single vehicle 
with unlimited capacity. 

2. The VRP is an NP-hard optimization problem (§16.5.2). 

3 . VRPs with more than 50 vertices are difficult to solve to optimality. 

4 . Most solution strategies for large VRPs are heuristic in nature, involving construc- 
tion, improvement, and two-phase methods as well as metaheuristics. 

5 . In 1959 G. B. Dantzig and J. H. Ramser first formulated the general vehicle routing 
problem and developed a heuristic solution procedure. This solution technique was 
applied to a problem involving the delivery of gasoline to service stations. 

6. The Clarke and Wright savings heuristic (§10.7.1) is a construction approach,
originally proposed for the VRP. Algorithm 6 outlines this heuristic, which begins with
each customer served by a different vehicle and successively combines routes in order of
nonincreasing savings $s_{ij} = c_{1i} + c_{1j} - c_{ij}$ to form a smaller set of feasible routes.

7 . In two-phase methods, a minimum distance ordering of customers within each spec- 
ified cluster of vertices can be found by solving a TSP (§10.7.1). 

8. In recent years, metaheuristics such as simulated annealing and tabu search have
been applied quite successfully to VRPs. In particular, on the twelve benchmark
problems in the literature, which range in size from 50 to 199 vertices, tabu search
heuristics currently outperform the competition [GeHeLa94].

Algorithm 6: Clarke and Wright savings heuristic.

input: undirected network G, capacity limit Q
output: a set of delivery routes

$R_i$ := route consisting of edges (1,i) and (i,1) for $i \in V - \{1\}$
compute $s_{ij} = c_{1i} + c_{1j} - c_{ij}$ for distinct $i, j \in V - \{1\}$
order the savings $s_{i_1 j_1} \ge s_{i_2 j_2} \ge \cdots \ge s_{i_t j_t}$
for k := 1 to t
    if $R_{i_k}$ and $R_{j_k}$ have combined demand at most Q then merge $R_{i_k}$ and $R_{j_k}$


9. Extensions to the basic VRP include modifications for asymmetric distances ($c_{ij}$
need not equal $c_{ji}$), differing vehicle capacities, constraints on the total distance traveled,
multiple depots, and constraints on the time intervals for visiting customers.

10. A survey of 20 commercial software products for vehicle routing problems is avail- 
able in [HaPa97]. This survey discusses interfaces with geographic information systems, 
computer platforms supported, extensions to the basic VRP that are incorporated, and 
significant installations of the product for industrial customers. 

11. Data sets, software, and research papers on vehicle routing problems are available
on the web page:

http://www.geocities.com/ResearchTriangle/7279/vrp.html

12. Data sets and software for vehicle routing problems with time windows are available
on the web page:

http://dmawww.epfl.ch/~rochat/rochat_data/solomon.html


Examples: 

1. The following table gives the data for a VRP involving six customers in which vehicle
capacity is 820. The route [1,2,4,6,1] is not feasible since the total demand of customers
on this route is $w_2 + w_4 + w_6 = 486 + 326 + 24 = 836 > 820$. Since
$\sum_{i=2}^{7} w_i = 1967$, at least $\lceil 1967/820 \rceil = 3$ routes will be needed to
service all demands. The routes [1,5,2,6,1], [1,3,1], and [1,4,7,1] constitute a feasible
set of routes with (respective) demands 800, 541, and 626. In this feasible solution, the
total distance traveled by the first vehicle is $c_{15} + c_{52} + c_{26} + c_{61} = 131$, by the
second is $c_{13} + c_{31} = 114$, and by the third is $c_{14} + c_{47} + c_{71} = 181$, for a total
distance of 426.

customer    2     3     4     5     6     7
demand      486   541   326   290   24    300

$c_{ij}$    2     3     4     5     6     7
1           19    57    51    49    4     92
2                 51    10    53    25    53
3                       49    18    30    47
4                             50    11    38
5                                   68    9
6                                         94
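
The feasibility and distance figures of this example can be checked mechanically. The sketch below encodes the two tables above; the names `DEMAND`, `DIST`, `route_demand`, and `route_length` are illustrative.

```python
import math

Q = 820                                   # vehicle capacity
DEMAND = {2: 486, 3: 541, 4: 326, 5: 290, 6: 24, 7: 300}
DIST = {
    (1, 2): 19, (1, 3): 57, (1, 4): 51, (1, 5): 49, (1, 6): 4, (1, 7): 92,
    (2, 3): 51, (2, 4): 10, (2, 5): 53, (2, 6): 25, (2, 7): 53,
    (3, 4): 49, (3, 5): 18, (3, 6): 30, (3, 7): 47,
    (4, 5): 50, (4, 6): 11, (4, 7): 38,
    (5, 6): 68, (5, 7): 9,
    (6, 7): 94,
}


def dist(i, j):
    return DIST[(min(i, j), max(i, j))]


def route_demand(route):
    """Total demand of the customers on a route (vertex 1 is the depot)."""
    return sum(DEMAND[v] for v in route if v != 1)


def route_length(route):
    return sum(dist(a, b) for a, b in zip(route, route[1:]))


routes = [[1, 5, 2, 6, 1], [1, 3, 1], [1, 4, 7, 1]]
min_routes = math.ceil(sum(DEMAND.values()) / Q)   # lower bound on routes needed
for r in routes:
    print(r, route_demand(r), route_length(r), route_demand(r) <= Q)
print("lower bound on number of routes:", min_routes)
```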

2. The Clarke and Wright heuristic is applied to the problem specified in the table of
Example 1. For instance, $s_{35} = c_{13} + c_{15} - c_{35} = 57 + 49 - 18 = 88$. The largest savings
occurs for $s_{57} = 132$ and $w_5 + w_7 = 590 \le 820$, so the initial routes [1,5,1] and [1,7,1]
are merged to produce the feasible route [1,5,7,1] with distance 150. The next largest
savings occur for $s_{47} = 105$, $s_{37} = 102$, and $s_{35} = 88$; however, neither customer 3
nor customer 4 can be inserted into the route [1,5,7,1] without exceeding the vehicle
capacity. The next largest savings is $s_{24} = 60$, giving the new feasible route [1,2,4,1]
with demand 812 and distance 80. Continuing in this fashion eventually finds $s_{36} = 31$
and constructs the route [1,3,6,1] with demand 565 and distance 91. This feasible set
of three routes has total distance 150 + 80 + 91 = 321, smaller than that for the feasible
solution given in Example 1.
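
The run described in this example can be reproduced with the sketch below. Algorithm 6 states only the capacity test; the sketch additionally assumes, as is usual for the savings method, that two routes are merged only when customers i and j are both adjacent to the depot on their routes. The data are those of Example 1 and all names are illustrative.

```python
Q = 820
DEMAND = {2: 486, 3: 541, 4: 326, 5: 290, 6: 24, 7: 300}
DIST = {
    (1, 2): 19, (1, 3): 57, (1, 4): 51, (1, 5): 49, (1, 6): 4, (1, 7): 92,
    (2, 3): 51, (2, 4): 10, (2, 5): 53, (2, 6): 25, (2, 7): 53,
    (3, 4): 49, (3, 5): 18, (3, 6): 30, (3, 7): 47,
    (4, 5): 50, (4, 6): 11, (4, 7): 38,
    (5, 6): 68, (5, 7): 9,
    (6, 7): 94,
}


def dist(i, j):
    return DIST[(min(i, j), max(i, j))]


def route_length(route):
    return sum(dist(a, b) for a, b in zip(route, route[1:]))


def clarke_wright(demand, capacity, depot=1):
    customers = sorted(demand)
    route_of = {i: [i] for i in customers}      # customer -> its route (no depot)
    savings = sorted(
        ((dist(depot, i) + dist(depot, j) - dist(i, j), i, j)
         for a, i in enumerate(customers) for j in customers[a + 1:]),
        key=lambda t: -t[0])                     # nonincreasing savings
    for _, i, j in savings:
        ri, rj = route_of[i], route_of[j]
        if ri is rj:
            continue                             # already on the same route
        if i not in (ri[0], ri[-1]) or j not in (rj[0], rj[-1]):
            continue                             # assumed rule: join route ends only
        if sum(demand[v] for v in ri + rj) > capacity:
            continue                             # merged route would be overloaded
        if ri[-1] != i:
            ri.reverse()
        if rj[0] != j:
            rj.reverse()
        ri.extend(rj)
        for v in rj:
            route_of[v] = ri
    routes = []
    for r in route_of.values():                  # collect each distinct route once
        if not any(r is seen for seen in routes):
            routes.append(r)
    return [[depot] + r + [depot] for r in routes]


routes = clarke_wright(DEMAND, Q)
print(routes, sum(route_length(r) for r in routes))
# [[1, 2, 4, 1], [1, 3, 6, 1], [1, 5, 7, 1]] 321
```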


10.7.3 QUADRATIC ASSIGNMENT PROBLEM 

The quadratic assignment problem deals with the relative location of facilities that 
interact with one another in some manner. The objective is to minimize the total cost 
of interactions between facilities, with distance often used as a surrogate for measures 
such as dollar cost, fatigue, or inconvenience. 

Definitions: 

There are n facilities to be assigned to n predefined locations, where each location
can accommodate any one facility.

The fixed cost $c_{ij}$ is the cost of assigning facility i to location j.

The flow $f_{ij}$ is the level of interaction between facilities i and j.

The distance $d_{ij}$ between locations i and j is the per unit cost of interaction between
the two locations. Typically, it is measured using the rectilinear or Euclidean distance
between the locations.

An assignment is a bijection $\rho$ from the set of facilities onto the set of locations.

The linear assignment problem (LAP) is the problem of finding an assignment $\rho$
that minimizes $\sum_i c_{i,\rho(i)}$.

The quadratic assignment problem (QAP) is the problem of finding an assignment $\rho$
that gives the minimum value $z_{QAP}$ of
$\sum_i c_{i,\rho(i)} + \sum_i \sum_p f_{ip}\, d_{\rho(i),\rho(p)}$.

In some partial assignment for the QAP, let $F$ be the set of facilities (possibly empty)
that have already been assigned and $C$ be the set of locations having assigned facilities.

Facts: 

1. The following table gives a variety of situations that can be formulated using the 
QAP model. 


facilities                              interaction
departments in a manufacturing plant    flow of materials
departments in an office building       flow of information, movement of people
departments in a hospital               movement of patients and medical staff
buildings on a campus                   movement of students and staff
electronic component boards             connections
typewriter/computer keyboard keys       movement of fingers

2. The interdependence of facilities due to interactions between them leads to the
quadratic nature of the objective function in the QAP.

3 . If the facilities are independent of each other (there are no interactions between 
them), the QAP reduces to the LAP, which can be solved in polynomial time (§10.2.2). 

4 . The TSP is a special case of the QAP (see Example 4). 

5. The QAP is an NP-hard optimization problem (§16.5.2).

6. Exact solution of the QAP is limited to fairly small problems, generally of size 16 
or smaller. 

7. A lower bound on completions of a partial assignment for the QAP, in which $F$ is the
set of facilities already assigned, is given by

$$\sum_{i \in F} c_{i,\rho(i)} + \sum_{i \in F} \sum_{p \in F} f_{ip}\, d_{\rho(i),\rho(p)}
  + \sum_{i \in F} \sum_{p \notin F} \left( f_{ip}\, d_{\rho(i),\rho(p)} + f_{pi}\, d_{\rho(p),\rho(i)} \right)
  + \sum_{i \notin F} c_{i,\rho(i)} + \sum_{i \notin F} \sum_{p \notin F} f_{ip}\, d_{\rho(i),\rho(p)}.$$

The first two terms above are the known fixed and interaction costs of assignments
already made; the third term captures the interaction costs between assigned facilities
and those yet to be assigned; and the last two terms represent the fixed and interaction
costs of assignments not yet made.

8. A minimum value $z^*$ can be calculated for the last three terms in the lower bound
expression of Fact 7 by solving a LAP such that each cost term is a lower bound on the
incremental costs that would be incurred if facility $i \notin F$ is assigned to location $j \notin C$.

9. Gilmore-Lawler lower bound: This lower bound for $z_{QAP}$ is given by

$$\sum_{i \in F} c_{i,\rho(i)} + \sum_{i \in F} \sum_{p \in F} f_{ip}\, d_{\rho(i),\rho(p)} + z^*,$$

where $z^*$ is found as in Fact 8.

10 . The Gilmore-Lawler lower bound allows the QAP to be solved using a branch-and- 
bound (implicit enumeration) technique (§15.1.8). 

11 . Alternative tighter bounds are available. However, considering the quality of these 
bounds and the effort involved in computing them, the Gilmore-Lawler lower bound 
still seems to be the most effective bound to use within a branch-and-bound scheme. 

12. There are several ways to linearize the QAP by defining additional variables and 
constraints. However, none of the linearizations proposed so far has proved to be com- 
putationally effective. 

13 . Heuristic methods for solving the QAP can be classified as limited enumeration, 
construction methods, improvement methods, hybrid methods, and metaheuristics. A 
survey of exact and heuristic solution methods for the QAP is found in [KuHe87]; 
experimental comparisons of heuristic approaches appear in [BuSt78] and [Li81] . 

14 . Limited enumeration: There are two distinct approaches for limiting the search for 
an optimal QAP solution using a branch-and-bound approach: 

• The search can be curtailed by placing a limit on the computation time or the
number of subproblems examined. Since an optimal solution is often found
fairly early in a branch-and-bound procedure, especially if a good branching
rule is available, this approach may find an optimal (or a near-optimal) solution
while saving on the significant cost of proving optimality.

• The gap between the lower and upper bound is largest at higher levels of a
branch-and-bound tree. Thus a relatively large gap can be used to fathom subproblems
at higher levels, and this gap can be decreased gradually as the search reaches
lower levels of the tree.

15 . Construction methods: These heuristics start with an empty assignment and add 
assignments one at a time until a complete solution is obtained. The rule used to choose 
the next assignment can employ: 

• a local view: select a facility having the maximum interaction with a facility
already assigned; locate it to minimize the cost of interaction between facilities;

• a global view: take into account assignments already made as well as future
assignments to be made.

16. Suppose that k assignments have already been made. Using statistical properties,
the expected value for the completion of the partial assignment is given by the following
expression, whose terms are analogous to those in Fact 7:

$$EV = \sum_{i \in F} c_{i,\rho(i)} + \sum_{i \in F} \sum_{p \in F} f_{ip}\, d_{\rho(i),\rho(p)}
  + \frac{\sum_{i \in F} \sum_{p \notin F} \sum_{j \notin C} \left( f_{ip}\, d_{\rho(i),j} + f_{pi}\, d_{j,\rho(i)} \right)}{n-k}
  + \frac{\sum_{i \notin F} \sum_{j \notin C} c_{ij}}{n-k}
  + \frac{\left( \sum_{i,p \notin F} f_{ip} \right) \left( \sum_{j,q \notin C} d_{jq} \right)}{(n-k)(n-k-1)}.$$

The low computational requirements of computing EV make this a good choice to guide
a construction heuristic [GrWh70].

17 . Improvement methods: These heuristics start with a suboptimal solution (often 
randomly generated) and attempt to improve it through partial changes in the assign- 
ments. Several important issues arise in designing an improvement heuristic: 

• type of exchange: The choices are pairwise, triple, or higher-order exchanges. 

The use of pairwise exchanges has been found to be the most effective in terms 
of solution quality and computational burden. Higher-order exchanges can be 
beneficial but are generally used in a limited way because of the significant 
increase in computation time. 

• scope of exchange: The procedure can use a local approach that considers only the 

exchange of adjacent facilities, or a global approach that considers all possible 
exchanges. Current computing capabilities allow the use of a global approach, 
which has been found to be more effective. 

• choice of exchange: The procedure can effect an exchange as soon as an improving 

move is found, or can evaluate all possible exchanges and choose the best. The 
first improvement option is more common. 

• order of evaluation: The possible exchanges can be evaluated in a random or some 

predetermined order. This is relevant only if the “first improvement” approach 
is used, as is often the case. One simple but effective solution is to consider 
facilities in the fixed order of decreasing total interactions, so that exchanges 
with potentially large savings are evaluated first. 

18. Hybrid methods: Unlike improvement procedures, which tend to get trapped at
local minima, hybrid methods use multiple restarts from a set of diversified solutions.
Hybrid procedures combine the power of improvement routines with diversified solutions
obtained through construction methods.

19. Metaheuristics: In recent years, metaheuristics such as simulated annealing, tabu
search, and genetic algorithms have been developed to help improvement procedures
avoid the trap of local minima and have been applied with success to the QAP.
Metaheuristics have been able to find the best known solutions for the commonly used
benchmark problems in the literature and remain an active area of research on the QAP.

20. Computer codes (in Fortran and C) for heuristically solving the QAP can be found
at the sites:

http://www.netlib.org/toms/608
http://www.netlib.org/toms/754
http://www.netlib.org/toms/769
http://rtm.science.unitn.it/~battiti/archive/code/rts_qap/

Examples: 

1. The following table gives the data $c_{ij}$, $f_{ij}$ for a QAP with four facilities and four
locations.

$f_{ij}$   1   2   3   4
1          0   1   3   4
2          1   0   2   1
3          3   2   0   3
4          4   1   3   0

$c_{ij}$   1   2   3   4
1          1   3   2   1
2          2   1   4   3
3          4   2   4   4
4          3   1   2   2

The fixed locations 1, 2, 3, 4 occur at equally spaced points along a line, with unit
distances between successive points, so that $d_{ij} = |i - j|$. For the assignment $\rho$
specified by $\rho(1) = 1$, $\rho(2) = 4$, $\rho(3) = 2$, $\rho(4) = 3$ the fixed cost is
$c_{11} + c_{24} + c_{32} + c_{43} = 8$. Because the flows and distances are symmetric, the
interaction cost is
$2(f_{12} d_{14} + f_{13} d_{12} + f_{14} d_{13} + f_{23} d_{42} + f_{24} d_{43} + f_{34} d_{23}) = 44$.
The total cost of assignment $\rho$ is then 8 + 44 = 52.
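
The cost evaluation in this example can be written out directly from the QAP objective. In the sketch below the matrices are those of the example; the names `F`, `C`, `d`, and `qap_cost`, and the encoding of an assignment as a list of 1-based locations, are illustrative choices.

```python
# Flows f_ij and fixed costs c_ij from the tables of this example (rows = facility i).
F = [[0, 1, 3, 4],
     [1, 0, 2, 1],
     [3, 2, 0, 3],
     [4, 1, 3, 0]]
C = [[1, 3, 2, 1],
     [2, 1, 4, 3],
     [4, 2, 4, 4],
     [3, 1, 2, 2]]


def d(j, q):
    """Distance between locations j and q on the line: |j - q|."""
    return abs(j - q)


def qap_cost(rho):
    """rho[i] is the 1-based location of facility i + 1.
    Returns (fixed cost, interaction cost) of the assignment."""
    n = len(rho)
    fixed = sum(C[i][rho[i] - 1] for i in range(n))
    interaction = sum(F[i][p] * d(rho[i], rho[p])
                      for i in range(n) for p in range(n))
    return fixed, interaction


fixed, interaction = qap_cost([1, 4, 2, 3])
print(fixed, interaction, fixed + interaction)   # 8 44 52
```

The double sum runs over ordered pairs (i, p), which is why it equals twice the sum over unordered pairs used in the text.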

2. The assignment in Example 1 can be improved by a pairwise exchange. Namely,
instead of assigning facilities 1 and 2 (respectively) to locations 1 and 4, they are assigned
to the interchanged locations 4 and 1. This gives $\sigma(1) = 4$, $\sigma(2) = 1$, $\sigma(3) = 2$,
$\sigma(4) = 3$. Then the fixed cost incurred is $c_{14} + c_{21} + c_{32} + c_{43} = 7$ and the
interaction cost is
$2(f_{12} d_{41} + f_{13} d_{42} + f_{14} d_{43} + f_{23} d_{12} + f_{24} d_{13} + f_{34} d_{23}) = 40$.
The total cost 47 is lower than that for the assignment $\rho$ in Example 1. In fact $\sigma$ is
an optimal QAP assignment.
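
A first-improvement pairwise exchange procedure, started from the assignment of Example 1, can be sketched as follows. The data are those of Example 1; the function names and the order in which facility pairs are tried are assumptions. Because the instance is tiny, the sketch also enumerates all 24 assignments to confirm the optimality claim.

```python
from itertools import combinations, permutations

F = [[0, 1, 3, 4], [1, 0, 2, 1], [3, 2, 0, 3], [4, 1, 3, 0]]   # flows f_ij
C = [[1, 3, 2, 1], [2, 1, 4, 3], [4, 2, 4, 4], [3, 1, 2, 2]]   # fixed costs c_ij


def total_cost(rho):
    """Fixed plus interaction cost; rho[i] is the 1-based location of facility i + 1."""
    n = len(rho)
    fixed = sum(C[i][rho[i] - 1] for i in range(n))
    inter = sum(F[i][p] * abs(rho[i] - rho[p]) for i in range(n) for p in range(n))
    return fixed + inter


def pairwise_exchange(rho):
    """Swap the locations of two facilities whenever that strictly lowers the cost."""
    rho = list(rho)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(range(len(rho)), 2):
            cand = rho[:]
            cand[a], cand[b] = cand[b], cand[a]
            if total_cost(cand) < total_cost(rho):
                rho, improved = cand, True
                break                      # first improvement: restart the scan
    return rho


sigma = pairwise_exchange([1, 4, 2, 3])
best = min(total_cost(p) for p in permutations(range(1, 5)))
print(sigma, total_cost(sigma), best)   # [4, 1, 2, 3] 47 47
```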

3. The QAP arises in designing the layout of a manufacturing facility. A number of 
products are to be made in this facility and different products require different operations 
in given sequences for completion. These operations are performed by n departments: 
e.g., turning, milling, drilling, heat treatment, and assembly. Knowing the sequence of 
operations and the volume of each product to be produced, it is possible to calculate the 
flow from any department i to another department j. There are n physical locations, 
with distance $d_{ij}$ between locations i and j. The fixed cost of assigning department i
to location j is $c_{ij}$, representing the cost of building foundations and installing support
equipment (cables, pipes) for the machines. Then the objective is to assign departments 
to locations in order to minimize the sum of fixed and interaction costs. 

4. The TSP (§10.7.1) can be formulated as a special case of the QAP, where the n cities
correspond to locations and a position number (facility) in the tour is to be associated
with each city. Let $f_{12} = f_{23} = \cdots = f_{n1} = 1$ and $f_{ij} = 0$ otherwise. The distance
$d_{ij}$ represents the cost of traveling between cities i and j, and let all fixed costs $c_{ij}$
be zero. Then a solution to this QAP gives an optimal labeling of cities with their
positions in an optimal TSP tour.


10.8 NETWORK REPRESENTATIONS AND DATA STRUCTURES

To carry out network optimization algorithms efficiently, careful attention needs to be paid
to the design of the data structures supporting these algorithms. There are alternative 
ways to represent a network differing in their storage requirements and their efficacy 
in executing certain fundamental operations. These representations need to incorpo- 
rate both the topology of the underlying graph and also any quantitative information 
present in the network (such as cost, length, capacity, demand, or supply). Standard 
representations of networks, and trees in particular, are discussed in this section. 


10.8.1 NETWORK REPRESENTATIONS 

There are various ways to represent networks, just as there are various ways to rep- 
resent graphs (§8.1.4, §8.3.1). In addition it is necessary to incorporate quantitative 
information associated with the vertices and edges (or arcs) of the network. While the 
description here concentrates on directed networks, extensions to undirected networks 
are also indicated. 

Definitions: 

Let G = (V, E ) be a directed graph (§8.3.1) with vertex set V = {1, 2, . . . , n} and arc 
set E. Define m = \E\ to be the number of arcs in G. 

The adjacency set A(i) = { (i,j) | ( i,j ) £ E } for vertex i is the set of arcs emanating 
from i. (See §10.3.1.) 

The adjacency matrix for G is the 0-1 matrix Aq = (a© having a,;,- = 1 if (i,j) £ E 
and ciij = 0 if (i,j) /£E. (See also §8.3.1.) 

The arc list for G (see §8.3.1) can be implemented using two arc-length arrays FROM 
and TO: 

• For each arc (i,j) £ E there is a unique 1 < k < m satisfying FROM(fc) = i and 

TO (k)=j. 

• Arcs are listed sequentially in the FROM and TO arrays in no particular order. 

The linked adjacency list for G is given by a vertex-length array START and a 
singly-linked list ARCLIST of arc records: 

• START(i) points to the first record for vertex i in this list, corresponding to a 
specified first element of A(i). 

• Each arc (i,j) ∈ A(i) has an associated arc record, which contains the fields 
TO and NEXT. Specifically, ARCLIST.TO gives the adjacent vertex j, and 
ARCLIST.NEXT points to the next arc record in A(i). If there is no such 
following record, ARCLIST.NEXT = null. 
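
The linked adjacency list just defined can be sketched in Python as follows. This is a minimal illustration, not the Handbook's implementation: the names ArcRecord, make_start, add_arc, and scan are hypothetical, vertices are assumed to be numbered 1..n (index 0 of START is left unused), and the four arcs are an arbitrary small example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArcRecord:
    """One arc record with the fields TO and NEXT (illustrative name)."""
    to: int                       # adjacent vertex j of arc (i, j)
    next: Optional["ArcRecord"]   # next record in A(i), or None for null


def make_start(n: int) -> List[Optional[ArcRecord]]:
    # START[i] for i = 1..n; index 0 is unused so vertices keep 1-based labels.
    return [None] * (n + 1)


def add_arc(start: List[Optional[ArcRecord]], i: int, j: int) -> None:
    # Insert the new record at the front of vertex i's list: constant time.
    start[i] = ArcRecord(j, start[i])


def scan(start: List[Optional[ArcRecord]], i: int) -> List[int]:
    # Follow NEXT pointers from START[i]; time proportional to |A(i)|.
    heads = []
    rec = start[i]
    while rec is not None:
        heads.append(rec.to)
        rec = rec.next
    return heads


# Assumed example arcs (not taken from a figure in the text).
start = make_start(5)
for i, j in [(1, 2), (1, 3), (2, 4), (4, 1)]:
    add_arc(start, i, j)
```

Because add_arc inserts at the front, scan returns the arcs of a vertex in reverse order of insertion; appending at the tail instead would need a tail pointer per vertex.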

The forward star for G is given by a vertex-length array START and an arc-length 
array TO, with the latter in one-to-one correspondence with arcs (i,j) ∈ E: 

• START(i) gives the position in array TO of the first arc leaving vertex i. 

• The arcs of A(i) are found in the consecutive positions START(i), START(i) + 1, 
..., START(i + 1) - 1 of array TO. If arc (i,j) corresponds to position k of 
TO, then TO(k) = j. 

• By convention, an additional dummy vertex n + 1 is added, with START(n + 1) = 
m + 1. 

Facts: 


1. An undirected graph can be represented by replacing each undirected edge (i,j) by 
two oppositely directed arcs (i,j) and (j,i). 

2. The adjacency matrix, the arc list, the linked adjacency list, and the forward star 
are four standard representations of a directed (or undirected) graph. 

3. The linked adjacency list and forward star structures are commonly used implementations 
of the lists-of-neighbors representation (§8.3.1). 

4. The following table shows the (worst-case) computational effort required to carry 
out certain fundamental operations on G: finding an arc, deleting an arc (once found), 
adding an arc, and scanning the adjacency set of an arbitrary vertex i. Here α_i = 
|A(i)| ≤ n. 


representation          find arc    delete arc   add arc     scan A(i) 
adjacency matrix        O(1)        O(1)         O(1)        O(n) 
arc list                O(m)        O(1)         O(1)        O(m) 
linked adjacency list   O(α_i)      O(1)         O(1)        O(α_i) 
forward star            O(α_i)      O(n + m)     O(n + m)    O(α_i) 


5. The storage requirements of the four representations are given in the following table 
for both directed and undirected graphs. For the last two representations, each 
undirected edge appears twice: once in each direction. 


representation          storage (directed)   storage (undirected)   exploit sparsity? 
adjacency matrix        n^2                  n^2/2                  no 
arc list                2m                   2m                     yes 
linked adjacency list   n + 2m               n + 4m                 yes 
forward star            n + m                n + 2m                 yes 


6. As seen in the table of Fact 5, all representations other than the adjacency 
matrix representation can exploit sparsity in the graph G. That is, the storage requirements 
are sensitive to the actual number of arcs and the computations will generally 
proceed more rapidly when G has relatively few arcs. 

7. Quantitative data for network vertices (such as supply and demand) can be stored 
in an associated vertex-length array, thus supplementing the standard graph representations. 

8. Quantitative data for network arcs (such as cost, length, capacity, and flow) can be 
accommodated as follows: 

• For the adjacency matrix representation, costs (or lengths) c_ij can be embedded in 
the matrix A_G itself. Namely, redefine A_G = (a_ij) so that a_ij = c_ij if (i,j) ∈ E, 
whereas a_ij is an appropriate special value if (i,j) ∉ E. For instance, in the 
shortest path problem (§10.3.1), a_ij = ∞ can be used to signify that (i,j) ∉ E. 
Additional n × n arrays would be needed, however, to represent more than one 
type of arc data. 

• For the arc list representation, additional arrays parallel to the arrays FROM 
and TO can be used to store quantitative arc data. 

• For the linked adjacency list representation, additional fields within the arc record 
can be used to store quantitative arc data. 

• For the forward star representation, additional arrays parallel to the array TO 
can be used to store quantitative arc data. 

9. The arc list representation is best suited for arc-based processing of a network, such 
as occurs in Kruskal’s minimum spanning tree algorithm (§10.1.2). 

10. The arc list representation is a convenient form for the input of a network to 
an optimization algorithm. Often this external representation is converted within the 
algorithm to a more suitable internal representation (linked adjacency list or forward 
star) before executing the steps of the optimization algorithm. 

11. The linked adjacency list and forward star representations are best suited to carrying 
out vertex-based explorations of a graph, such as a breadth-first search or a 
depth-first search (§9.2.1). They are also ideal for carrying out Prim’s minimum spanning 
tree algorithm (§10.1.2) as well as most shortest path algorithms (§10.3.2). 

12. Especially in the case of undirected graphs, the linked adjacency list and forward 
star representations can be enhanced by use of an additional arc-length array MIRROR. 
The array MIRROR allows one to move from the location of arc (i,j) to the location 
of arc (j,i) in constant time. 

13. The linked adjacency list is typically used when the structure of the graph can 
dynamically change (as by addition/deletion of arcs or vertices). On the other hand, 
the forward star representation is appropriate for static graphs, in which the graph 
structure does not change. 
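
Facts 1 and 12 can be illustrated together by a short Python sketch that stores an undirected graph as a forward star with a MIRROR array. The function name forward_star_undirected is hypothetical, the graph is assumed to be simple (no loops or parallel edges), and arrays are padded at index 0 so that positions and vertices stay 1-based as in the definitions above.

```python
def forward_star_undirected(n, edges):
    """Return (START, TO, MIRROR) for a simple undirected graph on vertices 1..n."""
    # Fact 1: each undirected edge {i, j} becomes the two arcs (i, j) and (j, i).
    arcs = []
    for i, j in edges:
        arcs.append((i, j))
        arcs.append((j, i))
    m = len(arcs)

    # Count the arcs leaving each vertex, then accumulate into START.
    outdeg = [0] * (n + 2)
    for i, _ in arcs:
        outdeg[i] += 1
    start = [0] * (n + 2)          # entries 1..n+1 are used; n+1 is the dummy vertex
    start[1] = 1
    for i in range(1, n + 1):
        start[i + 1] = start[i] + outdeg[i]

    # Place each arc in the next free position of its tail vertex.
    to = [0] * (m + 1)             # positions 1..m are used
    next_free = start[:]
    position = {}
    for i, j in arcs:
        k = next_free[i]
        next_free[i] += 1
        to[k] = j
        position[(i, j)] = k

    # MIRROR[k] is the position of the oppositely directed arc.
    mirror = [0] * (m + 1)
    for (i, j), k in position.items():
        mirror[k] = position[(j, i)]
    return start, to, mirror


# Assumed example: the path 1 - 2 - 3.
START, TO, MIRROR = forward_star_undirected(3, [(1, 2), (2, 3)])
```

The dictionary used to locate reverse arcs is only a construction aid; once MIRROR is filled, moving from an arc to its reverse is a single array lookup.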

Examples: 

1. A directed graph G with 5 vertices and 8 arcs is shown in the following figure. 



The 5 × 5 adjacency matrix for G is given by 

             1  2  3  4  5 
         1 ( 0  1  1  0  0 ) 
         2 ( 0  0  1  1  0 ) 
  A_G =  3 ( 0  0  0  0  0 ) 
         4 ( 1  0  1  0  1 ) 
         5 ( 0  0  1  0  0 ) 

2. An arc list representation of the directed graph in the figure of Example 1 is given 
in the following table: 


FROM   1  2  1  4  2  4  4  5 
TO     2  4  3  1  3  5  3  3 


3. The following figure shows a linked adjacency list representation of the directed 
graph of Example 1. The symbol ∅ is used to indicate a null pointer. 

[Figure: the array START, indexed by vertices 1 to 5, with each entry pointing to a chain 
of arc records (fields TO and NEXT); the entry for vertex 3 is the null pointer.] 


4. The following figure shows a forward star representation of the directed graph in Example 1. 
Since A(3) = ∅, it is necessary to set START(3) = START(4) = 5. For example, 
the arcs in A(4) are associated with positions [START(4), ..., START(5) - 1] = [5, 6, 7] 
of the TO array. Similarly, the single arc emanating from vertex 5 is associated with 
position [START(5), ..., START(6) - 1] = [8] of the TO array. 

[Figure: the arrays START and TO of the forward star representation.] 
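
The conversion from an arc list to a forward star (Fact 10) can be sketched in Python using the FROM and TO arrays of Example 2. The function names build_forward_star and adjacency are hypothetical, and index 0 of each array is padding so that positions match the 1-based convention of the text. The order of arcs within each A(i) follows the arc list here and may differ from the order in the Handbook's figure.

```python
def build_forward_star(n, frm, to_list):
    """Convert parallel arc-list arrays into (START, TO) of a forward star."""
    m = len(frm)
    # Count arcs leaving each vertex.
    outdeg = [0] * (n + 2)
    for i in frm:
        outdeg[i] += 1
    # START(1) = 1 and START(i + 1) = START(i) + |A(i)|; START(n + 1) = m + 1.
    start = [0] * (n + 2)
    start[1] = 1
    for i in range(1, n + 1):
        start[i + 1] = start[i] + outdeg[i]
    # Drop each arc into the next free slot of its tail vertex.
    to = [0] * (m + 1)
    next_free = start[:]
    for i, j in zip(frm, to_list):
        to[next_free[i]] = j
        next_free[i] += 1
    return start, to


def adjacency(start, to, i):
    # Heads of the arcs in A(i): positions START(i), ..., START(i + 1) - 1.
    return to[start[i]:start[i + 1]]


# Arc list of Example 2 (5 vertices, 8 arcs).
FROM = [1, 2, 1, 4, 2, 4, 4, 5]
TO = [2, 4, 3, 1, 3, 5, 3, 3]
START, TO_FS = build_forward_star(5, FROM, TO)
```

With this input, START(3) = START(4) = 5 and START(6) = m + 1 = 9, in agreement with Example 4.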


10.8.2 TREE DATA STRUCTURES 


Since trees are important objects in optimization problems, as well as useful data struc- 
tures in their own right (see §9.1), additional representations and features of trees are 
given here. 

Definitions: 

If T is a rooted tree with root r (§9.1.2), then the predecessor function pred: V → V 
is defined by pred(r) = 0, and pred(j) = i if vertex i is the parent of j in T. (See 
§10.3.1.) 

The principal subtree T_j rooted at vertex j is the subgraph of T induced by all 
descendants of j (including j). (See §9.1.2.) 

The cardinality card(j) of vertex j in a rooted tree T is the number of vertices in its 
principal subtree T_j. 

The least common ancestor least(i,j) of vertices i and j in a rooted tree is the vertex 
of largest depth (§9.1.2) that is an ancestor of both i and j. 

Facts: 

1. A rooted tree is uniquely specified by the mapping pred(·). 

2. In a rooted tree with root r, depth(r) = 0 and depth(j) = 1 + depth(pred(j)) for 
j ≠ r. 

3. In a rooted tree, height(i) = 0 if i is a leaf. If i is not a leaf, then height(i) = 
1 + max{ height(j) | j a child of i }. 

4. In a rooted tree, card(i) = 1 if i is a leaf. If i is not a leaf, card(i) = 1 + Σ{ card(j) | 
j a child of i }. 

5. The predecessor, depth, height, and cardinality of a rooted tree T can all be calcu- 
lated while carrying out a preorder or postorder traversal (§9.1.3) of T: 

• The predecessor and depth can be calculated while advancing from the current 

vertex to an unvisited vertex. 

• The height and cardinality can be updated when retreating from a vertex all of 

whose children have been visited. 

6. The depth of a vertex is a monotone increasing function on each path from the root. 

7. The height and cardinality are monotone decreasing functions on each path from 
the root. 
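
The recurrences in Facts 2 to 5 can be sketched as a single depth-first traversal in Python. This is an illustrative sketch: the function name tree_functions and the children-list input format are assumptions, and the example tree is the one described by the pred row of the table in Example 1 below.

```python
def tree_functions(children, root):
    """Compute pred, depth, height, and card for a rooted tree.

    children maps each non-leaf vertex to the list of its children.
    """
    pred = {root: 0}
    depth = {root: 0}
    height = {}
    card = {}
    # Each stack entry holds a vertex and an iterator over its children.
    stack = [(root, iter(children.get(root, [])))]
    while stack:
        v, kids_iter = stack[-1]
        child = next(kids_iter, None)
        if child is not None:
            # Advancing to an unvisited vertex: set pred and depth.
            pred[child] = v
            depth[child] = depth[v] + 1
            stack.append((child, iter(children.get(child, []))))
        else:
            # Retreating from v: all children are done, so set height and card.
            kids = children.get(v, [])
            height[v] = 1 + max(height[c] for c in kids) if kids else 0
            card[v] = 1 + sum(card[c] for c in kids)
            stack.pop()
    return pred, depth, height, card


# Tree of Example 1: vertex 1 is the root.
CHILDREN = {1: [2, 5, 9], 2: [3, 4], 5: [6], 6: [7, 8]}
PRED, DEPTH, HEIGHT, CARD = tree_functions(CHILDREN, 1)
```

An explicit stack is used instead of recursion so that deep trees do not exceed Python's recursion limit.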


Examples: 

1. The following figure shows a tree T rooted at vertex 1. The vertices have been num- 
bered according to a preorder traversal of T. The following table gives the predecessor, 
depth, height, and cardinality of each vertex of T. 
Algorithm 1: Least common ancestor. 

input: rooted tree T, vertices i and j 
output: least common ancestor of i and j 

procedure least(i, j) 
    while i ≠ j do 
        if depth(i) > depth(j) then i := pred(i) 
        else if depth(i) < depth(j) then j := pred(j) 
        else i := pred(i), j := pred(j) 
    return i 


vertex   1  2  3  4  5  6  7  8  9 
pred     0  1  2  2  1  5  6  6  1 
depth    0  1  2  2  1  2  3  3  1 
height   3  1  0  0  2  1  0  0  0 
card     9  3  1  1  4  3  1  1  1 


2. Certain applications (such as cycle detection in the network simplex algorithm, 
§10.5.2) require finding the least common ancestor least(i,j) of vertices i and j in a 
rooted tree. 

3. The calculation of least(i,j) can be carried out efficiently, in O(n) time, by using 
Algorithm 1. 

4. Algorithm 1 is based on Fact 6 and employs two auxiliary data structures, the 
predecessor and depth functions. It repeatedly backs up from a vertex of larger depth 
until the least common ancestor is found. 
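
Algorithm 1 translates almost line for line into Python. In this sketch the pred and depth functions are assumed to be stored as lists indexed by vertex (index 0 is padding), with values copied from the table of Example 1; the list names PRED and DEPTH are illustrative.

```python
def least(i, j, pred, depth):
    """Least common ancestor of vertices i and j, following Algorithm 1."""
    while i != j:
        if depth[i] > depth[j]:
            i = pred[i]                    # i is deeper: back up from i
        elif depth[i] < depth[j]:
            j = pred[j]                    # j is deeper: back up from j
        else:
            i, j = pred[i], pred[j]        # same depth: back up from both
    return i


# pred and depth rows of the table in Example 1, for vertices 1..9.
PRED = [None, 0, 1, 2, 2, 1, 5, 6, 6, 1]
DEPTH = [None, 0, 1, 2, 2, 1, 2, 3, 3, 1]
```

Each pass through the loop moves at least one of the two vertices one level closer to the root, which is why the running time is bounded by a multiple of the tree's height and hence is O(n).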
ftp://dimacs.rutgers.edu/pub/netflow/matching/ (Computer code in C for an 
algorithm for the weighted matching problem; computer code in C, Pascal, and 
Fortran for algorithms for maximum size matchings in nonbipartite networks.) 

ftp://dimacs.rutgers.edu/pub/netflow/maxflow/ (Computer code in Fortran for 
solving maximum flow and minimum cut problems.) 

ftp://dimacs.rutgers.edu/pub/netflow/mincost/ (Computer code in Fortran for 
solving the minimum cost flow problem.) 

ftp://ftp.zib.de/pub/Packages/mathprog/matching/ (Computer code in C for an 
algorithm for the weighted matching problem; computer code in C and Fortran for 
algorithms for maximum size matchings in nonbipartite networks.) 

ftp://ftp.zib.de/pub/Packages/mathprog/mincut/ (Computer code in C for solv- 
ing minimum cut problems.) 

ftp://ftp.zib.de/pub/Packages/mathprog/netopt-bertsekas/ (Computer code in 
Fortran for implementing the label-correcting algorithm and Dijkstra’s algorithm for 
shortest paths; computer code in Fortran for solving maximum flow, minimum cut, 
and minimum cost flow problems.) 

http://dmawww.epfl.ch/~rochat/rochat_data/solomon.html (Data sets and soft- 
ware for vehicle routing problems with time windows.) 

http://orlyl.snu.ac.kr/software/ (Computer code in C, Pascal, and Fortran for 
implementing Dijkstra’s algorithm for shortest paths and the Floyd-Warshall algo- 
rithm; computer code in C, Pascal, and Fortran for solving maximum flow, minimum 
cut, and minimum cost flow problems.) 

http://rtm.science.unitn.it/~battiti/archive/code/rts_qap/ (Computer code 
in Fortran and C for heuristically solving the QAP.) 

http://www.cs.sunysb.edu/~algorith/ (The Stony Brook Algorithm Repository; 
see Sections 1.4 and 1.5 on Graph Problems.) 

http://www.geocities.com/ResearchTriangle/72T9/vrp.html (Data sets, soft- 
ware, and research papers on vehicle routing problems.) 

http://www.ing.unlp.edu.ar/cetad/mos/TSPBIB_home.html (Software, research papers, 
and other heuristic approaches for the traveling salesman and related problems.) 

http://www.iwr.uni-heidelberg.de/iwr/comopt/soft/TSPLIB95/TSPLIB.html (A 
library of sample problems related to the traveling salesman problem, with their best 
known solutions.) 

http://www.mat.uc.pt/~eqvm/cientificos/fortran/codigos.html (Fortran code 
for implementing Kruskal’s algorithm and Prim’s algorithm for minimum spanning 
trees; Fortran code for implementing the label-correcting algorithm and Dijkstra’s 
algorithm for shortest paths.) 

http://www.neci.nj.nec.com/homepages/avg/soft/soft.html (Computer code for 
implementing the label-correcting algorithm and Dijkstra’s algorithm for shortest 
paths; computer code for solving maximum flow, minimum cut, and minimum cost 
flow problems.) 

http://www.netlib.org/toms/479 (Fortran code for implementing Prim’s algorithm.) 

http://www.netlib.org/toms/562 (Fortran code for implementing the label-correct- 
ing algorithm for shortest paths.) 

http://www.netlib.org/toms/608 (Fortran code for heuristically solving the QAP.) 

http://www.netlib.org/toms/613 (Fortran code for implementing Prim’s algorithm.) 

http://www.netlib.org/toms/750 (Fortran code for solving the TSP.) 

http://www.netlib.org/toms/754 (Fortran code for heuristically solving the QAP.) 

http://www.netlib.org/toms/769 (Fortran code for heuristically solving the QAP.) 

http://www.zib.de/Optimization/Software/Mcf/ (Computer code for solving the 
minimum cost flow problem.) 

INTRODUCTION 


Partially ordered sets play important roles in a wide variety of applications, includ- 
ing the design of sorting and searching methods, the scheduling of tasks, the study of 
social choice, and the study of lattices. This chapter covers the basic concepts involv- 
ing partially ordered sets, the various types of partially ordered sets, the fundamental 
properties of these sets, and their important applications. 

A table of notation used in the study of posets is given following the glossary. 


GLOSSARY 

antichain: a subset of a poset in which no two distinct elements are comparable. 

atom: in a poset, an element of height 1. 

atomic lattice: a lattice such that every element is a join of atoms (or equivalently, 
such that the atoms are the only join- irreducible elements). 

auxiliary graph (of a simple graph G): the graph G' whose vertices are the edges 
of G, with vertex e_1 adjacent to vertex e_2 in G' if and only if e_1 and e_2 are adjacent 
edges in G, but do not lie on a 3-cycle in G. 

biorder representation (on a digraph D): a pair of real-valued functions f, g on the 
vertex set V_D such that u → v is an arc if and only if f(u) > g(v). 

bipartite poset: a poset of height at most 2. 

Boolean algebra: the poset whose domain is all subsets of a given set, partially 
ordered by inclusion. 

Borda consensus function (on a set of social choice profiles): the consensus function 
that ranks the alternatives by their Borda count. 

Borda count (of an alternative social choice x): the sum, over all individual rankings, 
of the number of alternatives x “beats” . 

bounded poset: a poset with both a unique minimal element and a unique maximal 
element. 

u,v-bypass (in a directed graph): a u,v-path of length at least two such that there is 
also an arc from u to v. 

Cartesian product (of two posets P = (X, R) and P' = (X', R')): the poset P × P' = 
(X × X', S), such that (x,x')S(y,y') if and only if xRy and x'R'y'. 

chain: a subset of a poset in which every two elements are comparable. 

k-chain: a chain of size k, i.e., a chain on k elements. 

chain-product: the Cartesian product of a collection of chains. 

comparability digraph (of a poset (X, R)): the simple digraph whose vertex set is 
the domain X and which has an arc from x to y if and only if x < y. 

comparability graph (of a poset (X, R)): the simple graph whose vertex set is the 
domain X and which has an edge joining distinct vertices x and y if x < y. 

comparability invariant (for posets): an invariant f such that f(P) = f(Q) whenever 
posets P and Q have the same comparability graph. 

comparable elements (in a poset (X, R)): elements x and y such that either (x, y) ∈ 
R or (y, x) ∈ R. 

consecutive chain (in a ranked poset): a chain whose elements belong to consecutive 
ranks. 

consensus function (on a set of social choice profiles): a function that assigns to 
each possible profile P = { P_i | i ∈ I } on a set of alternatives a linear ordering (ties 
allowed) of those alternatives. 

consensus ranking (on a set of social choice profiles): the linear ordering of the 
alternatives assigned by the consensus function. 

cover graph (of a poset ( X , R)): the graph with vertex set X and edge set consisting 
of the pairs satisfying the cover relation. 

cover relation (of a poset (X, R)): the relation on X consisting of the pairs (x, y) such 
that x > y in R and such that there is no “intermediate” element z with x > z > y. 

cover diagram: a synonym for the Hasse diagram. 

critical pair (in a poset): an ordered incomparable pair that cannot be made compa- 
rable by adding any other single incomparable pair as a relation. 

dependent edge (in an acyclic directed graph): an arc from u to v such that the 
graph contains a u,v-bypass. 

dimension (of a poset): the minimum number of chains in a realizer of the poset. 

distributive lattice: a lattice in which the meet operator distributes over the join 
operator, so that x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z) for all x, y, z. 

divisor lattice: the poset D(n) of divisors of n, in which x ≤ y means that y is an 
integer multiple of x. 

down-set (in a poset): a subposet I such that if x ∈ I and if y < x, then y ∈ I; also 
called an ideal. 

dual (of a poset P = (X, R)): the poset P* = (X, S) such that x ≤ y in S if and only 
if y ≤ x in R. 

extension (of a poset P = (X, R)): a poset Q = (X, S) such that R ⊆ S, meaning 
that xRy implies xSy. 

k-family (in a poset): a subposet containing no chain of size k + 1. 

Ferrers digraph: a digraph having a biorder representation. 

filter (generated by an element x in a poset P): the up-set U[x] = { y ∈ P | y ≥ x }. 

filter (generated by a subset in a poset P): given a subset A of P, the up-set U[A] = 
∪_{x∈A} U[x]. 

filter (in a poset): a subposet whose domain is the set-theoretic complement of the 
domain of an ideal. 

forbidden subposet description (of a class of posets): a characterization of that 
class as the class of all posets that do not contain any of the posets in a specified 
collection. 

geometric lattice: an atomic, upper semimodular lattice of finite height. 

greatest lower bound (of elements x and y in a poset): a common lower bound z 
such that every other common lower bound z' satisfies the inequality z ≥ z'. Such 
an element, if it exists, is denoted x ∧ y. 

graded poset: a poset in which all maximal chains have the same length. 

Hasse diagram (of a poset): a straight-line drawing of the cover graph in the plane 
so that the lesser element of each adjacent pair is below the greater. 

height (of a poset): the maximum size of a chain in that poset. 

height (of an element x of a poset): the maximum length h(x) of a chain that has x 
as its maximal element. 

ideal (generated by an element x in a poset P): the down-set D[x] = { y ∈ P | y ≤ x }. 

ideal (generated by a subset A in a poset P): the down-set D[A] = ∪_{x∈A} D[x]. 

ideal (in a poset): a subposet I such that if x ∈ I and if y < x, then y ∈ I. 

incomparability graph (of a poset P): the edge-complement of the comparability 
graph G(P). 

incomparable pair (in a poset (X, R)): a pair x, y ∈ X such that neither x ≤ y nor 
y ≤ x in R. 

integer partition: a nonincreasing nonnegative integer sequence having finitely many 
nonzero terms, with trailing zeros added as needed for comparison. 

intersecting family: a collection of subsets of a set such that every pair of members 
has nonempty intersection. 

intersection (of partial orderings P = (X, R) and Q = (X, S) on the set X): the 
poset (X, R ∩ S) that includes the comparisons in both. 

intersection (of posets (X, R) and (X, S)): the poset (X, R ∩ S). 

interval (in a poset): the subposet which contains all elements z such that x ≤ z ≤ y. 

interval order : a poset in which there is an assignment to its members of real intervals, 
such that x < y if and only if the interval for y is totally to the right of the interval 
for x. 

interval representation (of a poset P): a collection of real intervals corresponding 
to an interval order for P. 

isomorphic (posets): posets P = (X, R) and Q = (Y, S) such that there is a poset 
isomorphism P → Q. 

isomorphism (of lattices): an order-preserving bijection from one lattice to another 
that also preserves greatest lower bounds and least upper bounds of pairs. 

isomorphism (of posets): a bijection from one poset to another that preserves the 
order relation. 

join: given {x, y}, another name for the least upper bound x ∨ y. 

join-irreducible element (in a lattice): a nonzero element that cannot be expressed 
as the join of two other elements. 

Jordan-Dedekind chain condition: the condition for a poset that every interval has 
finite length. 

lattice : a poset in which every pair of elements has both a greatest lower bound and 
a least upper bound. 

lattice (of bounded sequences): the set L(m, n) of length-m real sequences a_1, ..., a_m 
such that 0 ≤ a_1 ≤ ... ≤ a_m ≤ n. 

lattice (of order ideals in a poset P = (X, R)): the set J(P) of order ideals of P, 
ordered by inclusion. 

least upper bound (of elements x and y in a poset): a common upper bound z such 
that every other common upper bound z' satisfies the inequality z' ≥ z. Such an 
element, if it exists, is denoted x ∨ y. 

length (of a chain): the number of cover relations in the chain; in other words, one 
less than the number of elements in the chain. 

length (of a poset): the length of a longest chain, which is one less than the height of 
that poset. (Sometimes height is used synonymously with length.) 

lexicographic ordering (of the Cartesian product of posets): the ordering for the 
Cartesian product of the domains in which (x_1, x_2) ≤ (y_1, y_2) if and only if x_1 < y_1 
or x_1 = y_1 and x_2 ≤ y_2; this is not the usual ordering of the Cartesian product of 
posets. 

linear extension (of a poset): an extension of the poset that is a chain. 

linear order: See total order. 

linearly ordered set: a poset in which every pair of elements is comparable. 

linear sum (of two disjoint posets P and P'): the poset in which all the elements of 
poset P lie “below” all those of poset P'. 

locally finite poset: a poset in which every interval is finite. 

lower bound (of elements x and y in a poset): an element z such that x ≥ z and 
y ≥ z. 

lower semimodular lattice: a lattice whose dual is upper semimodular. 

majority rule property (for a consensus function): the property that it prefers x 
to y if and only if a majority of the individuals prefer x to y. 

maximal element (in a poset): an element such that no other element is greater. 

meet (of elements x and y): another name for the greatest lower bound x ∧ y. 

meet-irreducible element (of a lattice): a nonzero element that cannot be expressed 
as the meet of two other elements. 

minimal element (in a poset): an element such that no other element is less. 

minimum realizer encoding (of a poset): a poset that lists for each element its 
position on each extension in a minimum realizer. 

modular lattice: a lattice in which x ∧ (y ∨ z) = (x ∧ y) ∨ z for all x, y, z such that 
z ≤ x. 

module (in a graph G): a vertex subset U ⊆ V_G such that each vertex outside U is 
adjacent to all or none of the vertices in U. 

k-norm (of a sequence a = {a_i}): the sum Σ_i min{k, a_i}, whose value is commonly 
denoted m_k(a). 

k-norm of a chain partition: the k-norm of its sequence of chain sizes. 

normalized matching property (for a graded poset): the property that for every 
rank k and every subset A of rank P_k, the set A* of elements in the rank P_{k+1} that 
are comparable to at least one element of A satisfies the inequality 
|A*|/|P_{k+1}| ≥ |A|/|P_k|. 

order module in a poset: a set S of elements such that every element outside S is 
above all of S, below all of S, or incomparable to all of S. 

order-preserving mapping (from poset P = (X, R) to poset Q = (Y, S)): a function 
f: X → Y such that f(x) ≤ f(y) whenever x ≤ y in P. 

order relation (on a set X): a relation R such that (X, R) is a partially ordered set. 

partially ordered set : a pair P = (X, R) consisting of a set X and a relation R that 
is reflexive, antisymmetric, and transitive. 

partition lattice: the poset Π_n of partitions of the set [n] = {1, ..., n}, where π ≤ σ 
if π is a refinement of σ. 

permutation graph: a graph whose vertices can be placed in 1-1 correspondence with 
the elements of a permutation of [n] = {1, ..., n} such that v_i is adjacent to v_j if 
and only if the larger of {i, j} comes first in the permutation. 

planar poset: a poset with a Hasse diagram that has no edge-crossings. 

plurality consensus function (on a set of social choice profiles): the consensus func- 
tion in which the winner(s) is(are) the alternative(s) appearing in the greatest num- 
ber of top ranks, after which the winner(s) is(are) deleted and the procedure is 
repeated to select the next rank of the consensus ranking, etc. 

poset: a partially ordered set. 

profile (on a set of alternative social choices): a set P = { P_i | i ∈ I } of linear rankings 
(ties allowed) of the alternatives, one for each member of a set I of “individuals” 
participating in the decision process. 

quasi-transitive orientation (on a simple graph G): an assignment of directions to 
the edges of G so that whenever there is an xy-arc and a yz-arc, there is also an arc 
between x and z. 

rank (of a graded poset): the length of any maximal chain in the poset. 

rank function (on a poset): an integer-valued function r on the elements of the poset 
so that “y covers x” implies that r(y) = r(x) + 1. 

ranked poset: a poset having a rank function. 

kth rank of a ranked poset: the subset P_k of elements for which r(x) = k. 

rank parameters (of a subset F of elements in a ranked poset P): the numbers 
f_k = |F ∩ P_k|. 

ranking: a poset P whose elements are partitioned into ranks P_1, ..., P_k such that 
two elements are incomparable in the poset if and only if they belong to the same 
rank. 

realizer (of a poset P): a set of linear extensions of P whose intersection is P. 

refinement (of a set partition σ): replacement of each block B ∈ σ by some partition 
of B. 

regular covering (of a poset by chains): a multiset of maximal chains such that for 
each element x the fraction of the chains containing x is 1/N_{r(x)}, where N_{r(x)} is a 
Whitney number. 

self-dual poset: a poset isomorphic to its dual. 

semimodular lattice: an upper semimodular lattice. 

semiorder: a poset on which there is a real-valued function f and a real number δ > 0 
such that x < y if and only if f(y) - f(x) > δ. 

shadow (of a family of sets F): the collection of sets containing every set that is 
obtainable by selecting a set in F and deleting one of its elements. 

size (of a finite poset): the number of elements. 

Sperner property (for a graded poset): the property that some single rank is a 
maximum antichain. 

k-Sperner property (for a graded poset): the property that the poset has a maximum 
k-family consisting of k ranks. 

standard k-chain: the poset {1, ..., k}, under the usual ordering of the integers, 
written k. 

standard example of an n-dimensional poset: the subposet S_n of the Boolean 
algebra 2^n induced by the singletons and their complements. 

strict Sperner property: the property of a graded poset that all maximum antichains 
are single ranks. 

strong Sperner property: the property that a graded poset is k-Sperner for all 
k ≤ r(P). 

Steinitz exchange axiom: for a closure operator σ: 2^E → 2^E, the rule that p ∉ σ(A) 
and p ∈ σ(A ∪ q) imply q ∈ σ(A ∪ p). 

sublattice (of a lattice): a subposet that contains the meet and join of every pair of 
its elements. 

submodular height function (in a lattice): a height function h such that h(x ∧ y) + 
h(x ∨ y) ≤ h(x) + h(y) for all x, y. 

subposet (of a poset (X, R)): a poset (Y, S) such that Y ⊆ X and S = R ∩ (Y × Y). 

subset lattice: the Boolean algebra 2^n, that is, the Cartesian product of n copies of 
the standard 2-chain. 

subspace lattice: the set L_n(q) of subspaces of an n-dimensional vector space over a 
q-element field, partially ordered by set inclusion. 

symmetric chain (in a ranked poset P): a chain that has an element of rank r(P) — k 
whenever it has an element of rank k. 

symmetric chain decomposition (of a ranked poset): a partition of that poset into 
symmetric consecutive chains. 

symmetric chain order: a poset with a symmetric chain decomposition. 

topological ordering (of an acyclic digraph): a linear extension of the poset it rep- 
resents. 

topological sort: an algorithm that arranges the elements of a partially ordered set 
into a total ordering that is compatible with the original partial ordering. 

total order (of a set): an order relation in which each pair of distinct elements is 
comparable. 

transitive orientation (on a simple graph): an assignment of directions to the edges 
of a simple graph G so that whenever there is an xy-arc and a yz-arc, there is also 
an xz-arc. 

triangular chord (for a walk x_1, ..., x_k in an undirected graph): an edge between 
vertices x_{i-1} and x_{i+1}, two apart on the walk. 

upper bound (of elements x and y in a poset): an element z such that x ≤ z and 
y ≤ z. 

upper semimodular lattice: a lattice in which whenever x covers x ∧ y, it is also 
true that x ∨ y covers y. 

up-set (in a poset): a filter. 

weak order: a ranking, i.e., a poset P whose elements are partitioned into ranks 
P_1, ..., P_k such that two elements are incomparable if and only if they belong to the 
same rank. 

kth Whitney number (of a ranked poset P): the cardinality |P_k| of the kth rank; 
written N_k(P). 

width (of a poset): the maximum size of an antichain in the poset. 

Young lattice : the lattice of integer partitions under component-wise ordering. 


poset notation 

notation     meaning 
y ≥ x        x ≤ y 
x < y        x ≤ y and x ≠ y 
x ∥ y        x ≰ y and y ≰ x 
0            minimal element in a bounded poset 
1            maximal element in a bounded poset 
[x, y]       the interval { z | x ≤ z ≤ y } 
k            standard k-chain 
P_1 + P_2    disjoint union of posets 
P_1 ⊕ P_2    linear sum of two posets 
P_1 × P_2    Cartesian product of two posets 
P^n          iterated Cartesian product of copies of P 
P*           dual of poset P 
D(n)         divisibility poset of the integer n 
r(P)         rank of a graded poset P 
N_k(P)       kth Whitney number (= cardinality of kth rank) of P 
w(P)         width of P (= maximum size of an antichain) 
D[x]         down-set (ideal) { y | y ≤ x } 
D(x)         down-set (ideal) { y | y < x } 
U[x]         up-set (filter) { y | y ≥ x } 
U(x)         up-set (filter) { y | y > x } 
x ∨ y        lub of x and y 
x ∧ y        glb of x and y 


11.1 BASIC POSET CONCEPTS 


11.1.1 COMPARABILITY 

The integers and real numbers are totally ordered sets, since every pair of distinct 
elements can be compared. In a partially ordered set, some pairs of elements may be 
incomparable. For example, under the containment relation, the sets {1,2} and {1,3} 
are incomparable. 


© 2000 by CRC Press LLC 




Definitions: 


A partial ordering (or order relation) R on a set X is a binary relation that is: 

• reflexive: for all x ∈ X, xRx; 

• antisymmetric: for all x, y ∈ X, if xRy and yRx, then x = y; 

• transitive: for all x, y, z ∈ X, if xRy and yRz, then xRz. 

Note: x < y or x <p y are often written in place of xRy or (x, y ) £ R. Also, y > x 
means x < y. The notation A is sometimes used in place of <. See the table following 
the Glossary for further poset notation. 

A partially ordered set (or poset ) P = (A, R) is a pair consisting of a set A, called 
the domain , and a partial ordering R on A. Writing x £ P means that x £ X. The 
notation (A, <) is often used instead of (A, R) to designate a poset. 

The size of a finite poset P is the number of elements in the domain. 

A totally ordered set (or linearly ordered set) is a poset in which every element 
is comparable to every other element. 

The elements x and y are comparable (related) in P if either x ≤ y or y ≤ x (or
both, in which case x = y).

The elements x and y are incomparable (unrelated) if they are not comparable.
Writing x ∥ y indicates incomparability.

Element x is less than element y, written x < y, if x ≤ y and x ≠ y. (The notation ≺
is sometimes used in place of <.)

Element x is greater than element y, written x > y, if x ≥ y and x ≠ y.

An element x of a poset is minimal if the poset has no element less than x.

An element x of a poset is maximal if the poset has no element greater than x.

A poset is bounded if it has both a unique minimal element (denoted “0”) and a unique 
maximal element (denoted “1”). 

The comparability digraph D(P) of a poset P = (A, R) is the digraph with vertex
set A, such that there is an arc from x to y if and only if x < y.

The comparability graph G(P) of a poset P = (A, R) is the simple graph with vertex
set A, such that xy ∈ E_G if and only if x and y are comparable in P, where x ≠ y.

The incomparability graph of a poset P is the edge-complement of the comparability 
graph G(P). 

The induced poset of an acyclic digraph D is the poset whose elements are the vertices 
of D and such that x < y if and only if there is a directed path from x to y. 

The element y covers the element x in a poset if x < y and there is no intermediate 
element z such that x < z < y. 

The cover graph of poset P is the graph with vertex set A such that x and y are 
adjacent if and only if one of them covers the other in P. 

A Hasse diagram (or cover diagram or diagram ) of poset P is a straight-line draw- 
ing of the cover graph in the plane such that the lesser element of each pair satisfying 
the cover relation is lower in the drawing. 

A poset is planar if it has a Hasse diagram without edge-crossings. 

A subposet of P = (X, ≤) is a subset Y ⊆ X with the relation x ≤ y in Y if and only
if x ≤ y in X.

The interval [x, y] in poset P is the subposet that contains all elements z such that
x ≤ z ≤ y.

A poset P is locally finite if every interval in P has finitely many elements. 

An order-preserving mapping from poset P = (X, ≤_P) to poset Q = (Y, ≤_Q) is a
function f: X → Y such that x ≤_P x' implies f(x) ≤_Q f(x').

An isomorphism of posets P = (X, R) and Q = (Y, S) is a bijection f: X → Y that
preserves the order relation in both directions: x_1 ≤_P x_2 if and only if
f(x_1) ≤_Q f(x_2).

Isomorphic posets are posets P = (X, R) and Q = (Y, S) such that there is a poset
isomorphism P → Q. This is sometimes indicated informally by writing P = Q.

Poset Q = (Y, S) is contained in (or imbeds in) poset P = (X, R) if Q is isomorphic
to a subposet of P. 

A poset P is Q-free if P does not contain a poset isomorphic to Q. 
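
The definitions above translate directly into checks on a finite relation. The following Python sketch is illustrative only: the helper names (`is_partial_order`, `comparable`, `minimal_elements`, `maximal_elements`) are hypothetical, and the brute-force loops are intended for small examples such as the containment order mentioned at the start of this section.

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry, and transitivity of leq on elements."""
    elems = list(elements)
    reflexive = all(leq(x, x) for x in elems)
    antisymmetric = all(not (leq(x, y) and leq(y, x)) or x == y
                        for x, y in product(elems, repeat=2))
    transitive = all(not (leq(x, y) and leq(y, z)) or leq(x, z)
                     for x, y, z in product(elems, repeat=3))
    return reflexive and antisymmetric and transitive

def comparable(x, y, leq):
    return leq(x, y) or leq(y, x)

def minimal_elements(elements, leq):
    elems = list(elements)
    return [x for x in elems if not any(leq(y, x) and y != x for y in elems)]

def maximal_elements(elements, leq):
    elems = list(elements)
    return [x for x in elems if not any(leq(x, y) and y != x for y in elems)]

def contains(a, b):
    """Containment order: a <= b for frozensets is the subset test."""
    return a <= b

# Some subsets of {1, 2, 3}, ordered by containment.
subsets = [frozenset(s) for s in ([1], [1, 2], [1, 3], [1, 2, 3])]
print(is_partial_order(subsets, contains))
print(comparable(frozenset([1, 2]), frozenset([1, 3]), contains))
```

The second call reports that {1,2} and {1,3} are incomparable, matching the example in the introduction to this section.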


Facts: 

1. Every finite nonempty poset has a minimal element and a maximal element. 

2. The comparability digraph D(P) of a poset P is an acyclic digraph. 

3. The minimal elements of a poset P induced by a digraph D are the sources of D;
that is, they are the vertices at which every arc points outward.

4. The maximal elements of a poset P induced by a digraph D are the sinks of D; that
is, they are the vertices at which every arc points inward.

5. The element y covers the element x in a poset P induced by a digraph D if and only 
if there is an arc in digraph D from x to y and there is no other directed path from x 
to y. 

6. Suppose that the poset P is induced from an acyclic digraph D. Then the compa- 
rability digraph of P is the transitive closure of D. 

7. Two different posets cannot have the same Hasse diagram, but they may have the 
same cover graph or the same comparability graph. 

8. There is a polynomial-time algorithm to check whether a graph G is a comparability 
graph, but the problem of deciding whether there exists a poset for which G is the cover 
graph is NP-complete. 
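
Facts 3 through 6 can be checked on a small acyclic digraph. The sketch below is an assumption-laden illustration (the digraph `D`, its vertex names, and the function names are made up for the example): it builds the strict order of the induced poset as the transitive closure of the arcs and then extracts the cover pairs.

```python
def transitive_closure(vertices, arcs):
    """Return the set of pairs (x, y), x != y, joined by a directed path.

    Assumes the digraph is acyclic.
    """
    succ = {v: set() for v in vertices}
    for x, y in arcs:
        succ[x].add(y)
    closure = set()
    for start in vertices:
        stack, seen = list(succ[start]), set()
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(succ[v])
        closure.update((start, v) for v in seen)
    return closure

def cover_pairs(vertices, less):
    """Pairs (x, y) with x < y and no z strictly between them."""
    return {(x, y) for (x, y) in less
            if not any((x, z) in less and (z, y) in less for z in vertices)}

# A hypothetical acyclic digraph; the arc a -> c is implied by a -> b -> c.
V = ["a", "b", "c", "d"]
D = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d")]
less = transitive_closure(V, D)      # comparability digraph of the induced poset
covers = cover_pairs(V, less)        # edges of the Hasse diagram
sources = [v for v in V if all(y != v for _, y in D)]   # minimal elements
sinks = [v for v in V if all(x != v for x, _ in D)]     # maximal elements
```

Here the arc from a to c is not a cover pair, because the path through b is another directed path from a to c.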


Examples: 

1. Any collection of subsets of the same set forms a poset when the subsets are partially 
ordered by the usual inclusion relation X ⊆ Y.

2. Boolean algebra: The Boolean algebra on a set X is the poset consisting of all the 
subsets of X, ordered by inclusion. 

3. The Boolean algebra on the set {a, b, c} has the following Hasse diagram. The only
maximal element is {a, b, c}. The only minimal element is ∅.

[Figure: Hasse diagram of the Boolean algebra on {a, b, c}]



4. There are five different isomorphism types of posets of size three, whose Hasse 
diagrams are as follows. 

[Figure: Hasse diagrams of the five posets of size three]



5. There are 16 different isomorphism types of posets of size four, whose Hasse diagrams 
are as follows. 



[Figure: Hasse diagrams of the 16 posets of size four]


6. Divisibility poset: The divisibility poset on the set I of positive integers,
denoted D(I), has the relation x ≤ y if y is an integer multiple of x. A number y
covers a number x if and only if the quotient y/x is prime.

7. The set D(n) of divisors of n forms a subposet of D(I), for any positive integer n.
The set D(n) is identical to the interval [1, n] in D(I). For instance, the following figure
is the Hasse diagram of D(24) = [1, 24].



8. The interval [3,30] in D(I) has domain {3,6,15,30}. The interval [2,24] has
domain {2,4,6,8,12,24}.

9. The poset D(I) is infinite, but locally finite.

10. The Boolean algebra of all subsets of an infinite set is not a finite poset. Nor is it
locally finite, since each interval from a finite set to an infinite superset is infinite.

11. The poset D(6) is isomorphic to the poset of subsets of {a, b}:

12. To generalize Example 11, if p_1, ..., p_n are distinct primes, then the divisibility
poset D(p_1 ⋯ p_n) is isomorphic to the poset of subsets of a set of n objects.

13. The partitions of a set form a poset under refinement, as illustrated for the set
{a, b, c, d}. A notation such as ab-c-d means a partition of the set {a, b, c, d} into the
subsets {a, b}, {c}, {d}.

[Figure: Hasse diagram of the partitions of {a, b, c, d} under refinement]



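14. The 6-cycle is the comparability graph of exactly one (isomorphism type of) poset,
which has the following Hasse diagram:

[Figure: Hasse diagram of the poset whose comparability graph is the 6-cycle]

15. The 6-cycle is the cover graph of seven posets, all of which are planar. They have
the following Hasse diagrams:

[Figure: Hasse diagrams of the seven posets whose cover graph is the 6-cycle]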
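
The divisibility examples above are easy to reproduce by direct computation. This is a minimal sketch with hypothetical helper names (`divisors`, `interval`, `covers_in`); it works on finite pieces of D(I) only and uses trial division, which is adequate for small integers.

```python
def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def interval(x, y):
    """The interval [x, y] in D(I): all z with x | z and z | y."""
    return [z for z in range(x, y + 1) if z % x == 0 and y % z == 0]

def is_prime(m):
    return m >= 2 and all(m % p for p in range(2, int(m ** 0.5) + 1))

def covers_in(elems):
    """Cover pairs (x, y) of the divisibility order restricted to elems."""
    def lt(a, b):
        return a != b and b % a == 0
    return [(x, y) for x in elems for y in elems if lt(x, y)
            and not any(lt(x, z) and lt(z, y) for z in elems)]

print(interval(2, 24))            # the interval [2, 24] of Example 8
print(covers_in(divisors(6)))     # the four covers of D(6), a copy of 2 x 2
```

In D(n) every cover pair (x, y) has a prime quotient y/x, as Example 6 states; this can be confirmed with `all(is_prime(y // x) for x, y in covers_in(divisors(n)))`.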

11.1.2 CHAINS, ANTICHAINS, POSET OPERATIONS


Definitions: 

A chain is a subset S of mutually comparable elements of a poset P, or sometimes the 
subposet of P formed by such a subset. 

The length of a finite chain C is |C| − 1, i.e., the number of edges in the Hasse diagram
of that chain, regarded as a poset.

A k-chain is a chain of size k, i.e., a chain on k elements. 

The standard k-chain, denoted by a boldface k, is a fixed k-chain, presumed to be
disjoint from other objects in the universe of discourse.

The height of a poset P is the maximum size of a chain in P. 

The height of an element x in a poset P is the maximum length h(x) of a chain in P 
that has x as its maximal element. 

A bipartite poset is a poset of height at most 2. 

A chain-product (or grid) is the Cartesian product of a collection of chains. 

An antichain (or clutter or Sperner family) is a subset S of pairwise incomparable 
elements of a poset P, or sometimes the subposet of P formed by such a subset. 

A chain or antichain is maximal if it is contained in no other chain or antichain. 

A chain or antichain in a finite poset is a maximum chain or antichain if it is one of 
maximum size. 

The disjoint union of two posets P = (X, R) and P' = (X', R') with X ∩ X' = ∅ is
the poset (X ∪ X', R ∪ R'), denoted P + P'.

The linear sum of two posets P = (X, R) and P' = (X', R') with X ∩ X' = ∅ is the
poset (X ∪ X', R ∪ R' ∪ (X × X')), denoted P ⊕ P'. (This puts all of poset P “below”
poset P'.)

The Cartesian product P × P' (or direct product or product) of two posets P =
(X, R) and P' = (X', R') is the poset (X × X', S), such that (x, x') S (y, y') if and only
if xRy and x'R'y'.

The iterated Cartesian product of n copies of a poset P = (X, ≤), written P^n, is the
set of n-tuples of elements of P, ordered so that (x_1, ..., x_n) ≤ (y_1, ..., y_n) if and
only if x_j ≤ y_j for j = 1, ..., n.

The lexicographic ordering of the Cartesian product P_1 × P_2 of the domains of two
posets is the partial ordering in which (x_1, x_2) ≤ (y_1, y_2) if and only if x_1 < y_1, or
x_1 = y_1 and x_2 ≤ y_2.

The dual of a poset P, denoted P*, is the poset on the elements of P defined by the
relation y ≤_{P*} x if and only if x ≤_P y.

A self-dual poset is a poset that is isomorphic to its dual. 
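
The poset operations defined above can be written out for finite posets stored as a pair (list of elements, set of related pairs). The representation and the function names below are assumptions made for this sketch; the `tag` argument of `chain` exists only to keep domains disjoint, as the definitions of disjoint union and linear sum require.

```python
from itertools import product

def chain(k, tag=""):
    """A k-chain on the elements (tag, 0), ..., (tag, k-1)."""
    X = [(tag, i) for i in range(k)]
    return X, {(x, y) for x in X for y in X if x[1] <= y[1]}

def disjoint_union(P, Q):
    (X, R), (Y, S) = P, Q
    return X + Y, R | S

def linear_sum(P, Q):
    """Every element of P is placed below every element of Q."""
    (X, R), (Y, S) = P, Q
    return X + Y, R | S | set(product(X, Y))

def cartesian_product(P, Q):
    (X, R), (Y, S) = P, Q
    Z = list(product(X, Y))
    return Z, {(a, b) for a in Z for b in Z
               if (a[0], b[0]) in R and (a[1], b[1]) in S}

def dual(P):
    X, R = P
    return X, {(y, x) for (x, y) in R}

one_p, one_q = chain(1, "p"), chain(1, "q")
two_chain = linear_sum(one_p, one_q)        # 1 (+) 1, a 2-chain
antichain = disjoint_union(one_p, one_q)    # 1 + 1, a 2-element antichain
square = cartesian_product(chain(2, "p"), chain(2, "q"))   # 2 x 2
```

Storing the full relation, including the reflexive pairs, keeps each operation a one-line set expression at the cost of quadratic space.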

Facts: 

1. Every k-chain is isomorphic to the linear sum 1 ⊕ 1 ⊕ ⋯ ⊕ 1 of k copies of 1.

2. Every antichain of size k is isomorphic to the disjoint union 1 + 1 + ⋯ + 1 of k copies
of 1.

3. The chains are characterizable as the class of (1 + 1)-free posets.

4. The cover graph of a chain is a path.

5. The comparability graph of a chain of size n is the complete graph K_n.

6. The antichains are the class of 2-free posets.

7. The comparability graph of an antichain has no edges.

8. The maximum size of a chain in a finite poset P equals the minimum number of 
antichains needed to cover the elements of P, that is, the minimum number of antichains 
whose union equals the domain of poset P. 

9. The bipartite posets are precisely the 3-free posets.

10. The bipartite posets are the posets whose comparability graph and cover graph
are the same.

11. Every maximal chain of a finite poset P extends from a minimal element of P
to a maximal element of P, and successive pairs on a maximal chain satisfy the cover
relation of P.

12. The Cartesian product of two posets is a poset.

13. A poset and its dual have the same comparability graph and the same cover graph.

14. The Hasse diagram of the dual of a poset P can be obtained from the Hasse diagram
of P either by reflecting through the horizontal axis or by rotating 180 degrees.

15. The set of order-preserving maps from a poset P to a poset Q forms a poset,
denoted by Q^P, under “coordinate-wise ordering”: f ≤ g in Q^P if and only if f(x) ≤_Q
g(x) for all x ∈ P.
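
Fact 8 has a constructive reading: grouping the elements by height gives a cover by antichains whose number equals the maximum chain size. The sketch below assumes a finite poset given by a strict comparison function; `antichain_cover` is a hypothetical name, and the example poset D(24) is chosen only for illustration.

```python
from functools import lru_cache

def antichain_cover(elements, less):
    """Partition elements into antichains by height.

    The height of x is the length of a longest chain having x as its
    maximal element, so minimal elements have height 0.
    """
    elems = list(elements)

    @lru_cache(maxsize=None)
    def height(x):
        below = [height(y) for y in elems if less(y, x)]
        return 1 + max(below, default=-1)

    levels = {}
    for x in elems:
        levels.setdefault(height(x), []).append(x)
    return [levels[h] for h in sorted(levels)]

def div(a, b):
    """Strict divisibility order: a < b when a properly divides b."""
    return a != b and b % a == 0

D24 = [1, 2, 3, 4, 6, 8, 12, 24]
levels = antichain_cover(D24, div)
print(levels)
```

Two elements of the same height cannot be comparable, so each level is an antichain; a longest chain meets every level exactly once, so no cover by fewer antichains exists.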


Examples: 

1. The following figure shows: (A) a 3-chain, (B) an antichain of width 4, and (C) a 
bipartite poset. 


[Figure: (A) a 3-chain, (B) an antichain of width 4, (C) a bipartite poset]

2. The poset 2^3 is not planar, even though it has a planar cover graph. However,
deleting its minimal element or maximal element leaves a planar subposet.

3. The cover graph of the poset 2^n is isomorphic to the n-dimensional cube, whose
vertices are the bitstrings of length n, with bitstrings adjacent if they differ in one
position. Each bit encodes the possible presence of an element of the set of which B_n
is the Boolean algebra (§5.8.1).

4. The interval in the Boolean algebra 2^n between an element of rank k and an element
of rank l > k is isomorphic to the poset 2^{l−k}.

5. Every maximal chain in the Boolean algebra 2^n has size n + 1 and length n, and
there are n! such chains. There are maximal antichains as small as 1 element.

6. In general, the poset D(n) is isomorphic to a chain product, one factor for each prime
divisor of n. The elements of D(n) can be encoded as integer vectors { (a_1, ..., a_m) |
0 ≤ a_i ≤ e_i }, where n is a product of m distinct primes with powers e_1, ..., e_m, and
a ≤ b if and only if a_i ≤ b_i for all i.

7. The Hasse diagrams for two possible partial orderings on the Cartesian product of
the domains of two posets are shown in the following figure:

[Figure: Hasse diagrams of P × Q (product order) and P × Q, lex (lexicographic order)]

8. The two posets M_5 = 1 ⊕ (1 + 1 + 1) ⊕ 1 and N_5 = 1 ⊕ (2 + 1) ⊕ 1 are used in §11.1.4
in a forbidden subposet description.
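
The chain-product encoding of D(n) in Example 6 can be checked numerically. The following sketch is illustrative: the function names and the choice n = 360 are assumptions, and factorization is by trial division.

```python
def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def exponent_vector(d, primes):
    """Exponents of the given primes in d, as a tuple."""
    vec = []
    for p in primes:
        e = 0
        while d % p == 0:
            d //= p
            e += 1
        vec.append(e)
    return tuple(vec)

n = 360                                   # 2^3 * 3^2 * 5
primes = sorted(factorize(n))
divs = [d for d in range(1, n + 1) if n % d == 0]
vec = {d: exponent_vector(d, primes) for d in divs}
```

Divisibility between two divisors then coincides with the component-wise order on their exponent vectors, which is the product order on a product of chains of sizes e_i + 1.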



11.1.3 RANK, IDEALS, AND FILTERS

Definitions:

A graded poset is a poset in which all maximal chains have the same length. 

The rank r(P) of a graded poset P is the length of any maximal chain. 

A rank function r on a poset P is an assignment of integers to the elements so that 
the relation y covers x implies that r(y) = r(x) + 1. 

A ranked poset is a poset having a rank function. 

The kth rank of a ranked poset P is the subset P_k of elements for which r(x) = k.

The kth rank parameter of a subset of elements F in a ranked poset P is the
cardinality |F ∩ P_k| of the set of elements of F in the kth rank of P.

The kth Whitney number N_k(P) of a ranked poset P is the cardinality |P_k| of the kth
rank.

The length of a poset P is the length of a longest chain in P, which is one less than
the height of P. Note: Sometimes “height” is used synonymously with length.

The Jordan-Dedekind chain condition for a poset is that every interval has finite 
length. 

The width of a poset P , denoted w(P ), is the maximum size of an antichain in P. 

An ideal (or down-set, order ideal, or hereditary family) in a poset P is a
subposet I such that if x ∈ I and y ≤ x, then y ∈ I.

A filter (or up-set or dual ideal) in a poset P is a subposet F whose domain is the
set-theoretic complement of the domain of an ideal.

The ideal generated by an element x in a poset P is the down-set D[x] = { y ∈ P |
y ≤ x }. The related notation D(x) means the down-set { y ∈ P | y < x }.

The ideal generated by a subset A in a poset P is the down-set D[A] = ⋃_{x∈A} D[x].
The related notation D(A) means the down-set ⋃_{x∈A} D(x).

The filter generated by an element x in a poset P is the up-set U[x] = { y ∈ P |
y ≥ x }. The related notation U(x) means the up-set { y ∈ P | y > x }.

The filter generated by a subset A in a poset P is the up-set U[A] = ⋃_{x∈A} U[x].
The related notation U(A) means the up-set ⋃_{x∈A} U(x).

A forbidden subposet description of a class of posets is a characterization of that
class as the class of all posets that do not contain any of the posets in a specified
collection. (This generalizes the concept of Q-free.)

Facts: 

1. A graded poset has a rank function, in which the rank of each element is defined to 
be its height. 

2. If posets P_1, P_2 have rank functions r_1, r_2, then the Cartesian product P = P_1 × P_2
is ranked, so that the element x = (x_1, x_2) has rank r(x) = r_1(x_1) + r_2(x_2).

3. In a Cartesian product of finite ranked posets P_1 and P_2, the Whitney numbers for
the Cartesian product P = P_1 × P_2 satisfy the equation
$N_k(P) = \sum_i N_i(P_1)\, N_{k-i}(P_2)$.

4. The Boolean algebra on a set X of cardinality n is isomorphic to 2^n, the Cartesian
product of n copies of 2. This poset isomorphism type is often denoted B_n.

5. The Boolean algebra on a set X of cardinality n is a graded poset, with rank function
r(S) = |S|, and with Whitney numbers $N_k(2^n) = \binom{n}{k}$.

6. The sequence of Whitney numbers on the Boolean algebra on a set X of cardinality n
is symmetric, since $\binom{n}{k} = \binom{n}{n-k}$. It is also unimodal, since the sequence rises
monotonically to the maximum and then falls monotonically.

7. Sperner’s theorem: The only maximum antichains in the Boolean algebra 2^n are
the middle ranks (one such rank if n is even, two if n is odd). Thus the width of 2^n is
$\binom{n}{\lfloor n/2 \rfloor}$.

8. The maximal elements of an ideal form an antichain, as do the minimal elements of 
a dual ideal; these yield natural bijections between the set of antichains in a poset P 
and the sets of ideals or dual ideals of P. 

9. The divisibility poset D(I) on the integers satisfies the Jordan-Dedekind chain con- 
dition. 
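
Facts 3, 5, and 7 can be verified for small n. In the sketch below, `whitney_product` implements the convolution of Fact 3 with ranks indexed from 0, and `width_boolean` is a deliberately naive exhaustive search that is feasible only for very small n; both names are hypothetical.

```python
from itertools import combinations
from math import comb

def whitney_product(N1, N2):
    """Whitney numbers of P1 x P2 from those of P1 and P2 (a convolution)."""
    out = [0] * (len(N1) + len(N2) - 1)
    for i, a in enumerate(N1):
        for j, b in enumerate(N2):
            out[i + j] += a * b
    return out

def whitney_boolean(n):
    """Whitney numbers of 2^n, built as an n-fold product of 2-chains."""
    N = [1]
    for _ in range(n):
        N = whitney_product(N, [1, 1])   # a 2-chain has Whitney numbers 1, 1
    return N

def width_boolean(n):
    """Brute-force width of 2^n by testing every family of subsets."""
    sets = [frozenset(c) for k in range(n + 1)
            for c in combinations(range(n), k)]
    best = 0
    for mask in range(1 << len(sets)):
        fam = [s for i, s in enumerate(sets) if mask >> i & 1]
        if len(fam) > best and all(not (a < b or b < a)
                                   for a, b in combinations(fam, 2)):
            best = len(fam)
    return best

print(whitney_boolean(5))
print(width_boolean(3), comb(3, 3 // 2))
```

The exhaustive search examines 2^(2^n) families, so it serves only as an independent check of Sperner's theorem for n up to about 3.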

Examples:

1. In the poset P of partitions of {a, b, c, d} under inverse refinement, illustrated below,
the Whitney numbers are N_1(P) = 1, N_2(P) = 6, N_3(P) = 7, and N_4(P) = 1.

[Figure: Hasse diagram of the partitions of {a, b, c, d} under inverse refinement]



2. In the poset of partitions of {a, b, c, d} under inverse refinement, the ideal D[ac-bd] is
the set {ac-bd, a-c-bd, ac-b-d, a-b-c-d}, and the ideal D(ac-bd) is the set {a-c-bd, ac-b-d,
a-b-c-d}.

3. In the poset of partitions of {a, b, c, d} under inverse refinement, the filter U[a-c-bd]
is the set {a-c-bd, ac-bd, abd-c, a-bcd, abcd}, and the filter U(a-c-bd) is the set {ac-bd,
abd-c, a-bcd, abcd}.

4. In the graded poset D(n) of divisors of n, the rank r(x) is the sum of the exponents
in the prime power factorization of x. The Whitney numbers of D(n) are symmetric
because the divisors x and n/x have “complementary” ranks. If n is a product of k distinct
primes, then D(n) ≅ 2^k.

5. The following Hasse diagram corresponds to an ungraded poset, because the lengths
of its maximal chains differ, i.e., they are length 2 and length 3.



6. In the divisibility poset D(I), the subposet D(n) of divisors of n is a finite ideal, and
the non-multiples of n form an infinite ideal, whose complement is the infinite filter U[n]
of numbers that are divisible by n.
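
The partition examples above can be recomputed by enumerating all set partitions. This is a sketch under stated assumptions: partitions are stored as frozensets of frozenset blocks, `parse` reads the hyphenated notation used in the examples, and all function names are hypothetical.

```python
from collections import Counter

def set_partitions(items):
    """Yield all partitions of items as frozensets of frozenset blocks."""
    if not items:
        yield frozenset()
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield part | {frozenset([first])}          # first in its own block
        for block in part:                         # or joined to a block
            yield (part - {block}) | {block | {first}}

def refines(pi, sigma):
    """True if every block of pi lies inside some block of sigma."""
    return all(any(b <= c for c in sigma) for b in pi)

def parse(text):
    """Read notation such as 'ab-c-d' as the partition {a,b}, {c}, {d}."""
    return frozenset(frozenset(block) for block in text.split("-"))

parts = list(set_partitions("abcd"))
# Number of partitions with each number of blocks (4 blocks is the lowest rank).
whitney = dict(Counter(len(p) for p in parts))

def down_set(x):
    """D[x]: the partitions that refine x, including x itself."""
    return {p for p in parts if refines(p, x)}

def up_set(x):
    """U[x]: the partitions that x refines, including x itself."""
    return {p for p in parts if refines(x, p)}
```

For instance, `down_set(parse("ac-bd"))` and `up_set(parse("a-c-bd"))` enumerate the ideal and filter discussed in Examples 2 and 3; removing the generator from each gives D(x) and U(x).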


11.1.4 LATTICES 

Lattices are posets with additional properties that capture some aspects of the inter- 
section and the union of sets and (more generally) of the greatest common divisor and 
least common multiple of positive integers. (See also §5.7.) 

Definitions: 

A (common) upper bound for elements x, y in a poset is an element z such that
x ≤ z and y ≤ z.

A least upper bound (or lub, pronounced “lub”) for elements x, y in a poset is a
common upper bound z such that every other common upper bound z' satisfies the
inequality z ≤ z'. Such an element, if it exists, is denoted x ∨ y.

The join of x and y is the lub x ∨ y.

A (common) lower bound for elements x, y in a poset is an element z such that
x ≥ z and y ≥ z.

A greatest lower bound (or glb, pronounced “glub”) for elements x, y in a poset is
a common lower bound z such that every other common lower bound z' satisfies the
inequality z ≥ z'. Such an element, if it exists, is denoted x ∧ y.

The meet of x and y is the glb x ∧ y.

A lattice is a poset in which every pair of elements has both a lub and a glb.

A lattice is bounded if it has both a unique minimal element (denoted “0”) and a
unique maximal element (denoted “1”).

A nonzero element of a lattice L is join-irreducible (or simply irreducible) if it
cannot be expressed as the lub of two other elements. The subposet formed by the
join-irreducible elements of L is denoted by P(L).

A nonzero element of a lattice L is meet-irreducible if it cannot be expressed as a
glb of two other elements. The subposet formed by the meet-irreducible elements of L
is denoted by Q(L).

A complement of an element x of a lattice is an element x̄ such that x ∨ x̄ = 1 and
x ∧ x̄ = 0.

A complemented lattice is a lattice in which every element has a complement.

A lattice isomorphism is an order-preserving bijection from one lattice to another
that also preserves glbs and lubs.

An atom of a poset is an element of height 1.

A lattice is atomic if every element is a lub of atoms (or equivalently, if the atoms are
the only join-irreducible elements).

A sublattice of a lattice L is a subposet P such that x ∧ y and x ∨ y are in P for all x
and y ∈ P.

The divisor lattice is the poset D(n) of the positive integer divisors of n, in which
x ≤ y means that y is an integer multiple of x.

The subset lattice is the Boolean algebra 2^n, that is, the Cartesian product of n copies
of the standard 2-chain.

The subspace lattice L_n(q) is the set of subspaces of an n-dimensional vector space
over a q-element field, partially ordered by set inclusion.

The lattice of (order) ideals J(P), for any poset P = (X, R), is the set of order ideals
of P, ordered by inclusion.

The lattice of bounded sequences L(m, n) has as members the length-m integer
sequences a_1, ..., a_m such that 0 ≤ a_1 ≤ ⋯ ≤ a_m ≤ n.

An integer partition is a nonincreasing nonnegative integer sequence having finitely 
many nonzero terms, with trailing zeros added as needed for comparison. 

The Young lattice is the lattice of integer partitions under component-wise ordering. 

A refinement of a set partition σ replaces each block B ∈ σ by some partition of B.

The partition lattice is the poset Π_n of partitions of the set [n] = {1, ..., n}, where
π ≤ σ if π is a refinement of σ.
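
For a finite poset, the lub and glb definitions can be evaluated by direct search. The sketch below is a brute-force illustration with hypothetical function names; it returns `None` when no least upper bound or greatest lower bound exists.

```python
def lub(x, y, elems, leq):
    """Least upper bound of x and y, or None if it does not exist."""
    uppers = [z for z in elems if leq(x, z) and leq(y, z)]
    least = [z for z in uppers if all(leq(z, w) for w in uppers)]
    return least[0] if least else None

def glb(x, y, elems, leq):
    """Greatest lower bound of x and y, or None if it does not exist."""
    lowers = [z for z in elems if leq(z, x) and leq(z, y)]
    greatest = [z for z in lowers if all(leq(w, z) for w in lowers)]
    return greatest[0] if greatest else None

def is_lattice(elems, leq):
    return all(lub(x, y, elems, leq) is not None
               and glb(x, y, elems, leq) is not None
               for x in elems for y in elems)

def divides(a, b):
    return b % a == 0

D24 = [1, 2, 3, 4, 6, 8, 12, 24]       # the divisor lattice D(24)
print(lub(4, 6, D24, divides), glb(4, 6, D24, divides))
print(is_lattice([2, 3, 12, 18], divides))
```

In the second call the elements 2 and 3 have the two common upper bounds 12 and 18, neither of which is below the other, so the four-element poset is not a lattice.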

Facts:

1. If z ≤ x in a lattice, then x ∧ (y ∨ z) ≥ (x ∧ y) ∨ z.

2. 4-point lemma: If each of the elements z, w is less than or equal to each of the
elements x, y in a lattice, then z ∨ w ≤ x ∧ y.

3. An element z is a least upper bound for x and y if and only if it is a unique minimal
element among their common upper bounds.

4. Every finite lattice is bounded.

5. Every chain-product is a lattice.

6. If a locally finite poset P with a unique maximal element 1 also has a well-defined glb
operation, then P is a lattice.

7. Not all lattices are ranked. In particular, the lattice of integer partitions under
dominance ordering is unranked.

8. Every interval in a lattice is a sublattice, but not every sublattice is an interval.

9. In the subspace lattice L_n(q), the Whitney numbers (§11.1.3) satisfy the equation
$N_k(L_n(q)) = \frac{(q^n - 1)(q^{n-1} - 1) \cdots (q^{n-k+1} - 1)}{(q^k - 1)(q^{k-1} - 1) \cdots (q - 1)}$.

10. In the subspace lattice L_n(q), the Whitney number N_k(L_n(q)) equals the Gaussian
coefficient $\left[{n \atop k}\right]_q$ (§2.3.2), which appears in algebraic identities and in analogues
of results on subsets.

11. The Whitney number N_k(Π_n) of partitions of the set [n] into n − k blocks is the
Stirling subset number $\left\{{n \atop n-k}\right\}$ (§2.5.2). This has no closed formula, but the
inclusion-exclusion principle yields
$\left\{{n \atop k}\right\} = \frac{1}{k!} \sum_{i=0}^{k} (-1)^i \binom{k}{i} (k-i)^n$.
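
The two counting formulas in Facts 9 through 11 are straightforward to evaluate. The sketch below uses the standard product formula for the Gaussian coefficient and the standard inclusion-exclusion sum for the Stirling subset numbers; the function names are hypothetical.

```python
from math import comb, factorial

def gaussian_coefficient(n, k, q):
    """Number of k-dimensional subspaces of an n-dimensional space over GF(q)."""
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (k - i) - 1
    return num // den

def stirling_subset(n, k):
    """Number of partitions of an n-set into k blocks, by inclusion-exclusion."""
    total = sum((-1) ** i * comb(k, i) * (k - i) ** n for i in range(k + 1))
    return total // factorial(k)

# Whitney numbers of L_3(2), by dimension k = 0, ..., 3.
print([gaussian_coefficient(3, k, 2) for k in range(4)])
# Whitney numbers of the partition lattice on [4]: rank k has 4 - k blocks.
print([stirling_subset(4, 4 - k) for k in range(4)])
```

Both divisions are exact, so integer division is safe here and avoids floating-point rounding for larger arguments.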


Examples: 

1. The poset specified by the following Hasse diagram is a lattice. 



2. The poset specified by the following Hasse diagram is not a lattice. Although every
pair of elements has a common upper bound, none of the three common upper bounds
for the elements c and d is a least upper bound.

3. Two 5-element lattices that occur as subposets of 2^3 but not as sublattices of 2^3 are
M_5 = 1 ⊕ (1 + 1 + 1) ⊕ 1 and N_5 = 1 ⊕ (2 + 1) ⊕ 1.

4. In the divisor lattice D(n), a ∧ b = gcd(a, b) and a ∨ b = lcm(a, b).

5. The join-irreducible elements of the divisibility lattice D(I) are the powers of primes.

6. In the subset lattice 2^n, a ∧ b = a ∩ b and a ∨ b = a ∪ b.

7. The join-irreducible elements of the subset lattice 2^n are the singleton sets.

8. The subspace lattice L_n(q) is a graded lattice, with rank r(U) = dim U.

9. In the subspace lattice L_n(q), the meet of subspaces U and V is their intersection
U ∩ V, and the join is the unique minimal subspace containing their union.

10. In the lattice of order ideals J(P), glb and lub are given by intersection and union.
Hence J(P) is a sublattice of the Boolean lattice 2^{|P|}; equality holds if and only if P is
an antichain.

11. The lattice J(P) of ideals of a finite poset P = (X, R) is finite, with 1_{J(P)} = X and
0_{J(P)} = ∅. It is graded, with rank function r(I) = |I|.

12. By the correspondence between ideals of a poset P = (X,R) and their antichains 
of maximal elements, the lattice J(P) of ideals is also a lattice on the antichains of P. 
The corresponding ordering on antichains is A < B if every element of A is less than or 
equal to some element of B. 

13. The lattice L(m, n) of bounded sequences is a sublattice of (n + 1)^m, and L(m, n) ≅
J(m × n) ≅ L(n, m). The natural isomorphism maps a sequence a ∈ L(m, n) to the
order ideal of m × n generated by { (m + 1 − i, a_i) | a_i > 0 }.

14. The lattice L(m, n) is a sublattice of the Young lattice.

15. In the partition lattice Π_n, 13|4|2|5 < 123|45; the order of the blocks and the order
of elements within each block are irrelevant.

16. The partition lattice Π_n is a graded poset, with 1_{Π_n} = [n] and 0_{Π_n} = 1|2|⋯|n.
The common refinement of π and σ with the fewest blocks is the greatest lower bound
(meet) of π and σ.

17. The lattice Π_3 is isomorphic to the lattice M_5.

18. In the ordering on antichains of a poset P defined by A < B if every element of A 
is less than or equal to some element of B, the maximum antichains of P induce a 
sublattice. 
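
Examples 10 and 13 can be confirmed on a small case by listing order ideals explicitly. The sketch below assumes integer sequences in L(m, n) and takes m = 2, n = 3; the function names, and the representation of the product m × n as a grid of pairs, are choices made for this illustration.

```python
from itertools import combinations, combinations_with_replacement, product

def order_ideals(elems, leq):
    """All down-closed subsets of a finite poset (brute force over subsets)."""
    elems = list(elems)
    ideals = []
    for k in range(len(elems) + 1):
        for sub in combinations(elems, k):
            s = set(sub)
            if all(y in s for x in s for y in elems if leq(y, x)):
                ideals.append(frozenset(s))
    return ideals

def bounded_sequences(m, n):
    """Members of L(m, n): integers with 0 <= a_1 <= ... <= a_m <= n."""
    return list(combinations_with_replacement(range(n + 1), m))

def ideal_of_sequence(a, grid, leq):
    """Order ideal generated by the pairs (m + 1 - i, a_i) with a_i > 0."""
    m = len(a)
    gens = [(m + 1 - i, ai) for i, ai in enumerate(a, start=1) if ai > 0]
    return frozenset(p for p in grid if any(leq(p, g) for g in gens))

def leq(a, b):
    """Product order on pairs."""
    return a[0] <= b[0] and a[1] <= b[1]

m, n = 2, 3
grid = list(product(range(1, m + 1), range(1, n + 1)))   # the product m x n
ideals = order_ideals(grid, leq)
seqs = bounded_sequences(m, n)
print(len(ideals), len(seqs))
```

The map `ideal_of_sequence` sends distinct sequences to distinct ideals and reaches every ideal of the grid, which is the bijection described in Example 13.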


11.1.5 DISTRIBUTIVE AND MODULAR LATTICES

Definitions:

A lattice L is distributive if glb distributes over lub in L, that is, if x ∧ (y ∨ z) =
(x ∧ y) ∨ (x ∧ z) for all x, y, z ∈ L.

A lattice L is modular if x ∧ (y ∨ z) = (x ∧ y) ∨ z for all x, y, z ∈ L such that z ≤ x.

A lattice L is (upper) semimodular if for all x, y ∈ L, “x covers x ∧ y” implies “x ∨ y
covers y”.

A lattice L is lower semimodular if the reverse implication holds (equivalently, if the
dual lattice L* is semimodular).

The height function h of a lattice L is a submodular height function if h(x ∧ y) +
h(x ∨ y) ≤ h(x) + h(y) for all x, y ∈ L.

A lattice is geometric if it is atomic, semimodular, and has finite height.

A closure operator on the subsets of a set E is a function σ: 2^E → 2^E that maps
each set to a superset of itself, is order-preserving with respect to set inclusion, and is
“idempotent”: σ(σ(A)) = σ(A).

The closed subsets of a set E, with respect to a closure operator σ: 2^E → 2^E, are the
sets A with σ(A) = A.

The Steinitz exchange axiom for a closure operator σ: 2^E → 2^E is the rule that
p ∉ σ(A) and p ∈ σ(A ∪ q) imply q ∈ σ(A ∪ p).

Facts: 

1. The smallest nondistributive lattices are M_5 = 1 ⊕ (1 + 1 + 1) ⊕ 1 and N_5 =
1 ⊕ (2 + 1) ⊕ 1, which are introduced in §11.1.2, Example 8.

2. A lattice is distributive if and only if it occurs as a sublattice of 2^n for some n.

3. Every sublattice of a distributive lattice is a distributive lattice.

4. The product of distributive lattices L_1 and L_2 is a distributive lattice, with (x_1, x_2) ∧
(y_1, y_2) = (x_1 ∧ y_1, x_2 ∧ y_2) and (x_1, x_2) ∨ (y_1, y_2) = (x_1 ∨ y_1, x_2 ∨ y_2).

5. In a lattice L, distributivity and the dual property that x ∨ (y ∧ z) = (x ∨ y) ∧ (x ∨ z)
for all x, y, z ∈ L are equivalent. Hence the dual of a distributive lattice is a distributive
lattice.

6. A lattice L is modular if and only if c ∈ [a ∧ b, a] implies a ∧ (b ∨ c) = c for all a, b ∈ L
(equivalently, if c ∈ [b, b ∨ d] implies b ∨ (c ∧ d) = c for all b, d ∈ L).

7. Let μ_a: L → L be the operation “take the glb with a”, and let ν_b: L → L be the
operation “take the lub with b”. A lattice L is modular if and only if for all a, b ∈ L, the
intervals [a ∧ b, a] and [b, a ∨ b] are isomorphic sublattices of L, with lattice isomorphisms
given by ν_b and μ_a.

8. If y covers x in a semimodular lattice L, then for all z ∈ L, x ∨ z = y ∨ z or x ∨ z is
covered by y ∨ z.

9. A lattice L with a lower bound is semimodular if and only if the following is true:
the height function of L is submodular and in each interval the maximal chains all have
the same length.

10. A lattice is modular if and only if it does not have N_5 as a sublattice.

11. Every distributive lattice is modular, because in a distributive lattice x ∧ (y ∨ z)
= (x ∧ y) ∨ (x ∧ z) = (x ∧ y) ∨ z if z ≤ x.

12. A modular lattice is distributive if and only if it does not have M_5 as a sublattice.

13. Given a closure operator, the closed sets form a lattice under inclusion with meet
and join given by intersection and closure of the union, respectively.

14. If a closure operator σ satisfies the Steinitz exchange axiom, then the lattice of
closed sets is semimodular.

15. The lattice L_n(q) is semimodular. (This follows from the previous fact.)

16. A poset is a geometric lattice if and only if it is the lattice of closed sets of a matroid,
ordered by inclusion (§12.4). (The span operator in a matroid, which adds to X every
element whose addition to X does not increase the rank, is a closure operator that
satisfies the Steinitz exchange axiom.)

17. A geometric lattice is distributive if and only if it has the form 2^n, and the
corresponding matroid is the free matroid, in which all subsets of the elements are
independent.

18. A complemented distributive lattice is a Boolean algebra. 

Examples: 

1. Among nondistributive lattices, the lattice M_5 is modular, and the lattice N_5 is not
(which explains the notation).

2. The subspace lattices L_n(q) are not distributive.

3. The partition lattice Π_n is semimodular but not modular for n > 3. The lattice
L_n(q) is semimodular.

4. The partition lattice Π_n is geometric, and it is the lattice of closed sets of the cycle
matroid of the complete graph K_n.

5. For n ≥ 3, the lattice Π_n is not distributive.

6. The Boolean lattice 2^n, the divisor lattice D(n), the lattice J(P) of order ideals of
a poset, and the bounded sequence lattice L(m, n) are distributive.
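
The statements about M_5 and N_5 can be verified by testing the distributive and modular laws on all triples. The sketch below assumes each lattice is given by its cover pairs on the hypothetical element names 0, a, b, c, 1; in the copy of N_5 used here, a < b is the 2-chain and c is the remaining middle element.

```python
from itertools import product

def lattice_from_covers(elems, covers):
    """Return (leq, meet, join) for the finite lattice with these cover pairs."""
    le = {(x, x) for x in elems} | set(covers)
    changed = True
    while changed:                       # naive transitive closure
        new = {(x, w) for (x, y) in le for (z, w) in le if y == z} - le
        changed = bool(new)
        le |= new

    def leq(x, y):
        return (x, y) in le

    def join(x, y):
        ub = [z for z in elems if leq(x, z) and leq(y, z)]
        return next(z for z in ub if all(leq(z, w) for w in ub))

    def meet(x, y):
        lb = [z for z in elems if leq(z, x) and leq(z, y)]
        return next(z for z in lb if all(leq(w, z) for w in lb))

    return leq, meet, join

def is_distributive(elems, meet, join):
    return all(meet(x, join(y, z)) == join(meet(x, y), meet(x, z))
               for x, y, z in product(elems, repeat=3))

def is_modular(elems, leq, meet, join):
    return all(meet(x, join(y, z)) == join(meet(x, y), z)
               for x, y, z in product(elems, repeat=3) if leq(z, x))

E = ["0", "a", "b", "c", "1"]
M5 = lattice_from_covers(E, [("0", "a"), ("0", "b"), ("0", "c"),
                             ("a", "1"), ("b", "1"), ("c", "1")])
N5 = lattice_from_covers(E, [("0", "a"), ("a", "b"), ("b", "1"),
                             ("0", "c"), ("c", "1")])
```

In N_5 the triple x = b, y = c, z = a witnesses the failure of the modular law, since b ∧ (c ∨ a) = b while (b ∧ c) ∨ a = a.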


11.2 POSET PROPERTIES


11.2.1 POSET PARTITIONS

Definitions:

A chain partition of a poset is a partition of the domain of that poset into chains. 

The k-norm of a sequence x = {x_i} of real numbers is the sum Σ_i min{k, x_i}, whose
value is commonly denoted m_k(x).

The k-norm of a chain partition C of a poset, denoted m_k(C), means the k-norm
of the sequence of sizes of the chains in the partition.

A k-family in a poset P is a subposet containing no chain of size k + 1. The size of a
maximum k-family in P is denoted by d_k(P).

A partition C of a poset P into chains is k-saturated if m_k(C) = d_k(P).

A chain in a ranked poset is symmetric if it has an element of rank r(P) − k whenever
it has an element of rank k.

A chain is consecutive if its elements belong to consecutive ranks. 

A symmetric chain decomposition of P is a partition of P into symmetric consec- 
utive chains. 

A symmetric chain order is a poset with a symmetric chain decomposition. 

A graded poset has the Sperner property if some single rank is a maximum antichain. 

A graded poset has the strict Sperner property if all maximum antichains are single 
ranks. 
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A poset P has the k-Sperner property if it has a maximum fc-family consisting of k 
ranks. 

A poset has the strong Sperner property if it is fc-Sperner for all k < r(P). 

A graded poset P satisfies the normalized matching property if for every k and 
every subset A of P k , the set A* of elements in Pfc+i that are comparable to at least 
one element of A satisfies the inequality ^ where N k and N k+ i are Whitney 

numbers. 


A regular covering by chains is a multiset of maximal chains such that for each 
x £ P the fraction of the chains containing x is 


N r 


(x) 


To obtain the bracket representation of a subset S of [n] = {1, . . . , n}, first represent 
the subset S' as a length-?! “parenthesis- vector” , in which the jth bit is a right parenthesis 
if j £ S and a left parenthesis if j /ES. Then wherever possible, recursively, match a left 
parenthesis to the nearest unmatched right parenthesis that is separated from it only 
by previously matched entries. 


An on-line partitioning algorithm processes the elements of a poset as they are
“revealed”. Once an element is assigned to a cell, it remains there; there is no backtracking
to change earlier decisions.


Facts: 

1. Dilworth’s theorem: If P is a finite poset, then the width of P equals the minimum 
number of chains needed to cover the elements of P. 

2. Dilworth’s theorem also holds for infinite posets of finite width. 

3. The 1-families are the antichains. 

4. Every k-family is a union of k antichains.

5. A k-family in P can be transformed into an antichain in P × k of the same size, and
vice versa, and hence $d_k(P) = w(P \times k)$, where k denotes the k-element chain.

6. The discussion of saturated partitions is generally restricted to finite posets.

7. If $a_k$ is the number of chains of size at least k in a k-saturated chain partition of P,
then $\Delta_k(P) \ge a_k \ge \Delta_{k+1}(P)$, where $\Delta_k(P) = d_k(P) - d_{k-1}(P)$ for $k \ge 1$.

8. Littlewood-Offord problem: Let $A = \{a_1, \ldots, a_n\}$ be a set of vectors in $\mathbb{R}^d$, with
each vector having length at least 1. Let $R_1, \ldots, R_k$ be regions in $\mathbb{R}^d$ of diameter at
most 1. Then of the $2^n$ subsets of A, the number whose sum lies in $\bigcup_i R_i$ is at
most $d_k(2^n)$.

9. Greene-Kleitman (GK) theorem: For every finite poset P and every $k \ge 0$, there
is a chain partition of P that is both k-saturated and (k + 1)-saturated.

10. The GK theorem is best possible, since there are infinitely many posets for which
no chain partition is both k-saturated and l-saturated for any nonconsecutive nontrivial
values of k, l; the smallest has 6 elements (illustration). The GK theorem extends in
various ways to directed graphs.

11. Dilworth’s theorem is the special case of the GK theorem for k = 0 (every chain
partition is 0-saturated).

12. Every product of symmetric chain orders is a symmetric chain order. 

13. The lattice of bounded sequences L(m, n) (§11.1.4) has a symmetric chain
decomposition if $\min\{m, n\} \le 4$. It is not known whether L(m, n) in general has a symmetric
chain decomposition.

14. The lattice L(m, n) has the Sperner property.

15. The partition lattice $\Pi(n)$ fails to satisfy the Sperner property if n is sufficiently
large.

16. The Boolean lattice $2^n$ and the subspace lattice $L_n(q)$ satisfy the strict Sperner
property.

17. Every symmetric chain order has the strong Sperner property, and a symmetric
chain decomposition is k-saturated for all k.

18 . The class of graded posets that have the strong Sperner property and a symmetric 
unimodal sequence of Whitney numbers is closed under Cartesian product. 

19. When $N_k \le N_{k+1}$, the normalized matching property implies Hall’s condition for
the existence of a matching saturating $P_k$ in the bipartite graph of the relations between
the two levels.

20. Two subsets of the Boolean lattice on [n] are on the same chain of the “bracketing
decomposition” if and only if they have the same bracketing representation. This
provides an explicit symmetric chain decomposition of $2^n$. This generalizes for
multisets (D(N)).

21. Dedekind’s problem: This is the problem of computing the total number of
antichains in the Boolean algebra $2^n$. By using the bracketing decomposition, this number
is calculated to be at most $3^{\binom{n}{\lfloor n/2 \rfloor}}$. Asymptotically, for even n, the number is

$2^{\binom{n}{n/2}} \exp\left( \binom{n}{n/2-1} \left[ 2^{-n/2} + n^2 2^{-n-5} - n 2^{-n-4} (1 + o(1)) \right] \right)$.

The exact values for $1 \le n \le 7$ are 3, 6, 20, 168, 7581, 7828354, and 2414682040998, with
the estimate giving 7996118 for n = 6.

22. Universal set sequences: A universal set sequence on a set S is a sequence that
contains every subset of S as a consecutive subsequence. The bracketing decomposition
yields a universal set sequence on [n] of length asymptotic to $\frac{2}{\pi} 2^n$.

23. If two sets, x and y, are chosen independently according to a probability
distribution on the Boolean lattice $2^n$, then the probability that x is contained in y is at
least $\binom{n}{\lfloor n/2 \rfloor}^{-1}$.

24. There is an on-line algorithm that partitions posets of height k into $\binom{k+1}{2}$
antichains. This is best possible, even for 2-dimensional posets.

25. There is an on-line algorithm that partitions posets of width k into $(5^k - 1)/4$ chains.

26. It is impossible to design an on-line algorithm that partitions every poset of width k
into fewer than $\binom{k+1}{2}$ chains.

27 . There is an on-line algorithm to partition every poset of width 2 into 5 chains, and 
this is best possible. 
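
The small values quoted for Dedekind’s problem can be reproduced by brute force. The sketch below is not the bracketing argument of the text: it assumes the standard correspondence between antichains of the Boolean lattice and monotone Boolean functions, and builds the monotone functions on n + 1 variables as pairs (lo, hi) of monotone functions on n variables with lo ≤ hi. The name `dedekind_numbers` and the bitmask encoding are illustrative choices.

```python
from math import comb


def dedekind_numbers(max_n):
    """Count antichains of the Boolean lattice 2^n for n = 0..max_n.

    Each monotone Boolean function on n variables is stored as a
    truth-table bitmask with 2**n bits.
    """
    funcs = [0, 1]                 # the two constant functions on 0 variables
    counts = [len(funcs)]
    for n in range(max_n):
        width = 1 << n             # truth-table length for n variables
        funcs = [lo | (hi << width)
                 for lo in funcs for hi in funcs
                 if (lo & ~hi) == 0]   # lo <= hi pointwise keeps monotonicity
        counts.append(len(funcs))
    return counts


if __name__ == "__main__":
    for n, count in enumerate(dedekind_numbers(5)):
        bound = 3 ** comb(n, n // 2)   # upper bound 3^C(n, floor(n/2))
        print(n, count, bound)
```

The enumeration is practical only for very small n, since the list of functions grows as fast as the numbers being counted.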

Examples: 

1. This is a symmetric chain decomposition of the Boolean lattice $2^3$:

2. This is a regular chain covering of the Boolean lattice $2^3$:



3. The poset 3 × 4 has a regular covering by six chains, using two chains twice and two
other chains once each:



4. The poset 3 × 4 satisfies the Sperner property but not the strict Sperner property,
since it has a maximum antichain (size three) that is not confined to a single rank.
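
The claim about 3 × 4 can be checked by exhaustive search. The following sketch assumes that 3 × 4 is the product of a 3-element chain and a 4-element chain under the componentwise order, with rank given by the coordinate sum; the helper names are hypothetical.

```python
from itertools import combinations, product


def product_of_chains(a, b):
    """Elements of the product of an a-element chain and a b-element chain."""
    return list(product(range(a), range(b)))


def comparable(x, y):
    """Componentwise comparability in the product order."""
    return (x[0] <= y[0] and x[1] <= y[1]) or (y[0] <= x[0] and y[1] <= x[1])


def maximum_antichains(elements):
    """Return all antichains of maximum size (brute force over subsets)."""
    for size in range(len(elements), 0, -1):
        found = [c for c in combinations(elements, size)
                 if all(not comparable(x, y) for x, y in combinations(c, 2))]
        if found:
            return found
    return []


if __name__ == "__main__":
    for antichain in maximum_antichains(product_of_chains(3, 4)):
        ranks = sorted({x + y for x, y in antichain})
        print(antichain, "ranks:", ranks)
```

Antichains whose rank list has a single entry are whole ranks; any printed antichain with more than one rank witnesses the failure of the strict Sperner property.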


11.2.2 LYM PROPERTY 


Definitions: 


The LYM inequality for a family F in a ranked poset P is the inequality

$\sum_{x \in F} \frac{1}{N_{r(x)}} \le 1$,

where $N_{r(x)}$ is a Whitney number.


A poset P is an LYM order (or satisfies the LYM property) if every antichain $F \subseteq P$
satisfies the LYM inequality.
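
As a small sanity check, the LYM inequality (the sum over x in F of 1/N_{r(x)} is at most 1) can be verified for every antichain of the Boolean lattice on a 3-element set, where the rank of a set is its size and the Whitney numbers are binomial coefficients. The sketch below is a brute-force illustration with hypothetical helper names, not a proof.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def subsets(n):
    """All subsets of {1..n} as frozensets."""
    return [frozenset(c) for k in range(n + 1)
            for c in combinations(range(1, n + 1), k)]


def is_antichain(family):
    """No member of the family is a proper subset of another."""
    return all(not (a < b or b < a) for a, b in combinations(family, 2))


def lym_sum(family, n):
    """Sum of 1 / N_{r(x)} with N_k = C(n, k) in the Boolean lattice 2^n."""
    return sum(Fraction(1, comb(n, len(x))) for x in family)


def antichains(n):
    """Yield every antichain of 2^n, including the empty one."""
    elems = subsets(n)
    for mask in range(1 << len(elems)):
        family = [e for i, e in enumerate(elems) if (mask >> i) & 1]
        if is_antichain(family):
            yield family


if __name__ == "__main__":
    print(max(lym_sum(f, 3) for f in antichains(3)))
```

Using exact fractions avoids rounding issues when the sum equals 1, which happens for complete ranks.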


Facts: 

1. The LYM property was discovered independently for $2^n$ by Lubell, Yamamoto, and
Meshalkin.

2. The LYM property, the normalized matching property, and the existence of a regular 
covering by chains are equivalent. 

3. The LYM property implies the Sperner property and also implies the strong Sperner 
property (but not the strict Sperner property). 

4. Every LYM order that has symmetric unimodal Whitney numbers has a symmetric
chain decomposition. In particular, $L_n(q)$ is a symmetric chain order.

5. It is not known whether every LYM poset has a chain decomposition that is k- 
saturated for all k. 

6. A product of LYM orders may fail the LYM property. 

7. A product of LYM orders whose sequence of Whitney numbers is log-concave is
an LYM order with a log-concave sequence of Whitney numbers. (A sequence $\{a_n\}$ is
log-concave if $a_n^2 \ge a_{n-1} a_{n+1}$ for all n.)

8. The divisor lattice D(N) is an LYM order, which follows from the previous fact.

9. The partition lattice $\Pi(n)$ is an LYM order if and only if n < 20.

10. The Boolean lattice $2^n$ and the subspace lattice $L_n(q)$ have regular coverings by
chains and hence are LYM orders.


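11. If $\{\lambda_x\}$ is an assignment of real-valued weights to the elements of an LYM poset P,
then for every subset $G \subseteq P$ and every regular covering $\mathcal{C}$ of P,

$\sum_{x \in G} \frac{\lambda_x}{N_{r(x)}} \le \max_{C \in \mathcal{C}} \sum_{y \in C \cap G} \lambda_y$.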


Example: 

1. The lattice L(m, n) of bounded sequences is an LYM order if and only if $\min\{m, n\} \le 2$
or (m, n) = (3, 3).


11.2.3 RANKINGS, SEMIORDERS, AND INTERVAL ORDERS

A chain names the “better” of any pair according to a single scale. Realistically, some 
comparisons may yield indifference. Several families of “chain-like” partial orders suc- 
cessively relax the requirements on indifference. 

Definitions: 

A poset P is a ranking or weak order if its elements are partitioned into ranks
$P_1, \ldots, P_k$ such that two elements are incomparable if and only if they belong to the
same rank.

A poset P is a semiorder if there is a real-valued function f and a fixed real number
$\delta > 0$ ($\delta$ may be taken to be 1) such that x < y if and only if $f(y) - f(x) > \delta$. The
pair $(f, \delta)$ is a semiorder representation of the poset P.

A poset P is an interval order if there is an assignment of real intervals to its members 
such that x < y if and only if the interval for y is totally to the right of the interval 
for x. The collection of intervals is called an interval representation of the poset P. 

A biorder representation on a digraph D is a pair of real-valued functions f, g on
the vertex set $V_D$ such that $u \to v$ is an arc if and only if f(u) > g(v).

A Ferrers digraph (or Ferrers relation or biorder) is a digraph having a biorder 
representation. (Also see §2.5.1.) 

Facts: 

1. Rankings model a single criterion of comparison with “ties” allowed, as in voting. 

2. A poset is a ranking if and only if its comparability graph is a complete multipartite 
graph. 

3. A ranking assigns a score f(z) to each element z such that x < y if and only if
f(x) < f(y).

4. The forbidden subposet characterization of a ranking is 1 + 2. 

5. Semiorders were introduced to model intransitivity of indifference; a difference of a
few grains of sugar in a coffee cup or a few dollars in the price of a house is not likely to
affect one’s attitude, but pounds of sugar or thousands of dollars will. The threshold $\delta$
in a semiorder representation indicates a “just-noticeable difference”.

6. A poset is a semiorder if and only if its incomparability graph is a unit interval
graph, that is, an interval graph (§8.1.3) such that all intervals are of unit length.

7. An interval representation of a semiorder P with semiorder representation $(f, \delta)$ is
obtained by setting $I_x = [f(x) - \delta/2 + \epsilon, f(x) + \delta/2 - \epsilon]$.

8. Scott-Suppes theorem: The forbidden subposet characterization of a semiorder is
{1 + 3, 2 + 2}.

9. The number of nonisomorphic semiorders on an n-element set is the Catalan number
$C_n = \frac{1}{n+1}\binom{2n}{n}$ (§3.1.3).

10 . Interval orders model a situation where the value assigned to an element is impre- 
cise. 

11 . The incomparability graph of an interval order is an interval graph. 

12 . Every poset whose incomparability graph is an interval graph is an interval order. 
This follows from the forbidden subposet characterization of interval orders. 

13. The forbidden subposet characterization of an interval order is 2 + 2. 

14. A poset P is an interval order if and only if both the collections of “upper holdings”
$U(x) = \{y \in P \mid y > x\}$ and “lower holdings” $D(x) = \{y \in P \mid y < x\}$ form chains
under inclusion, in which case the number of distinct nonempty upper holding sets and
distinct nonempty lower holding sets is the same. Construction of these chains yields a
fast algorithm to compute a representation for an interval order or semiorder.

15 . The strict comparability digraph of an interval order is a Ferrers digraph, with f(x) 
and g(x) denoting the left and right endpoints of the interval assigned to x. This is 
the strict comparability digraph of a poset because f(x) < g(x) for all x. The “upper 
holdings” and “lower holdings” for an interval order become predecessor and successor 
sets for a Ferrers digraph. 

16 . For a digraph D with adjacency matrix A(D), the following are equivalent: 

• D has a biorder representation (and is a Ferrers digraph); 

• A(D) has no 2 by 2 submatrix that is a permutation matrix;

• the successor sets of D are ordered by inclusion; 

• the predecessor sets of D are ordered by inclusion; 

• the rows and columns of A(D) can be permuted independently so that to the left 

of a 1 is a 1. 

17. The greedy algorithm is an optimal on-line algorithm for partitioning an interval
order into the minimum number of antichains. It uses at most 2h - 1 antichains to
recursively partition an interval order of height h, and this is best possible.

18. There is an on-line algorithm to partition every interval order of width k into 3k - 2
chains, and this is best possible. Equivalently, the maximum number of colors needed
for on-line coloring of interval graphs with clique size k is 3k - 2.

19 . No on-line partitioning algorithm colors all trees with a bounded number of colors. 

20. “Universal” interval orders: Since the ordering of the interval endpoints is all
that matters, interval representations may be restricted to have integer endpoints. The
poset I[0, n] or $I_n$ denotes the interval order whose interval representation consists of
all intervals with integer endpoints in {0, . . . , n}.

21. Every finite interval order is a subposet of some I[0, n].
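
Two of the equivalent conditions for Ferrers digraphs (nested successor sets, and no 2 by 2 permutation submatrix in the adjacency matrix) are easy to test directly. The sketch below is illustrative: the function names are hypothetical, and the sample digraph is built from made-up interval endpoints using the rule that u → v is an arc exactly when f(u) > g(v).

```python
from itertools import combinations


def biorder_digraph(f, g):
    """Arcs u -> v exactly when f[u] > g[v]; f and g share the same keys."""
    return {(u, v) for u in f for v in g if f[u] > g[v]}


def successors_nested(vertices, arcs):
    """True if the successor sets are totally ordered by inclusion."""
    succ = [frozenset(v for v in vertices if (u, v) in arcs) for u in vertices]
    return all(a <= b or b <= a for a, b in combinations(succ, 2))


def has_2x2_permutation(vertices, arcs):
    """True if rows u1, u2 and columns v1, v2 of A(D) form a permutation matrix."""
    return any((u1, v1) in arcs and (u2, v2) in arcs
               and (u1, v2) not in arcs and (u2, v1) not in arcs
               for u1 in vertices for u2 in vertices
               for v1 in vertices for v2 in vertices)


if __name__ == "__main__":
    # Hypothetical intervals: f = left endpoint, g = right endpoint.
    left = {"a": 0, "b": 2, "c": 0.5, "d": 4}
    right = {"a": 1, "b": 3, "c": 2.5, "d": 5}
    arcs = biorder_digraph(left, right)
    print(successors_nested(list(left), arcs), has_2x2_permutation(list(left), arcs))
```

For a digraph with two disjoint arcs, such as {1 → 2, 3 → 4}, the two tests give the opposite answers, as the equivalence predicts.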

Examples:

1 . This Hasse diagram represents a poset that is a ranking. Its three ranks are indicated 
by the levels in the diagram. 



2. This Hasse diagram represents a poset that is a semiorder: for instance, with $\delta = 1$,
define f(a) = 2, f(b) = 1.3, f(c) = 0.8, and f(d) = 0. It is not a ranking, by Fact 4,
because 1 + 2 is a subposet. The interval representation of its incomparability graph is
at the right.



3. This Hasse diagram represents a poset that is an interval order. The interval repre- 
sentation of its incomparability graph is at the right. By Fact 8, it is not a semi-order. 



4. The skill of a tennis player may vary from day to day, leading to use of an
interval $[a_x, b_x]$ to represent player x. In this case player x always beats player y if $a_x > b_y$.

5. The interval order I[0, 3] is not a semiorder.
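
The semiorder with values f(a) = 2, f(b) = 1.3, f(c) = 0.8, f(d) = 0 and threshold 1 from the examples above can be rebuilt and tested against the forbidden subposets 1 + 2, 1 + 3, and 2 + 2. This is a brute-force sketch with hypothetical function names; the strict order is stored as a set of pairs (x, y) meaning x < y.

```python
def semiorder(f, delta=1.0):
    """Strict order: x < y exactly when f[y] - f[x] > delta."""
    return {(x, y) for x in f for y in f if f[y] - f[x] > delta}


def incomparable(less, x, y):
    return x != y and (x, y) not in less and (y, x) not in less


def contains_1_plus_2(elements, less):
    """Some element is incomparable to both ends of a 2-element chain."""
    return any(incomparable(less, z, x) and incomparable(less, z, y)
               for (x, y) in less for z in elements)


def contains_1_plus_3(elements, less):
    """Some element is incomparable to all of a 3-element chain x < y < w."""
    return any((y, w) in less
               and all(incomparable(less, z, t) for t in (x, y, w))
               for (x, y) in less for w in elements for z in elements)


def contains_2_plus_2(elements, less):
    """Two 2-element chains with no comparabilities between them."""
    return any(all(incomparable(less, p, q) for p in (x, y) for q in (u, v))
               for (x, y) in less for (u, v) in less)


if __name__ == "__main__":
    f = {"a": 2, "b": 1.3, "c": 0.8, "d": 0}
    less = semiorder(f)
    print(sorted(less))
    print(contains_1_plus_2(f, less), contains_1_plus_3(f, less), contains_2_plus_2(f, less))
```

Finding 1 + 2 but neither 1 + 3 nor 2 + 2 is consistent with the poset being a semiorder that is not a ranking.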


11.2.4 APPLICATION TO SOCIAL CHOICE

When there are more than two candidates for a public office, it is not obvious what 
is the “best” way to select a winner. Any rule has its pluses and minuses, from the 
standpoint of public policy. Social choice theory analyzes the effect of various rules for 
deciding the outcomes of preferential rankings. 

Definitions: 

A profile on a set A of “alternatives” (e.g., candidates for a public office) is a set
$P = \{P_i \mid i \in I\}$ of linear rankings (ties allowed) of A, one for each member of a set I
of “individuals” (e.g., voters).

A consensus function (or social choice function) is a function $\phi$ that assigns to
each possible profile $P = \{P_i \mid i \in I\}$ on a set A of alternatives a linear order (ties
allowed) of A called the consensus ranking for P.

A consensus function upholds majority rule provided that it prefers x to y if and only 
if a majority of the individuals prefer x to y. 

Plurality is the consensus function in which the winner(s) is(are) the alternative(s) 
appearing in the greatest number of top ranks, after which the winner(s) is(are) deleted 
and the procedure is repeated to select the next rank of the consensus ranking, etc. 

The Borda count of an alternative x is the sum, over individual rankings, of the
number of alternatives x “beats”. The resulting Borda consensus function ranks
the alternatives by their Borda count.
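
Plurality and the Borda count can be sketched as below. The sketch makes two simplifying assumptions that are not part of the definitions: rankings are strict (no ties), and a profile is given as a list of (weight, ranking) pairs with each ranking listed best first. The sample profile uses the estimated percentages for the 1912 election discussed in the examples of this section; the function names are hypothetical.

```python
def plurality_winners(profile):
    """Alternatives with the greatest total weight of first places."""
    tops = {}
    for weight, ranking in profile:
        tops[ranking[0]] = tops.get(ranking[0], 0) + weight
    best = max(tops.values())
    return sorted(alt for alt, total in tops.items() if total == best)


def borda_counts(profile):
    """Weighted Borda count: one point per alternative beaten in each ranking."""
    scores = {}
    for weight, ranking in profile:
        m = len(ranking)
        for position, alt in enumerate(ranking):
            scores[alt] = scores.get(alt, 0) + weight * (m - 1 - position)
    return scores


if __name__ == "__main__":
    # W = Wilson, R = Roosevelt, T = Taft; weights are percentages of voters.
    profile_1912 = [(45, "WRT"), (30, "RTW"), (25, "TRW")]
    print(plurality_winners(profile_1912))
    print(borda_counts(profile_1912))
```

On this profile the two rules disagree: plurality selects W, while R has the largest Borda count.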

Facts: 

1. Plurality can elect someone ranked last by a majority.

2 . Condorcet’s paradox: Some profiles have no decisive consensus (i.e., producing a 
single winner) that upholds majority rule. 

3 . The Borda count is subject to abuse. 

4. Arrow’s impossibility theorem: No consensus function exists that satisfies the
following four axioms, which were formulated in an attempt to develop a consensus function
$\phi$ that avoids the difficulties cited in the facts above:

• monotonicity: If a > b in $\phi(P)$ and if profile P' agrees with profile P except for
moving alternative a upward in some or all rankings, then a > b in $\phi(P')$.

• independence of irrelevant alternatives: If profiles P and P' agree within a set
$A' \subseteq A$, then $\phi(P)$ and $\phi(P')$ have the same restriction to A'. This axiom
implies that votes for extraneous alternatives do not affect the determination
of the consensus ranking among the alternatives within the subset A'.

• nondegeneracy: Given $a, b \in A$, there is a profile P such that a > b in $\phi(P)$. This
axiom implies that the structure of the outcome is independent of renaming the
alternatives.

• nondictatorship: There is no $i \in I$ such that a > b in $P_i$ implies a > b in $\phi(P)$.

Examples: 

1. Suppose that A = {a, b, c} is the set of alternatives, and suppose that the profile
consists of the three rankings a > b > c, c > a > b, and b > c > a. Then for each
alternative, there is another alternative that is preferred by 2/3 of the population.

2. The U.S. presidential election of 1912 had three candidates: Wilson (W),
Roosevelt (R), and Taft (T). It is estimated that 45% of the voters ranked W > R > T,
that 30% ranked R > T > W, and that 25% ranked T > R > W. Wilson won the
election, garnering a plurality of the popular vote, but a majority of the population
preferred Roosevelt to Wilson. Moreover, 55% regarded Wilson as the least desirable
candidate.

3. Consider a close election, with four individuals preferring x to y to all other
alternatives. A fifth individual prefers y to x. If there are enough other alternatives, the
fifth individual can throw a Borda-count election to y by placing x at the bottom.
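
The first and third examples can be checked numerically. The sketch below assumes strict rankings listed best first; the extra alternatives z1, ..., zk and the function names are hypothetical. With m alternatives in total, the insincere ballot gives x a Borda count of 4(m - 1) and y a count of 4(m - 2) + (m - 1) = 5m - 9, so y overtakes x once m > 5.

```python
def borda_scores(rankings):
    """Borda count: one point per alternative beaten in each ranking."""
    scores = {}
    for ranking in rankings:
        for position, alt in enumerate(ranking):
            scores[alt] = scores.get(alt, 0) + len(ranking) - 1 - position
    return scores


def majority_prefers(rankings, x, y):
    """True if more than half of the rankings place x above y."""
    return 2 * sum(r.index(x) < r.index(y) for r in rankings) > len(rankings)


def close_election(k, strategic):
    """Four voters rank x > y > z1..zk; the fifth prefers y to x."""
    others = [f"z{i}" for i in range(1, k + 1)]
    fifth = ["y"] + others + ["x"] if strategic else ["y", "x"] + others
    scores = borda_scores([["x", "y"] + others] * 4 + [fifth])
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    cycle = ["abc", "cab", "bca"]          # the three rankings of the first example
    print([majority_prefers(cycle, p, q) for p, q in ("ab", "bc", "ca")])
    print(close_election(4, strategic=False)[0], close_election(4, strategic=True)[0])
```

The first printed list shows a majority cycle (a over b, b over c, c over a); the second line shows the Borda winner changing from x to y when the fifth voter ranks x last among six alternatives.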


11.2.5 LINEAR EXTENSIONS AND DIMENSION

By adding additional comparison pairs to a partial ordering on a set, ultimately a total 
ordering is obtained. Each of the many ways to do this is called an extension of the 
original partial ordering. 

Definitions: 

An extension of a poset P = (A, R) is a poset Q = (A, S) such that $R \subseteq S$ (i.e., xRy
implies xSy).

A linear extension of a poset P is an extension of P that is a chain. 

A topological sort is an algorithm that accepts a finite poset as input and produces
a linear extension of that poset as output.

A topological ordering of an acyclic digraph is a linear extension of the poset arising 
from it. 

The intersection of two partial orderings P = (A, R) and Q = (A, S) on the same
set A is the poset $(A, R \cap S)$ that includes the relations common to both.

A realizer of a poset P is a set of linear extensions of P whose intersection is P. 

The order dimension (or dimension) of P, written dim(P), is the minimum
cardinality of a realizer of P.
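
Whether a given set of linear orders is a realizer can be checked by intersecting them. In the sketch below each linear extension is listed from bottom to top, the order relation is a set of pairs (x, y) meaning x < y, and the function names are hypothetical. The second check uses the 6-element standard example, with i' written for the complement of the singleton i (a notational assumption of this sketch).

```python
from itertools import combinations


def order_from_chain(chain):
    """All pairs (x, y) with x below y in a linear order listed bottom to top."""
    return {(x, y) for i, x in enumerate(chain) for y in chain[i + 1:]}


def intersection(chains):
    """Relations common to all of the given linear orders."""
    relation = order_from_chain(chains[0])
    for chain in chains[1:]:
        relation &= order_from_chain(chain)
    return relation


def is_realizer(chains, relation):
    return intersection(chains) == set(relation)


if __name__ == "__main__":
    # Two disjoint chains a < b and c < d: two extensions suffice.
    print(is_realizer(["abcd", "cdab"], {("a", "b"), ("c", "d")}))

    # Standard example S_3: i < j' exactly when i != j.
    s3 = {(i, j + "'") for i in "123" for j in "123" if i != j}
    realizer = [["2", "3", "1'", "1", "2'", "3'"],
                ["1", "3", "2'", "2", "1'", "3'"],
                ["1", "2", "3'", "3", "1'", "2'"]]
    print(is_realizer(realizer, s3))
    print(any(is_realizer(list(pair), s3) for pair in combinations(realizer, 2)))
```

The last line tests only pairs drawn from this particular realizer; it does not by itself prove that no two linear extensions realize the poset.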

The standard example $S_n$ of an n-dimensional poset is the subposet of $2^n$ induced
by the singletons and their complements.

An alternating k-cycle in a poset P is a sequence of ordered incomparable pairs
$\{(x_i, y_i)\}_{i=1}^{k}$ such that $y_i \le x_{i+1}$, where subscripts are taken modulo k.

A critical pair (or unforced pair) in a poset P is an ordered incomparable pair that 
cannot be made comparable by adding any other single incomparable pair as a relation. 

A linear extension L of a poset P puts Y over X, where X and Y are disjoint subposets,
if y is above x in L whenever (x, y) is an incomparable pair with $x \in X$, $y \in Y$.

Given a subposet $Q \subseteq P$, an upper extension of Q is a linear extension of P that
puts P - Q over Q.

Given a subposet $Q \subseteq P$, a lower extension of Q is a linear extension of P that puts
P - Q below Q.

The minimum realizer encoding of a poset lists for each element its position on each 
extension in a minimum realizer. 

The probability space on the set of all linear extensions of a (finite) poset P is
obtained by taking each linear extension to be equally likely. The notation Pr(x < y)
denotes the proportion of linear extensions in which element x comes below element y.
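
For small posets, Pr(x < y) can be computed by listing all linear extensions. The sketch below filters all permutations, so it is only suitable for a handful of elements; the function names are hypothetical. The sample poset has minimal elements a and b and maximal elements c and d, with a below c and b below both c and d.

```python
from fractions import Fraction
from itertools import permutations


def linear_extensions(elements, less):
    """All total orders (listed bottom to top) that respect every pair in `less`."""
    return [p for p in permutations(elements)
            if all(p.index(x) < p.index(y) for x, y in less)]


def pr_below(elements, less, x, y):
    """Proportion of linear extensions in which x comes below y."""
    extensions = linear_extensions(elements, less)
    return Fraction(sum(e.index(x) < e.index(y) for e in extensions), len(extensions))


if __name__ == "__main__":
    less = {("a", "c"), ("b", "c"), ("b", "d")}
    print(["".join(e) for e in linear_extensions("abcd", less)])
    print(pr_below("abcd", less, "a", "b"))
```

Here the poset has five linear extensions and a comes below b in two of them.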

Facts: 

1. Every poset is the intersection of all its linear extensions, from which it follows that 
the concept of dimension is well-defined. 

2 . Given incomparable elements x and y in a poset P, there is a linear extension of P 
in which x appears above y. 

3. The chains are the only posets of dimension 1. 

4. Every antichain has dimension 2, because the intersection of a linear order and its 
dual is an antichain. 

5. Topological sort is used to organize activities with a precedent ordering into a se- 
quential schedule. 

6. The list of minimal forbidden subposets for dimension 2 consists of 10 isolated 
examples and 7 one-parameter families. 

7. If Q is a subposet of P, then $\dim(Q) \le \dim(P)$.

8. The dimension of a product of k chains (each of size at least 2) is k. 

9 . A poset has dimension at most k if and only if it imbeds in a product of k chains. 

10. The dimension of a poset P equals the minimum integer n such that P is a subposet
of $\mathbb{R}^n$.

Algorithm 1: Topological sort.

input: a finite poset $(X = \{x_1, \ldots, x_n\}, <)$

output: a compatible total ordering $x_1 < x_2 < \cdots < x_n$ of the elements of X

for j := 1 to n
    $x_j$ := a minimal element of X
    X := X - {$x_j$}


11. The standard example $S_n$ is a bipartite poset whose comparability graph is
obtained from the complete bipartite graph $K_{n,n}$ by deleting a complete matching.

12. The minimum realizer encoding of an n-element poset of dimension k takes only
$O(kn \log n)$ bits, instead of the $O(n^2)$ bits of the order relation. Thus, posets of small
dimension have concise representations.

13. In the sense of Fact 12, the dimension of a poset may be regarded as a measure of its
“space complexity”.

14. The dimension of a poset P equals the minimum number of linear extensions 
containing all the critical pairs of P. 

15. The dimension of a poset P is equal to the chromatic number of the hypergraph 
whose vertex set is the set of critical pairs and whose edges are the sets of critical pairs 
forming minimal alternating cycles. 

16. The cover graph of the standard example $S_n$ (of an n-dimensional poset) is
$K_{n,n}$ - (1-factor).

17. If X and Y are disjoint subposets of a poset P, then P has a linear extension L 
putting Y over X if and only if P contains no 2 + 2 with minimal elements in Y and 
maximal elements in X. 

18. $\dim(P) \le w(P)$. A realizer of size w(P) can be formed by taking upper extensions
of the chains in a partition of P into w(P) chains.

19. $\dim(P) \le |P|/2$ when $|P| \ge 4$. The standard example $S_n$ shows that this bound is the best possible.

20. One-point removal theorem: For every $x \in P$, $\dim(P) \le 1 + \dim(P - x)$.

21. For every poset P, there exist four elements {x, y, z, w} such that
$\dim(P) \le 2 + \dim(P - \{x, y, z, w\})$. It is conjectured that, for every poset P, there exist two elements
{x, y} such that $\dim(P) \le 1 + \dim(P - \{x, y\})$.

22. A poset has dimension 2 if and only if the complement of its comparability graph is 
also a comparability graph; thus there is a polynomial time algorithm to decide whether 
a poset has dimension 2. However, recognizing posets of dimension k is NP-complete 
for every fixed k at least 3. 

23. If P is a finite poset that is not a chain, then P has a pair of elements x, y such
that

$0.2764 \approx \frac{5 - \sqrt{5}}{10} \le \Pr(x < y) \le \frac{5 + \sqrt{5}}{10} \approx 0.7236$.

24. The 1/3-2/3 conjecture: This conjecture states that there is always a pair of
elements, x and y, such that $1/3 \le \Pr(x < y) \le 2/3$.

25. The traditional name topological sort (Algorithm 1) is commonly used in applica- 
tions. However, a topological sort is not a sort in the standard meaning of that word. 
Nor is it directly related to what mathematicians call topology. 
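
Algorithm 1 translates almost line for line into Python. The version below is a simple quadratic-time sketch rather than an optimized implementation; the name `topological_sort` and the representation of the order as a set of pairs (x, y) meaning x < y are choices made here for illustration. Cover relations are enough as input, because the elements removed so far always form a down-set.

```python
def topological_sort(elements, less):
    """Return the elements as a list forming a linear extension (bottom to top)."""
    remaining = list(elements)
    order = []
    while remaining:
        # A minimal element has no remaining element strictly below it.
        x = next(e for e in remaining
                 if not any((y, e) in less for y in remaining))
        order.append(x)
        remaining.remove(x)
    return order


if __name__ == "__main__":
    # Hypothetical precedence constraints among four tasks.
    less = {("a", "c"), ("b", "c"), ("b", "d")}
    print(topological_sort("abcd", less))
```

If the input relation contains a cycle, no minimal element exists at some step and `next` raises `StopIteration`, which is a reasonable signal that the input is not a poset.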

Examples:

1. The following poset has the linear extensions abc and acb and it is the intersection 
of these extensions. Thus its dimension is 2. 



2. The poset consisting of two disjoint chains a < b and c < d has six linear extensions:
abcd, acbd, acdb, cabd, cadb, and cdab. Since it is the intersection of abcd and cdab,
its dimension is 2.

3. The bipartite poset $S_3$ whose comparability graph and cover graph is the 6-cycle
1, 2', 3, 1', 2, 3' (where i' denotes the complement of the singleton i) has dimension 3.
The realizer {2 3 1' 1 2' 3', 1 3 2' 2 1' 3', 1 2 3' 3 1' 2'}, with each extension listed from
bottom to top, establishes the upper bound. Every realizer must have an extension with
1' below 1, one with 2' below 2, and one with 3' below 3. No two of these can occur in
the same linear extension, so the dimension is at least three.

4. More generally, for the elements $i \in [n]$ of the standard example $S_n$, a realizer
must include distinct linear extensions in which the singleton {i} appears above its
complement, and any n such extensions suffice.

5. For the standard example $S_n$ of dimension n, the critical pairs are (i, i'), where i'
denotes the complement of {i}: this reflects the fact that, in a realizer, the extensions
need to put i above i', for each i. Each pair of critical pairs forms a minimal alternating
cycle. Viewing the minimal alternating cycles as edges creates a hypergraph, namely the
complete graph $K_n$, with chromatic number n.

6. Let N be the bipartite poset with minimal elements a and b and maximal elements c
and d, in which a lies below c, and b lies below c and d. This poset has five linear
extensions, namely a < b < c < d, a < b < d < c, b < a < c < d, b < a < d < c, and
b < d < a < c. Thus Pr(a < b) = 2/5.

7. Application of posets to sorting: The objective of a sort is to arrange the elements
of a set X into a sequence by posing sequential queries of the form: “is x < y true?”.
At any time, the state of cumulative knowledge is representable by a poset P = (X, R),
such that the linear extensions of P are remaining candidates for the final sequence
order. A desirable query substantially reduces the number of candidates for extensions
no matter whether the answer is yes or no, most especially finding a pair, x and y, such
that Pr(x < y) is close to 1/2. Thus, Fact 23 shows that the worst case time to sort, in
the presence of partial information given by a poset P, is $\Theta(\log e(P))$, where e(P)
denotes the number of linear extensions of P.

8. Application of posets to searching: The objective of searching a poset P in which
item s(x) is stored at location x is to determine whether a target item a is present in P.
Each step of the search probes a location and compares its value against the target item.
The worst case requires determining for each $x \in P$ whether the item at location x is
greater or less than a, so the searching problem is the problem of identifying the downset
$D_a = \{x \in P \mid s(x) < a\}$. A probe of location x splits the remaining possible downsets
into those that contain x and those that do not. The former remain as candidates if
s(x) < a; the latter remain if s(x) > a. A hypothetical adversary would arrange the
value s(x) so that the response would leave the larger portion of the ideals. Thus, the
number c(P) of probes required in the worst case is at least $\lceil \log_2 i(P) \rceil$, where i(P)
denotes the number of ideals in poset P.


11.2.6 POSETS AND GRAPHS


From the graph-theoretic viewpoint, a comparability graph is by definition a simple 
graph (§8.6.3) that has a transitive orientation. Comparability graphs are perfect 
graphs, which motivates most study of comparability graphs. 


Definitions: 

A transitive orientation on a simple graph G is an assignment of directions to the
edges so that whenever there is an xy-arc and a yz-arc, there is also an xz-arc.

A quasi-transitive orientation on a simple graph G is an assignment of directions
to the edges so that whenever there is an xy-arc and a yz-arc, there is also an arc
between x and z.

A triangular chord for a walk $x_1, \ldots, x_k$ in an undirected graph G is an edge between
vertices $x_{i-1}$ and $x_{i+1}$, two apart on the walk.

The auxiliary graph for a simple graph G is the graph G' whose vertices are the
edges of G, with vertex $e_1$ adjacent to vertex $e_2$ in G' if and only if edges $e_1$ and $e_2$ are
adjacent in graph G but do not lie on a cycle.

A module in a graph G is a vertex subset U such that each vertex outside U is adjacent 
to all or none of the vertices in U. 
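
The module condition is straightforward to test on small graphs. The sketch below stores a graph as a dictionary of neighbor sets; the function names, and the helper that builds a complete bipartite graph with or without a deleted perfect matching, are hypothetical illustrations. The search for nontrivial modules is exhaustive and therefore exponential in the number of vertices.

```python
from itertools import combinations


def is_module(adj, subset):
    """Every vertex outside `subset` sees all of it or none of it."""
    subset = set(subset)
    for v in adj:
        if v in subset:
            continue
        hits = len(adj[v] & subset)
        if hits not in (0, len(subset)):
            return False
    return True


def nontrivial_modules(adj):
    """Modules other than the empty set, the singletons, and the whole vertex set."""
    vertices = sorted(adj)
    return [set(c) for size in range(2, len(vertices))
            for c in combinations(vertices, size) if is_module(adj, c)]


def bipartite_graph(n, delete_matching):
    """K_{n,n} on parts {0..n-1} and {n..2n-1}, optionally minus the matching i ~ n+i."""
    adj = {v: set() for v in range(2 * n)}
    for i in range(n):
        for j in range(n):
            if delete_matching and i == j:
                continue
            adj[i].add(n + j)
            adj[n + j].add(i)
    return adj


if __name__ == "__main__":
    print(is_module(bipartite_graph(3, False), {0, 1}))
    print(nontrivial_modules(bipartite_graph(3, True)))
```

The first check takes a subset of one part of a complete bipartite graph; the second searches for nontrivial modules after a perfect matching has been deleted.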

An order module in a poset P (or autonomous set ) is a set S of elements such that 
every element outside S is above all of S, below all of S, or incomparable to all of S. 

A comparability invariant for posets is an invariant / such that f(P) = f(Q) when- 
ever posets P and Q have the same comparability graph. 

A permutation graph is a graph whose vertices can be placed in 1-1 correspondence
with the elements of a permutation of [n] = {1, . . . , n} such that $v_i$ is adjacent to $v_j$ if
and only if the larger of i and j comes first in the permutation.

A u,v-bypass in a directed graph is a u,v-path of length at least two such that there
is also an arc from u to v.

A dependent edge in an acyclic directed graph is an arc from u to v such that the
graph contains a u,v-bypass.

Facts: 

1. For a simple graph G, the following are equivalent: 

• G has a transitive orientation; 

• G has a quasi-transitive orientation; 

• every closed odd walk of G has a triangular chord; 

• the auxiliary graph G' is bipartite. 

The implications from top to bottom are straightforward, as is the proof that if the 
auxiliary graph G' is bipartite then G has a quasi-transitive orientation. The proof 
that if G has a quasi-transitive orientation then G has a transitive orientation takes 
more work. The last characterization gives an algorithm to decide whether G is a
comparability graph in $O(n^3)$ time, where n is the number of vertices. The proof is
constructive, so a transitive orientation can also be obtained.

2. In any graph, the set of all vertices, the singleton sets of vertices, and the empty set
are always modules.

3 . Modules yield a forbidden subgraph characterization of comparability graphs. The 
minimal forbidden induced subgraphs consist of eight infinite families and ten special 
examples. 

4 . If two partial orders have the same comparability graph, then one can be transformed 
into the other by a sequence of moves involving reversing all the relations inside an order 
module S, i.e. , by replacing the partial order induced on S by its dual, and preserving 
all relations between S and its complement. 

5. Let f be a poset invariant such that f(P) = f(P*) for all posets P, and such that, if
poset Q is obtained from P by replacing a module in P with another module having the
same value of f, then f(Q) = f(P). Then the invariant f is a comparability invariant.

6. Height, width, dimension, and number of linear extensions are all comparability 
invariants. 

7. A graph is the complement of a comparability graph if and only if it is the
intersection graph of the curves representing a collection of continuous real-valued functions
on [0, 1].

8. The following conditions are equivalent for a graph G: 

• G is a permutation graph (adjacency representing the inversions of a permuta- 

tion) ; 

• G is the comparability graph of a 2-dimensional partial order; 

• G and its complement $\overline{G}$ are comparability graphs.

9. Isomorphism of permutation graphs can be tested in $O(n^2)$ time. Some NP-complete
scheduling problems become polynomial when the poset of precedence constraints is
2-dimensional.

10 . A directed graph corresponds to the diagram of some partial order if and only if 
it contains no cycles or bypasses. 

11. Every graph that is the cover graph of some poset is triangle-free.

12. If a graph has chromatic number less than its girth, then it is the cover graph
of some poset. In particular, a 3-chromatic graph is a cover graph if and only if it is
triangle-free.

13. It is NP-complete to decide whether a 4-chromatic graph is a covering graph.

14. The smallest triangle-free graph that is not a cover graph is the 4-chromatic
Grötzsch graph with 11 vertices.

15. The maximum number of dependent edges among the orientations of a graph G is
equal to the cycle rank $\beta(G) = |E| - |V| + 1$.

16. If a graph G has chromatic number less than its girth (§8.7.2), then for all i such
that $0 \le i \le \beta(G)$, the graph has an acyclic orientation with exactly i dependent edges.

17. The cover graph of a modular lattice is bipartite.

18. A modular lattice is distributive if and only if its cover graph does not contain the
complete bipartite graph $K_{2,3}$.

19. The subgraph of the cover graph of $2^{2k+1}$ induced by the k-sets and (k + 1)-sets is
a vertex-transitive (k + 1)-regular bipartite graph. The graph is known to contain cycles
using more than 80% of its vertices. The Erdős revolving door conjecture asserts that
this graph is Hamiltonian.

20. Gallai-Milgram theorem: The vertices of a digraph D can be covered using at
most $\alpha(D)$ disjoint paths, where $\alpha(D)$ is the maximum size of an independent set in D.

21. Dilworth’s theorem (§11.2.1) is the special case of the Gallai-Milgram theorem for 
comparability digraphs. 

Examples: 

1. A transitive orientation for a bipartite graph can be obtained by assigning all the 
edge directions from one part to the other, as shown here for K_{3,3}:



2. An odd cycle of length ≥ 5 has no quasi-transitive orientation (see Fact 1).

3. Inserting a triangular chord into a 5-cycle permits the resulting graph to have a 
transitive orientation, as shown in the following figure: 



4. The following figure shows a graph and its auxiliary graph: 



5. Any subset of either part of a complete bipartite graph is a module, since the other 
vertices in its part are not adjacent to any vertex in the module, and the vertices in the 
other part of the bipartition are each adjacent to all the vertices in the module. 

6. Deleting a 1-factor from K_{n,n}, for n ≥ 3, yields a graph with no module other than
the complete set of vertices, the singletons, and the empty set. 
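
Examples 5 and 6 can be checked by brute force on small cases. The following is a minimal sketch, assuming (as Example 5 describes) that a module is a vertex set such that every outside vertex is adjacent to all of its vertices or to none of them. The function names and the vertex labels ("a", i) and ("b", j) are illustrative choices, not notation from the text.

```python
from itertools import combinations

def is_module(adj, subset):
    """True if every vertex outside `subset` is adjacent to all of it or none of it."""
    s = set(subset)
    for v in adj:
        if v in s:
            continue
        hits = len(adj[v] & s)
        if hits not in (0, len(s)):
            return False
    return True

def nontrivial_modules(adj):
    """Modules other than the empty set, the singletons, and the full vertex set."""
    vertices = sorted(adj)
    found = []
    for size in range(2, len(vertices)):
        for subset in combinations(vertices, size):
            if is_module(adj, subset):
                found.append(frozenset(subset))
    return found

def complete_bipartite(n, drop_matching=False):
    """K_{n,n} with parts ('a', i) and ('b', j); optionally delete the 1-factor a_i b_i."""
    adj = {(side, i): set() for side in "ab" for i in range(n)}
    for i in range(n):
        for j in range(n):
            if drop_matching and i == j:
                continue
            adj[("a", i)].add(("b", j))
            adj[("b", j)].add(("a", i))
    return adj

k33 = complete_bipartite(3)
crown3 = complete_bipartite(3, drop_matching=True)
print(len(nontrivial_modules(k33)) > 0)   # subsets of one part are modules
print(nontrivial_modules(crown3))         # []
```

The search is exponential in the number of vertices, so it is only meant for hand-sized graphs such as these.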

INTRODUCTION


In broad terms, the study of combinatorial designs is the study of the structure of 
collections of subsets of a finite set when these collections of subsets satisfy certain 
prescribed properties. In particular, a block design has the property that every one of 
these subsets has the same size k and every pair of points in the set is in exactly the 
same number of these subsets. Latin squares are also fundamental in this area and can 
be thought of in this context, but they are commonly thought of as n x n arrays with 
the property that each cell contains one element from an n-set and each row and each 
column contain each element exactly once. Some of the questions of general interest 
include: existence of designs, enumeration of nonisomorphic designs, and the study of 
subdesigns of designs. 

Matroids generalize a variety of combinatorial objects, such as matrices and graphs. 
These structures arise naturally in a variety of combinatorial contexts and provide a 
framework for the study of many problems in combinatorial optimization and graph 
theory. 

Much of the information in §12.1-12.3 is condensed from [CoDi96], which provides 
a comprehensive treatment of combinatorial designs. The main source for material 
in §12.4 is [Ox92].


GLOSSARY 

affine plane : a set of points and a set of subsets of points (called lines) such that every 
two points lie on exactly one line, if a point does not lie on a line L there is exactly 
one line through the point that does not intersect L, and there are three points that 
are not collinear. 

affine space (of dimension n): the set AG (n,q) of all cosets of subspaces of an n- 
dimensional vector space over a field of order q. 

automorphism (of a design D): an isomorphism from D onto D.

balanced incomplete block design (BIBD): given a finite set X (of points ), a 
collection of subsets (called blocks) of X of the same size such that every point 
belongs to the same number of blocks, and each pair of points belongs to the same 
number of blocks. The BIBD is described by five parameters: size of X, number of 
blocks, number of blocks to which every element of X belongs, size of each block, 
and number of blocks to which each pair of distinct points belongs. 

basis (for a matroid): a maximal independent set in the matroid. 

basis axioms: a set of axioms that specifies the set of bases of a matroid. 

binary matroid : a matroid that is isomorphic to a vector matroid of a matrix over 
the field GF(2).

biplane: symmetric design in which every pair of distinct points belongs to exactly 
two blocks. 

block: each of the subsets in a design. 

circuit: a minimal dependent set in a matroid.

circuit axioms: a set of axioms that specifies the set of circuits of a matroid.

closed set: in a matroid, a subset of its ground set that is equal to its closure.

closed under duality: property of a class of matroids that the dual of a matroid in
the class is also in the class.

closure (of a subset of the ground set in a matroid): given a subset X of the ground
set E in a matroid, the set of all points x ∈ E such that the rank of X ∪ {x} is equal
to the rank of X.

closure axioms: a set of axioms that specifies the properties that a closure operator
of a matroid must have.

closure operation: the mapping K → B(K), where K is a set of positive integers
and B(K) is the set of positive integers v for which there exists a (v, K)-PBD.

cobasis (of a matroid): a basis of the dual of a matroid. 

cocircuit (of a matroid): a circuit of the dual of a matroid. 

cographic matroid: a matroid isomorphic to the cocycle matroid of a graph.

coindependent set (of a matroid): an independent set of the dual of a matroid. 

coloop (of a matroid): a loop of the dual of a matroid. 

combinatorial geometry : a simple matroid. 

complete set of mutually orthogonal latin squares: a set of n - 1 mutually 
orthogonal latin squares of side n. 

conjugate: Let L be an n × n latin square on symbol set E_3, with rows indexed by the
elements of the n-set E_1 and columns indexed by the elements of the n-set E_2. Let
T = {(x_1, x_2, x_3) | L(x_1, x_2) = x_3}. Let {a, b, c} = {1, 2, 3}. The (a, b, c)-conjugate
of L, L_(a,b,c), has rows indexed by E_a, columns by E_b, and symbols by E_c, and is
defined by L_(a,b,c)(x_a, x_b) = x_c for each (x_1, x_2, x_3) ∈ T.

connected: property of a matroid that it cannot be written as the direct sum of two 
nonempty matroids. 

cycle matroid (of a graph): the matroid on the edge-set of the graph whose circuits 
are the cycles of the graph. 

t-design: a t-(v, k, λ) design.

t-(v, k, λ) design: a pair (X, 𝒜) where X is a set of v elements (points), 𝒜 is a family
of k-subsets (blocks) of X, and every t-subset of X occurs in exactly λ blocks.

development (of a difference set D in a group G): the incidence structure dev(D) whose
points are the elements of G and whose blocks are the translates D + g = {d + g | d ∈ D}.

dual (of an incidence structure): the incidence structure obtained by interchanging the 
roles of points and lines. 

dual (of a matroid): given a matroid M, the matroid on the same set as M whose 
bases are the complements of the bases of M. 

equivalent (latin squares): Two latin squares L and L' of side n are equivalent if there 
are three bijections, from the rows, columns, and symbols of L to the rows, columns, 
and symbols, respectively, of L', that map L to L'.

Fano plane (or projective plane of order 2): the (7, 7, 3, 3, 1) design with point set 
X = {0, . . . , 6} and the block set A = {013, 124, 235, 346, 450, 561, 602}. 

flat: closed set.

t-flat: a subspace of projective dimension t of a projective space; a coset of a subspace
of affine dimension t of an affine space.

k-GDD: a group divisible design with λ = 1 and K = {k}.

graphic matroid: a matroid that is isomorphic to the cycle matroid of some graph.

ground set: the set of points of a matroid.

group divisible design (or (K, λ)-GDD): given an integer λ and a set of positive
integers K, a triple (X, 𝒢, 𝒜) where X is a set (of points), 𝒢 is a partition of X into
at least two subsets (called groups), and 𝒜 is a family of subsets of X (called blocks)
such that: if A ∈ 𝒜, then |A| ∈ K; a group and a block contain at most one common
point; and every pair of points from distinct groups occurs in exactly λ blocks.

group-type (or type): for a group divisible design, the multiset { |G| : G ∈ 𝒢 }.

Hadamard design: a symmetric (4n − 1, 2n − 1, n − 1) design.

Hadamard matrix: an n × n matrix H with all entries ±1 that satisfies H^T H = nI.

hyperplane: a subspace of projective dimension n − 1 of a projective space of projective
dimension n; a coset of a subspace of affine dimension n − 1 of an affine space of affine
dimension n; a maximal nonspanning set of a matroid.

idempotent: property of a latin square (or partial latin square) that for all i, cell ( i , i) 
is occupied by i. 

imbedded latin square: An n × n partial latin square P is imbedded in a latin square
L if the upper left n × n corner of L agrees with P.

incidence matrix (of a (v, b, r, k, λ) design): the b × v matrix with (i, j)-entry equal
to 1 if the ith block contains the jth element, and 0 otherwise.

incidence structure: the structure (V, B, ℐ) consisting of a finite set V of points, a
finite set B of lines, and an incidence relation ℐ between them.

independent set: any set in a special collection of subsets of the ground set in a 
matroid. 

index: the number of blocks to which each pair of distinct points in a design belongs. 

isomorphism (of block designs (V, ℬ) and (W, 𝒞)): a bijection ψ: (V, ℬ) → (W, 𝒞)
under which ψ(B) occurs as a block in 𝒞 the same number of times that B occurs
as a block in ℬ.

isomorphism (of matroids): a bijection between the ground sets of two matroids that 
preserves independence. 

isotopic: equivalent. 

Kirkman schoolgirl problem: the problem of arranging 15 schoolgirls in 5 subsets 
of size 3 for a walk on each of 7 days so that every pair of girls walk together exactly 
once. 

Kirkman triple system: a ( v , 3, 1) resolvable BIBD, together with a resolution of it. 

Kronecker product: for an m × p matrix M = (m_ij) and an n × q matrix N = (n_ij), the
mn × pq matrix given by

    ( m_11 N   m_12 N   ...   m_1p N )
    (   :         :              :   )
    ( m_m1 N   m_m2 N   ...   m_mp N )

latin rectangle: a k × n (k < n) array in which each cell contains a single element
from an n-set such that each element occurs exactly once in each row and at most 
once in each column. 

latin square: A latin square of side n is an n x n array in which each entry contains 
a single element from a set S of size n such that each element occurs exactly once 
in each row and exactly once in each column. 

line: a subspace of projective dimension 1 of a projective space; a coset of a subspace 
of affine dimension 1 of an affine space. 

loop: in a matroid, element e of the matroid such that {e} is a circuit. 

matroid: an ordered pair M = (E(M), ℐ(M)) where E (the ground set) is a finite set
and ℐ is a collection of subsets (independent sets) of E such that: the empty set is
independent; every subset of an independent set is independent; and if X and Y are
independent and |X| < |Y|, then there is an element e in Y − X such that X ∪ {e} is
independent.

matroid representable over a field F: a matroid that is isomorphic to the vector
matroid of some matrix over the field F.

multiplier (of a difference set D in a group G): an automorphism φ of G such that
φ(D) = D + g for some g ∈ G.

mutually orthogonal: property of a set of latin squares that every two are orthogonal. 

orthogonal: property of two n × n latin squares A = (a_ij) and B = (b_ij) that all n^2
ordered pairs (a_ij, b_ij) are distinct.

orthogonal array (of size N with k constraints, s levels, and strength t): a k × N
array with entries from a set of s ≥ 2 symbols, having the property that in every
t × N submatrix every t × 1 column vector appears the same number of times.

pairwise balanced design (PBD): for a set K of positive integers, a design (v, K, λ)
consisting of an ordered pair (X, 𝒜) where X is a set of size v and 𝒜 is a collection of
subsets of X with the property that every pair of elements of X occurs in exactly λ
blocks, and for every block A ∈ 𝒜, |A| ∈ K; a pairwise balanced design is called a
(v, K)-PBD when λ = 1.

parallel class: a collection of blocks that partition the point set of a design. 

parallel elements: in a matroid, two elements that form a circuit. 

partial latin square: an n x n array with cells, each of which is either empty or else 
contains exactly one symbol, such that no symbol occurs more than once in any row 
or column. 

partial transversal (of length k): in a latin square, a set of k cells, each from a 
different row and each from a different column, such that no two contain the same 
symbol. 

paving matroid: a matroid such that the number of elements in every circuit is at 
least as large as the rank of the matroid. 

PBD-closure: for a set K of positive integers, the set B(K) = { v | there exists a
(v, K)-PBD }.

planar: property of a matroid that it is isomorphic to the cycle matroid of a planar 
graph. 

plane : a subspace of projective dimension 2 of a projective space; a coset of a subspace 
of affine dimension 2 of an affine space. 

projective plane: a finite set (of points) and a set of subsets of points (called lines)
such that every two points lie on exactly one line, every two lines intersect in exactly
one point, and there are four points with no three collinear; equivalently, a symmetric
(n^2 + n + 1, n + 1, 1) design.

projective space (of dimension n): for a field F of order q and an (n + 1)-dimensional
vector space S over F, the set PG(n, q) of all subspaces of S.

rank (of a matroid): the rank of the ground set of the matroid. 

rank (of a set in a matroid): the cardinality of every maximal independent subset of 
the set. 

rank axioms: a set of axioms that specifies the properties that a rank function on a 
matroid must have. 

reduced latin square: a latin square such that the elements in the first row and the 
elements in the first column occur in natural order. 

regular matroid: a matroid that is representable over all fields. 

replication number: the number of blocks to which each point in a design belongs. 

representable over a field : property of a matroid that it is isomorphic to a vector 
matroid of some matrix over the field. 

resolution: a partition of the family of blocks of a balanced incomplete block design 
into parallel classes. 

resolvable: the property of a balanced incomplete block design that it has at least one 
resolution. 

simple matroid: a matroid that has no loops or parallel elements. 

simple (t-design): a t-design that contains no repeated blocks.

spanning set (of a matroid): for a matroid M, a subset of the ground set E of 
rank r(M). 

Steiner triple system: a balanced incomplete block design in which each block has 3 
elements and each pair of points occurs in exactly 1 block; that is, a (v, 3, 1) design. 

subdesign: a collection of points and blocks in a block design that is itself a block 
design. 

subsquare: for k < n, a latin square of side k whose rows and columns are chosen 
from a latin square of side n. 

symmetric block design: a (v, b, r, k, λ) design where the number of points (v) equals
the number of blocks (b).

ternary matroid: a matroid that is isomorphic to a vector matroid of a matrix 
over GF( 3). 

transversal design: a k-GDD having k groups of size n and uniform block size k.

transversal matroid: given a family of sets, the matroid whose independent sets are 
partial transversals of this family. 

transversal: in a latin square of side n, a set of n cells, one from each row and column,
containing each of the n symbols exactly once. 

type: See group-type. 

uniform matroid: the matroid with 1, 2, . . . , n as ground set, and all subsets of size
less than a specified number as independent sets.

(v, k, λ) design: a BIBD with parameters (v, b, r, k, λ).

(v, k, λ; n) difference set (of order n = k − λ): a k-subset D of a group G (of order v)
where every nonzero element of G occurs exactly λ times as a difference d − d' of
elements d, d' of D.

vector matroid : the matroid on the columns of a matrix whose independent sets are 
the linearly independent sets of columns. 

void design: a BIBD with at most one element. 


12.1 BLOCK DESIGNS 


12.1.1 BALANCED INCOMPLETE BLOCK DESIGNS 


Definitions: 

A balanced incomplete block design (BIBD) with parameters (v, b, r, k, λ) is a
pair (X, 𝒜), where X is a set, 𝒜 is a collection of subsets of X, the five parameters
are nonnegative integers, either v ∈ {0, 1} (the void designs) or v > k > 0, and the
parameters represent the following:

• v (order): the size of X (elements of X are points, varieties, or treatments);

• b (block number): the number of elements of 𝒜 (elements of 𝒜 are blocks);

• r (replication number): the number of blocks to which every point belongs;

• k (block size): the common size of each block;

• λ (index): the number of blocks to which every pair of distinct points belongs.

Note: A BIBD is often referred to as a design. Different notations are used for balanced
incomplete block designs: (v, b, r, k, λ) BIBD, (v, k, λ) BIBD, and S_λ(2, k, v). In this
chapter (v, k, λ) design will be used. See Fact 6.

A Steiner triple system is a (v, v(v − 1)/6, (v − 1)/2, 3, 1) design, i.e., a BIBD in which each
block has size 3 and each pair of points occurs in exactly one block. A Steiner triple
system is denoted STS(v) or S(2, 3, v). (Jakob Steiner, 1796-1863)

The incidence matrix of a (v, b, r, k, λ) design is the b × v matrix A = (a_ij) defined by
a_ij = 1 if the ith block contains the jth point, and a_ij = 0 otherwise.


Facts: 

1. Balanced incomplete block designs are used in the design of experiments when the 
total number (v) of objects to be tested is greater than the number (k) that can be 
tested at any one time. They are used to design experiments where the subjects must 
be divided into subsets (blocks) of the same size to receive different treatments, such 
that each subject is tested the same number of times and every pair of subjects appears 
in the same number of subsets. 

2. Designs are useful in many areas, such as coding theory, cryptography, group testing,
and tournament scheduling. Detailed coverage of these and other applications of designs
can be found in Chapter V of [CoDi96].

3. The word “balanced” refers to the fact that λ remains constant. If λ changes
depending on the pair of points chosen, the design is not balanced.

4. The word “incomplete” refers to the fact that k < v, that is, the size of each block 
is less than the number of varieties. 

5. Necessary conditions for existence: If there is a (v, b, r, k, λ) design for particular v,
b, r, k, and λ, then the parameters must satisfy:

• vr = bk;

• λ(v − 1) = r(k − 1);

• b ≥ v. (Fisher’s inequality, 1940) (Ronald A. Fisher, 1890-1962)

6. If a (v, b, r, k, λ) design exists, then r = λ(v − 1)/(k − 1) and b = λv(v − 1)/(k(k − 1)).
In view of these two relationships, (v, b, r, k, λ) designs are commonly referred to simply
by the three parameters v, k, and λ, as a (v, k, λ) design.

7. Necessary conditions for existence: If there is a (v, k, λ) design for particular v, k,
and λ, then:

• λ(v − 1) ≡ 0 (mod k − 1);

• λv(v − 1) ≡ 0 (mod k(k − 1)).

8. Existence of (v, k, λ) designs:

• (v, 3, λ) design: exists for all v satisfying the necessary conditions given in
Fact 5, namely:

◦ if λ ≡ 2 or 4 (mod 6) and v ≡ 0 or 1 (mod 3)

◦ if λ ≡ 1 or 5 (mod 6) and v ≡ 1 or 3 (mod 6)

◦ if λ ≡ 3 (mod 6) and v ≡ 1 (mod 2)

◦ if λ ≡ 0 (mod 6) and v ≠ 2;

• (v, 4, λ) design: exists for all v and λ satisfying the necessary conditions
given in Fact 5;

• (v, 5, λ) design: exists for all v satisfying the necessary conditions given in
Fact 5 except for the case v = 15, λ = 2;

• (v, 6, λ) design: exists for all v satisfying the necessary conditions given in
Fact 5, if λ > 1;

• (v, 6, 1) design: exists for all v ≡ 1 or 6 (mod 15), v ≥ 31, v ≠ 36, with 56
possible exceptions, the largest being 2241. (The first few open cases are 46,
51, 61, 81, and 141.)

9. Existence of Steiner triple systems: A Steiner triple system with v points exists if
and only if v ≡ 1 or 3 (mod 6). (Kirkman)

10. Wilson’s asymptotic existence theorem: Given k and λ, there exists a v_0(k, λ) such
that a (v, k, λ) design exists for all v > v_0(k, λ) that satisfy the necessary conditions
given in Fact 5 and make b and r integral. It is known that v_0(k, λ) < exp(exp(k^k)).

11. Assume that (V, ℬ) is a (v, k, λ) design. Let ℬ' = {V − B | B ∈ ℬ}. Then (V, ℬ')
is a (v, v − k, λ(v − k)(v − k − 1)/(k(k − 1))) design, the complement of (V, ℬ).


© 2000 by CRC Press LLC 



12. Given two Steiner triple systems with v_1 and v_2 points, respectively, a Steiner triple
system with v_1 v_2 points can be constructed as follows: Let an STS(v_1) be defined on
the point set {x_1, . . . , x_{v_1}} and an STS(v_2) be defined on the point set {y_1, . . . , y_{v_2}}.
Define an STS(v_1 v_2) on the point set { z_ij | 1 ≤ i ≤ v_1, 1 ≤ j ≤ v_2 } where z_mn z_pq z_rs is
a block in STS(v_1 v_2) if and only if one of the following holds:

• m = p = r and y_n y_q y_s is a block in STS(v_2);

• n = q = s and x_m x_p x_r is a block in STS(v_1);

• x_m x_p x_r is a block in STS(v_1) and y_n y_q y_s is a block in STS(v_2).

13 . The following table lists different types of block designs and their features. 


name                                                            block size   size of subset covered   # times covered   other properties

balanced incomplete block design BIBD §12.1.1                   k            2                        λ
pairwise balanced design PBD §12.1.6                            various      2                        λ                 also called a linear space if λ = 1
Steiner triple system STS §12.1.1, §12.1.5                      3            2                        1
Kirkman triple system KTS §12.1.4                               3            2                        1                 resolvable
resolvable balanced incomplete block design RBIBD §12.1.4       k            2                        λ                 resolvable
projective plane PG(2, q) §12.2.3                               q + 1        2                        1                 #points = #blocks = q^2 + q + 1
affine plane AG(2, q) §12.2.3                                   q            2                        1                 resolvable
symmetric design SBIBD §12.2.2                                  k            2                        λ                 #points = #blocks
t-design t-(v, k, λ) §12.1.5                                    k            t ≥ 2                    λ
Steiner system S(t, k, v) §12.1.5                               k            t ≥ 2                    1
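
The product construction of Fact 12 can be sketched in a few lines. This is a minimal illustration, assuming points are labelled from 0 rather than 1 and the point z_ij is represented by the pair (i, j); the names `product_sts` and `pairs_covered_once` are hypothetical helpers, not notation from the text.

```python
from itertools import combinations, permutations

def product_sts(v1, blocks1, v2, blocks2):
    """Blocks of an STS(v1*v2) on points (i, j), 0 <= i < v1, 0 <= j < v2."""
    blocks = set()
    for i in range(v1):                      # m = p = r, y-triple is a block of STS(v2)
        for n, q, s in blocks2:
            blocks.add(frozenset({(i, n), (i, q), (i, s)}))
    for j in range(v2):                      # n = q = s, x-triple is a block of STS(v1)
        for m, p, r in blocks1:
            blocks.add(frozenset({(m, j), (p, j), (r, j)}))
    for m, p, r in blocks1:                  # both index triples are blocks
        for b2 in blocks2:
            for n, q, s in permutations(b2):
                blocks.add(frozenset({(m, n), (p, q), (r, s)}))
    return blocks

def pairs_covered_once(points, blocks):
    """True if every pair of points lies in exactly one block."""
    seen = [frozenset(p) for b in blocks for p in combinations(b, 2)]
    return len(seen) == len(set(seen)) == len(points) * (len(points) - 1) // 2

sts3 = [(0, 1, 2)]
fano = [(0, 1, 3), (1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 0), (5, 6, 1), (6, 0, 2)]
sts21 = product_sts(3, sts3, 7, fano)
points21 = [(i, j) for i in range(3) for j in range(7)]
print(len(sts21), pairs_covered_once(points21, sts21))   # 70 True
```

The third case contributes six blocks for each pair of input blocks, one for each way of matching the three x-indices with the three y-indices.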



Examples: 

1. The following is a (4, 4, 3, 3, 2) design: X = {a, b, c, d}, blocks {abc, abd, acd, bcd}.

2. Affine plane of order 3 (a (9, 3, 1) design): The point set is X = {0, . . . , 8} and
the block set is 𝒜 = {012, 345, 678, 036, 147, 258, 048, 156, 237, 057, 138, 246}. Also see
§12.2.3. This design is known as AG(2, 3). This is a Steiner triple system.

3. Each of the following is a Steiner triple system (a (v, 3, 1) design). In each of the
following a set of base blocks B_i = {b_i1, b_i2, b_i3} in the group Z_v is given. To get all the
blocks of the design, take all distinct translates B_i + g = {b_i1 + g, b_i2 + g, b_i3 + g}, for
all g ∈ Z_v, for each of the base blocks B_i.

v = 7: {0, 1, 3} (mod 7) [Fano plane]

v = 15: {0, 1, 4} {0, 2, 8} {0, 5, 10} (mod 15) [The last base block has only 5 (= v/3)
distinct translates. This is a short orbit and occurs for all orders v ≡ 3 (mod 6).]

v = 19: {0, 1, 4} {0, 2, 9} {0, 5, 11} (mod 19)

v = 21: {0, 1, 3} {0, 4, 12} {0, 5, 11} {0, 7, 14} (mod 21) 

v = 25: {0, 1, 3} {0, 4, 11} {0, 5, 13} {0, 6, 15} (mod 25) 

v = 27: {0, 1, 3} {0, 4, 11} {0, 5, 15} {0, 6, 14} {0, 9, 18} (mod 27) 

v = 31: {0, 1, 3} {0, 4, 11} {0, 5, 15} {0, 6, 18} {0, 8, 17} (mod 31) 

v = 33: {0, 1, 3} {0, 4, 10} {0, 5, 18} {0, 7, 19} {0, 8, 17} {0, 11, 22} (mod 33) 

v = 37: {0, 1, 3} {0, 4, 9} {0, 6, 21} {0, 7, 18} {0, 8, 25} {0, 10, 24} (mod 37) 

v = 39: {0, 1, 3} {0, 4, 9} {0, 6, 20} {0, 7, 18} {0, 8, 23} {0, 10, 22} {0, 13, 26} (mod 39). 

4. Fano plane or projective plane of order 2, PG(2,2): A (7, 7, 3,3,1) design with 
point set X = {0, ... , 6} and block set A = {013, 124, 235, 346, 450, 561, 602}, shown in 
the following figure. (Often, as here, a block {a, b, c} is written as abc.) Also see §12.2.3. 
(Gino Fano, 1871-1952) 



The incidence matrix of the Fano plane is

    ( 1 1 0 1 0 0 0 )
    ( 0 1 1 0 1 0 0 )
    ( 0 0 1 1 0 1 0 )
    ( 0 0 0 1 1 0 1 )
    ( 1 0 0 0 1 1 0 )
    ( 0 1 0 0 0 1 1 )
    ( 1 0 1 0 0 0 1 )
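
The examples of this section can be checked mechanically. The sketch below develops base blocks over Z_v as in Example 3, recovers the parameters (v, b, r, k, λ) by direct counting, and builds the incidence matrix with rows indexed by blocks. The function names are illustrative assumptions; nothing here is taken from a particular library.

```python
from itertools import combinations
from collections import Counter

def develop(base_blocks, v):
    """All distinct translates B + g (mod v) of the given base blocks."""
    blocks = set()
    for base in base_blocks:
        for g in range(v):
            blocks.add(frozenset((x + g) % v for x in base))
    return blocks

def bibd_parameters(points, blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD on `points`, else None."""
    blocks = [frozenset(b) for b in blocks]
    sizes = {len(b) for b in blocks}
    point_counts = Counter(x for b in blocks for x in b)
    pair_counts = Counter(frozenset(p) for b in blocks
                          for p in combinations(sorted(b), 2))
    reps = {point_counts[x] for x in points}
    lams = {pair_counts[frozenset(p)] for p in combinations(sorted(points), 2)}
    if len(sizes) != 1 or len(reps) != 1 or len(lams) != 1:
        return None
    return (len(points), len(blocks), reps.pop(), sizes.pop(), lams.pop())

def incidence_matrix(points, blocks):
    """b x v 0/1 matrix: entry (i, j) is 1 when block i contains point j."""
    return [[1 if x in b else 0 for x in points] for b in blocks]

fano = develop([(0, 1, 3)], 7)
sts15 = develop([(0, 1, 4), (0, 2, 8), (0, 5, 10)], 15)
print(bibd_parameters(range(7), fano))     # (7, 7, 3, 3, 1)
print(bibd_parameters(range(15), sts15))   # (15, 35, 7, 3, 1)
print(bibd_parameters("abcd", ["abc", "abd", "acd", "bcd"]))   # (4, 4, 3, 3, 2)
```

For the STS(15), the short orbit of {0, 5, 10} contributes 5 blocks, which gives b = 15 + 15 + 5 = 35 in agreement with b = v(v − 1)/6.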


12.1.2 ISOMORPHISM AND AUTOMORPHISM


Definitions:

Two designs (V, ℬ) and (W, 𝒞) are isomorphic if there is a bijection ψ: V → W under
which ψ(B) = { ψ(x) | x ∈ B } occurs as a block in 𝒞 the same number of times that B
occurs as a block in ℬ. Such a bijection is an isomorphism.

An automorphism of a design D is an isomorphism from D onto D. 

The automorphism group of a design D is the set of all automorphisms for D with 
composition as the group operation. 

Facts: 

1. Nonisomorphic Steiner triple systems of order v have been enumerated for v ≤ 15.
Up to isomorphism, there are unique designs of order 3, 7, and 9; there are precisely
two nonisomorphic designs of order 13, and 80 of order 15. At that point, an explosion
occurs: the number of nonisomorphic STS(19) exceeds 2,000,000.

2. The number of nonisomorphic STS(v) is at least (e^{−5} v)^{v^2/6} for large v. (Wilson)

3. Table 1 lists the parameter sets (v, k, λ) that satisfy the necessary conditions for
the existence of a block design, with r ≤ 15 and 3 ≤ k ≤ v/2. The parameter sets are
ordered lexicographically across the rows of the table by r, k, and λ (in this order).
The column N contains the number of pairwise nonisomorphic (v, k, λ) designs or the
best known lower bound for this number. A “?” indicates that no design with these
parameters is known to exist, but that existence has not been ruled out.
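
The parameter sets described in Fact 3 can be enumerated directly from the necessary conditions of §12.1.1. The following is a sketch under the assumption that "necessary conditions" means integrality of v and b together with Fisher's inequality (which, given vr = bk, is the same as r ≥ k); the function name `admissible_parameters` is illustrative.

```python
def admissible_parameters(r_max):
    """(v, k, lam) with 3 <= k <= v/2 and r <= r_max, ordered by r, then k, then lam."""
    found = []
    for r in range(3, r_max + 1):
        for k in range(3, r + 1):            # Fisher: b >= v is equivalent to r >= k
            for lam in range(1, r):
                if (r * (k - 1)) % lam:      # lam(v - 1) = r(k - 1) needs integral v
                    continue
                v = r * (k - 1) // lam + 1
                if 2 * k > v:                # keep only k <= v/2
                    continue
                if (v * r) % k:              # vr = bk needs integral b
                    continue
                found.append((v, k, lam))
    return found

print(admissible_parameters(5))
# [(7, 3, 1), (9, 3, 1), (13, 4, 1), (6, 3, 2), (16, 4, 1), (21, 5, 1), (11, 5, 2)]
```

These seven triples are the first seven entries of Table 1. The enumeration says nothing about existence: a triple that passes these tests may still have N = 0.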


12.1.3 SUBDESIGNS


Definition:

Let Y be a subset of w points in a (v, k, λ) design. If every block of the BIBD contains
0, 1, or k of the points in Y, then a (w, k, λ) design is obtained by taking those blocks
that contain k points from Y. This BIBD on w points is a subdesign, called a (w, k, λ)
subdesign.

Facts:

1. If there is a (v, k, 1) design containing a (w, k, 1) subdesign, then v ≥ (k − 1)w + 1.
(The parameter lists (v, k, 1) and (w, k, 1) must satisfy the necessary conditions of
§12.1.1 Fact 6.)

2. In the cases k = 3 and k = 4, the necessary conditions of §12.1.1 Fact 5 for the
presence of a subdesign are sufficient. That is, in the case of k = 3, for all v ≥ 2w + 1,
with both v, w ≡ 1 or 3 (mod 6), there exists a (v, 3, 1) design that contains a (w, 3, 1)
subdesign. In the case k = 4, for all v ≥ 3w + 1, with both v, w ≡ 1 or 4 (mod 12), there
exists a (v, 4, 1) design that contains a (w, 4, 1) subdesign.

Example: 

1. A construction for a Steiner triple system of order 2v + 1 given a Steiner triple system
of order v: A variant of this construction dates back at least to Thomas P. Kirkman
in 1847. The original STS(v) is a subdesign of the resulting STS(2v + 1).

Let (X, 𝒜) be an STS(v) with X = {x_0, x_1, . . . , x_{v−1}}. For each i = 0, 1, . . . , v − 1,
let F_i = { {x + i, −x + i} | x ∈ Z_v, x ≠ 0 } ∪ { {i, ∞} }. Then for each i = 0, 1, . . . , v − 1,
construct the triples {a, b, x_i} where {a, b} ∈ F_i. The set of all such triples in addition to
the original triples in 𝒜 is the desired STS(2v + 1) on the point set X ∪ {0, 1, . . . , v − 1, ∞}.

For v = 7, the following STS(15) is obtained. The last row of triples is an STS(7).

{0, ∞, x_0} {1, 6, x_0} {2, 5, x_0} {3, 4, x_0}

{1, ∞, x_1} {2, 0, x_1} {3, 6, x_1} {4, 5, x_1}

{2, ∞, x_2} {3, 1, x_2} {4, 0, x_2} {5, 6, x_2}

{3, ∞, x_3} {4, 2, x_3} {5, 1, x_3} {6, 0, x_3}

{4, ∞, x_4} {5, 3, x_4} {6, 2, x_4} {0, 1, x_4}

{5, ∞, x_5} {6, 4, x_5} {0, 3, x_5} {1, 2, x_5}

{6, ∞, x_6} {0, 5, x_6} {1, 4, x_6} {2, 3, x_6}

{x_0, x_1, x_3} {x_1, x_2, x_4} {x_2, x_3, x_5} {x_3, x_4, x_6} {x_4, x_5, x_0} {x_5, x_6, x_1} {x_6, x_0, x_2}.
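
The construction in this example is short enough to run. The sketch below is one way to code it, assuming the old points x_i are represented as the tuples ("x", i), the new points as the integers 0, . . . , v − 1, and ∞ as the string "inf"; these labels and the function names are illustrative only.

```python
from itertools import combinations

INF = "inf"  # stand-in label for the point at infinity

def double_sts(v, triples):
    """Build an STS(2v+1) from an STS(v) given as triples on 0..v-1 (v odd)."""
    blocks = [frozenset(("x", p) for p in t) for t in triples]
    for i in range(v):
        # F_i: the pairs {x+i, -x+i} for x != 0, together with the pair {i, INF}
        factor = {frozenset({(x + i) % v, (-x + i) % v}) for x in range(1, v)}
        factor.add(frozenset({i, INF}))
        for pair in factor:
            blocks.append(pair | {("x", i)})
    return blocks

def is_sts(points, blocks):
    """True if all blocks have size 3 and every pair of points is in exactly one block."""
    if any(len(b) != 3 for b in blocks):
        return False
    seen = [frozenset(p) for b in blocks for p in combinations(b, 2)]
    return len(seen) == len(set(seen)) == len(points) * (len(points) - 1) // 2

fano = [(0, 1, 3), (1, 2, 4), (2, 3, 5), (3, 4, 6), (4, 5, 0), (5, 6, 1), (6, 0, 2)]
sts15 = double_sts(7, fano)
points15 = [("x", i) for i in range(7)] + list(range(7)) + [INF]
print(len(sts15), is_sts(points15, sts15))   # 35 True
```

Because v is odd, each pair {a, b} of new points lies in exactly one F_i, namely the one with 2i ≡ a + b (mod v), which is why every pair is covered once.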




Table 1 (v, b, r, k, λ) designs with r ≤ 15.

  v   k  λ  N                v   k  λ  N                v   k  λ  N

  7   3  1  1                9   3  1  1               13   4  1  1
  6   3  2  1               16   4  1  1               21   5  1  1
 11   5  2  1               13   3  1  2                7   3  2  4
 10   4  2  3               25   5  1  1               31   6  1  1
 16   6  2  3               15   3  1  80               8   4  3  4
 15   5  2  0               36   6  1  0               43   7  1  0
 22   7  2  0               15   7  3  5                9   3  2  36
 25   4  1  18              13   4  2  2,461            9   4  3  11
 21   6  2  0               49   7  1  1               57   8  1  1
 29   8  2  0               19   3  1  ≥1.1×10^9       10   3  2  960
  7   3  3  10              28   4  1  ≥145            10   5  4  21
 46   6  1  ?               16   6  3  18,920          28   7  2  7
 64   8  1  1               73   9  1  1               37   9  2  4
 25   9  3  78              19   9  4  6               21   3  1  ≥2×10^6
  6   3  4  4               16   4  2  22,859          41   5  1  ≥5
 21   5  2  ≥35             11   5  4  4,393           51   6  1  ?
 21   7  3  3,809           36   8  2  0               81   9  1  7
 91  10  1  4               46  10  2  0               31  10  3  151
 12   3  2  ≥10^6           12   4  3  ≥17,172,470     45   5  1  ≥16
 12   6  5  116,034         45   9  2  ≥11            100  10  1  0
111  11  1  0               56  11  2  ≥5              23  11  5  1,102
 25   3  1  ≥10^14          13   3  2  ≥92,714          9   3  3  22,521
  7   3  4  35              37   4  1  ≥3              19   4  2  ≥423
 13   4  3  ≥3,702          10   4  4  ≥1,759,613      25   5  2  ≥28
 61   6  1  ?               31   6  2  ≥72             21   6  3  ≥1
 16   6  4  ≥111            13   6  5  ≥2,572,156      22   8  4  ?
 33   9  3  ≥3,375          55  10  2  0              121  11  1  ≥1
133  12  1  ≥1              67  12  2  0               45  12  3  ≥3,752
 34  12  4  0               27   3  1  ≥10^11          40   4  1  ≥10^6
 66   6  1  ≥1              14   7  6  ≥17,896         27   9  4  ≥8,071
 40  10  3  7               66  11  2  ≥2             144  12  1  ?
157  13  1  ?               79  13  2  ≥2              53  13  3  0
 40  13  4  ≥389            27  13  6  208,310         15   3  2  ≥685,521
 22   4  2  ≥7,921           8   4  6  2,310           15   5  4  ≥103
 36   6  2  ≥5              15   6  5  ≥117            85   7  1  ?
 43   7  2  ≥4              29   7  3  ≥1              22   7  4  ≥34
 15   7  6  ≥57,810         78  12  2  0              169  13  1  ≥1
183  14  1  ≥1              92  14  2  0               31   3  1  ≥6×10^16
 16   3  2  ≥10^13          11   3  3  ≥436,800         7   3  5  109
  6   3  6  6               16   4  3  ≥6×10^13        61   5  1  ≥10
 31   5  2  ≥1              21   5  3  ≥10^9           16   5  4  ≥11
 13   5  5  ≥30             11   5  6  ≥127            76   6  1  ≥1
 26   6  3  ≥1              16   6  5  ≥15             91   7  1  ≥2
 16   8  7  ≥9×10^7         21   9  6  ≥10^4          136  10  1  7
 46  10  3  7               28  10  5  ≥3              56  12  3  ≥4
 91  13  2  0              196  14  1  0              211  15  1  0
106  15  2  0               71  15  3  ≥8              43  15  5  0
 36  15  6  ≥25,634

12.1.4 RESOLVABLE DESIGNS


Definitions:

A parallel class is a collection of blocks that partition the point set.

A resolution of a BIBD is a partition of the family of blocks into parallel classes. A
resolution contains exactly r parallel classes.

A BIBD is resolvable, denoted RBIBD, if it has at least one resolution.

A (v, 3, 1) RBIBD, together with a resolution of it, is a Kirkman triple system,
written KTS(v).


Facts:

1. Necessary conditions for existence of a (v, k, λ) RBIBD are

• λ(v − 1) ≡ 0 (mod (k − 1));

• v ≡ 0 (mod k).

2. If a (v, k, λ) RBIBD exists, then b ≥ v + r − 1 where b is the number of blocks.
When b = v + r − 1 (or equivalently, r = k + λ) the RBIBD has the property that two
nonparallel lines intersect in exactly k^2/v points. (R. C. Bose, 1901-1987)

3. A KTS(v) exists if and only if v ≡ 3 (mod 6).

4. The following table summarizes the current state of knowledge concerning the
existence of resolvable designs.

For the values of k and λ given, the number of parameter sets (v, k, λ) satisfying
all necessary conditions for the existence of a resolvable (v, k, λ) design for which the
existence of a resolvable (v, k, λ) design is not known is given under the column headed
“exceptions”. The column headed “largest possible exception” gives the largest v
satisfying the necessary conditions for the existence of a resolvable (v, k, λ) design for which
a resolvable (v, k, λ) design is not known.

k    λ     exceptions    largest possible exception

3    1     none
3    2     6
4    1     none
4    3     none
5    1                   4,965
5    2     15            50,722,390
5    4     10            195
6    5                   3,042
6    10    none
7    6     14            33,936
8    1                   24,480
8    7                   2,928

Example:

1. Kirkman’s schoolgirl problem: In 1850, Kirkman posed the following: fifteen young
ladies in a school walk out three abreast for seven days in succession; it is required to
arrange them daily, so that no two walk twice abreast. (Thomas P. Kirkman, 1806-1895)
This is equivalent to finding a resolution of some (15, 3, 1) design (or a KTS(15)).
The following is a solution to Kirkman’s schoolgirl problem:


Monday     Tuesday    Wednesday  Thursday   Friday     Saturday   Sunday
9,10,12    10,11,13   11,12,14   12,13,15   13,14,9    14,15,10   15,9,11
15,8,1     9,8,2      10,8,3     11,8,4     12,8,5     13,8,6     14,8,7
13,2,7     14,3,1     15,4,2     9,5,3      10,6,4     11,7,5     12,1,6
11,3,6     12,4,7     13,5,1     14,6,2     15,7,3     9,1,4      10,2,5
14,4,5     15,5,6     9,6,7      10,7,1     11,1,2     12,2,3     13,3,4
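The solution to Kirkman's schoolgirl problem can be checked mechanically. The following Python sketch confirms that each day is a parallel class on the points 1, ..., 15 and that every pair of points walks together exactly once, so that the table is a KTS(15). The names SCHEDULE, is_parallel_class, pair_counts, and is_kirkman_solution are illustrative and do not come from the Handbook.

```python
from collections import Counter
from itertools import combinations

POINTS = set(range(1, 16))

# One entry per day; each day lists the five triples from the table.
SCHEDULE = {
    "Monday":    [(9, 10, 12), (15, 8, 1), (13, 2, 7), (11, 3, 6), (14, 4, 5)],
    "Tuesday":   [(10, 11, 13), (9, 8, 2), (14, 3, 1), (12, 4, 7), (15, 5, 6)],
    "Wednesday": [(11, 12, 14), (10, 8, 3), (15, 4, 2), (13, 5, 1), (9, 6, 7)],
    "Thursday":  [(12, 13, 15), (11, 8, 4), (9, 5, 3), (14, 6, 2), (10, 7, 1)],
    "Friday":    [(13, 14, 9), (12, 8, 5), (10, 6, 4), (15, 7, 3), (11, 1, 2)],
    "Saturday":  [(14, 15, 10), (13, 8, 6), (11, 7, 5), (9, 1, 4), (12, 2, 3)],
    "Sunday":    [(15, 9, 11), (14, 8, 7), (12, 1, 6), (10, 2, 5), (13, 3, 4)],
}


def is_parallel_class(blocks, points):
    """True if the blocks partition the point set."""
    seen = [p for block in blocks for p in block]
    return len(seen) == len(points) and set(seen) == set(points)


def pair_counts(schedule):
    """Count how often each unordered pair of points shares a block."""
    counts = Counter()
    for blocks in schedule.values():
        for block in blocks:
            counts.update(frozenset(pair) for pair in combinations(block, 2))
    return counts


def is_kirkman_solution(schedule, points):
    """Every day is a parallel class and every pair occurs exactly once."""
    counts = pair_counts(schedule)
    all_pairs = [frozenset(pair) for pair in combinations(sorted(points), 2)]
    return (all(is_parallel_class(b, points) for b in schedule.values())
            and all(counts[pair] == 1 for pair in all_pairs))
```

Calling is_kirkman_solution(SCHEDULE, POINTS) exercises both conditions; the seven days correspond to the r = 7 parallel classes of the resolution.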


12.1.5 t-DESIGNS AND STEINER SYSTEMS


Definitions: 

A t-(v, k, λ) design (also denoted S_λ(t, k, v) and written t-design) is a pair (X, 𝒜)
that satisfies the properties:

• X is a set of v elements (called points);

• 𝒜 is a family of subsets (blocks) of X, each of cardinality k;

• every t-subset of distinct points occurs in exactly λ blocks.


A t-design is simple if it contains no repeated blocks.

A Steiner system is a t-(v, k, 1) design.

A Steiner triple system, denoted STS(v), is a (v, 3, 1) design. (See §12.1.1.)
A Steiner quadruple system, denoted SQS(v), is a 3-(v, 4, 1) design.


Facts: 

1. A (v, k, λ) design (a BIBD) is a 2-(v, k, λ) design.

2. If s < t, then a t-(v, k, λ) design is also an s-(v, k, μ) design, where
μ = λ C(v − s, t − s) / C(k − s, t − s) and C(n, m) denotes the binomial coefficient.

3. t-(v, k, λ) designs exist for all t. A t-(v, t + 1, ((t + 1)!)^(2t+1)) design exists if v ≥ t + 1
and v ≡ t (mod ((t + 1)!)^(2t+1)). (Teirlinck)

4. If a t-(v, k, λ) design exists, where t = 2s is even, then the number of blocks b ≥ C(v, s).
(This generalizes Fisher’s inequality, §12.1.1, Fact 5.)

5. When λ = 1, t-designs are known only for t ≤ 5. Construction of a 6-(v, k, 1) design
remains one of the outstanding open problems in the study of t-designs.

6. Much less is known about the existence of t-(v, k, λ) designs with t ≥ 3 compared
to BIBDs:

• For t = 3, several infinite families are known.

• For every prime power q and d ≥ 2, there exists a 3-(q^d + 1, q + 1, 1) design,
known as an inversive geometry. When d = 2, these designs are known as
inversive planes.

• A 3-(v, 4, 1) design (Steiner quadruple system) exists if and only if v ≡ 2 or
4 (mod 6).
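The reduction from a t-design to an s-design with s < t amounts to one line of arithmetic: an s-subset lies in λ C(v − s, t − s) / C(k − s, t − s) blocks. The sketch below, in which the function name derived_lambda is hypothetical, evaluates this count and raises an error when the quotient is not an integer, which is a quick way to reject inadmissible parameter sets.

```python
from math import comb


def derived_lambda(t, v, k, lam, s):
    """Number of blocks of a t-(v, k, lam) design containing a fixed s-subset."""
    if not 0 <= s <= t:
        raise ValueError("s must satisfy 0 <= s <= t")
    numerator = lam * comb(v - s, t - s)
    denominator = comb(k - s, t - s)
    if numerator % denominator:
        raise ValueError("parameters are not admissible: count is not an integer")
    return numerator // denominator


# Block counts for s = 0, 1, ..., 4 in a 5-(24, 8, 1) design (s = 0 gives b).
witt_counts = [derived_lambda(5, 24, 8, 1, s) for s in range(5)]
```

For s = 0 the function returns the total number of blocks b, and for s = 1 it returns the replication number r.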
Examples:

1. The following is a 3-(8, 4, 1) design:

X = {∞, 0, 1, 2, 3, 4, 5, 6}

𝒜 = {{0,1,3,∞}, {1,2,4,∞}, {2,3,5,∞}, {3,4,6,∞}, {4,5,0,∞},
{5,6,1,∞}, {6,0,2,∞}, {2,4,5,6}, {3,5,6,0}, {4,6,0,1},
{5,0,1,2}, {6,1,2,3}, {0,2,3,4}, {1,3,4,5}}.

2. Simple t-designs (t = 4, 5): For t = 4 or 5 and v ≤ 30, the only t-(v, k, 1) designs
known to exist are those having the following parameters:

4-(11, 5, 1) 5-(12, 6, 1) 4-(23, 7, 1)

5-(24, 8, 1) 4-(27, 6, 1) 5-(28, 7, 1).

3. Simple t-designs (t = 6): For t = 6 and v ≤ 30, the only t-(v, k, λ) designs known
to exist are those having the following parameters:

6-(14, 7, 4) 6-(20, 9, 112) 6-(22, 7, 8) 6-(30, 7, 12).
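The 3-(8, 4, 1) design of Example 1 is small enough to verify by brute force. The sketch below counts, for every t-subset of points, the blocks that contain it. The helper name is_t_design and the use of the string "inf" for the point ∞ are choices made here for illustration.

```python
from itertools import combinations

INF = "inf"  # stands for the point written as infinity in Example 1
X = [INF, 0, 1, 2, 3, 4, 5, 6]
BLOCKS = [
    {0, 1, 3, INF}, {1, 2, 4, INF}, {2, 3, 5, INF}, {3, 4, 6, INF},
    {4, 5, 0, INF}, {5, 6, 1, INF}, {6, 0, 2, INF}, {2, 4, 5, 6},
    {3, 5, 6, 0}, {4, 6, 0, 1}, {5, 0, 1, 2}, {6, 1, 2, 3},
    {0, 2, 3, 4}, {1, 3, 4, 5},
]


def is_t_design(points, blocks, t, k, lam):
    """True if all blocks have size k and every t-subset lies in lam blocks."""
    if any(len(block) != k for block in blocks):
        return False
    return all(
        sum(1 for block in blocks if set(subset) <= block) == lam
        for subset in combinations(points, t)
    )
```

The same blocks also pass the check as a 2-(8, 4, 3) design, as expected for any 3-(8, 4, 1) design viewed as a 2-design.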


12.1.6 PAIRWISE BALANCED DESIGNS 
Definitions: 

Given a set K of positive integers and a positive integer λ, a pairwise balanced
design, written (v, K, λ)-PBD, is an ordered pair (X, 𝒜) where X is a set (of points)
of size v and 𝒜 is a collection of subsets (blocks) of X such that:

• every pair of elements of X occurs together in exactly λ blocks;

• for every block A ∈ 𝒜, |A| ∈ K.

When λ = 1, λ can be omitted from the notation and the design is called a (v, K)-PBD
or a finite linear space.

Given a set K of positive integers, let B(K) denote the set of positive integers v for
which there exists a (v, K)-PBD. The mapping K → B(K) is a closure operation on
the set of subsets of the positive integers, as it satisfies the properties:

• K ⊆ B(K);

• K1 ⊆ K2 ⇒ B(K1) ⊆ B(K2);

• B(B(K)) = B(K).

The set B(K) is the closure of the set K.

If K is any set of positive integers, then K is PBD-closed (or closed) if B(K) = K.

If K is a closed set, then there exists a finite subset J ⊆ K such that K = B(J). This
set J is a generating set for the PBD-closed set K.

If J is a generating set for K and if s ∈ J is such that J − {s} is also a generating set
for K, then s is inessential in K; otherwise s is essential.

A basis is a generating set consisting of essential elements.
Facts:

1. A (v, k, λ) design is a special case of a PBD in which the blocks are only permitted
to be of one size, k.

2. Necessary conditions for existence: The existence of a (v, K)-PBD (with v > 0)
implies:

• v ≡ 1 (mod α(K))

• v(v − 1) ≡ 0 (mod β(K))

where α(K) is the greatest common divisor of the integers { k − 1 | k ∈ K } and β(K)
is the greatest common divisor of the integers { k(k − 1) | k ∈ K }.

3. Asymptotic existence: Given K, there exists a constant C_K such that a (v, K)-PBD
exists for all v ≥ C_K that satisfy the necessary conditions of Fact 2. The constant C_K
is, in general, unspecified. In practice, considerable further work is usually required to
obtain a concrete upper bound on C_K.
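The necessary conditions of Fact 2 are easy to evaluate for a concrete set K. The sketch below computes the two greatest common divisors and lists the admissible residues of v modulo lcm(α(K), β(K)). The function names are illustrative, and the code assumes every element of K is at least 2.

```python
from functools import reduce
from math import gcd


def alpha(K):
    """gcd of k - 1 over all block sizes k in K (each k assumed >= 2)."""
    return reduce(gcd, (k - 1 for k in K))


def beta(K):
    """gcd of k(k - 1) over all block sizes k in K."""
    return reduce(gcd, (k * (k - 1) for k in K))


def satisfies_necessary_conditions(v, K):
    """Check v = 1 (mod alpha(K)) and v(v - 1) = 0 (mod beta(K))."""
    return (v - 1) % alpha(K) == 0 and (v * (v - 1)) % beta(K) == 0


def admissible_residues(K):
    """Return (m, residues): v is admissible iff v mod m is in residues."""
    a, b = alpha(K), beta(K)
    modulus = a * b // gcd(a, b)
    residues = [r for r in range(modulus)
                if satisfies_necessary_conditions(r, K)]
    return modulus, residues
```

For K = {3} this yields the residues 1 and 3 modulo 6, and for K = {4} the residues 1 and 4 modulo 12, matching the "necessary conditions" column of Table 2.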

Examples: 

1. The following is a (10, {3,4})-PBD: 

{1,2, 3, 4}, {1,5, 6, 7}, {1,8, 9, 10}, {2, 5, 8}, {2, 6, 9}, {2, 7, 10} 

{3, 5, 10}, {3, 6, 8}, {3, 7, 9}, {4, 5, 9}, {4, 6, 10}, {4, 7, 8} 

2. Table 2 lists closures of some subsets of {3, 4, ..., 8}. From Fact 3, for a given set K
there are only a finite number of values of v (satisfying the necessary conditions) for
which there does not exist a (v, K)-PBD. These exceptional cases are listed in this table
for some small sets K. Since 7 ∈ B({3}), it is not necessary to include 7 in the list of sets
whose closures are given, when 3 is present. Genuine exceptions (values of v satisfying
the necessary conditions for the existence of a (v, K)-PBD for which it has been proven
that no such design can exist) are shown in boldface, while possible exceptions (neither
existence nor nonexistence of a (v, K)-PBD is known) are shown in normal type.
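The (10, {3, 4})-PBD of Example 1 can be verified directly from the definition. In the sketch below, is_pbd is a hypothetical helper and the points are assumed to be 1, ..., v, as in the example.

```python
from itertools import combinations

BLOCKS = [
    {1, 2, 3, 4}, {1, 5, 6, 7}, {1, 8, 9, 10},
    {2, 5, 8}, {2, 6, 9}, {2, 7, 10},
    {3, 5, 10}, {3, 6, 8}, {3, 7, 9},
    {4, 5, 9}, {4, 6, 10}, {4, 7, 8},
]


def is_pbd(v, K, blocks, lam=1):
    """True if blocks form a (v, K, lam)-PBD on the points 1..v."""
    points = set(range(1, v + 1))
    if any(len(block) not in K or not block <= points for block in blocks):
        return False
    return all(
        sum(1 for block in blocks if x in block and y in block) == lam
        for x, y in combinations(sorted(points), 2)
    )
```

The three blocks of size 4 cover 18 pairs and the nine blocks of size 3 cover 27 pairs, which together account for all C(10, 2) = 45 pairs.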


12.1.7 GROUP DIVISIBLE DESIGNS AND TRANSVERSAL DESIGNS 
Definitions: 

A group divisible design (or (K, λ)-GDD) is a triple (X, 𝒢, 𝒜) where X is a set (of
points), 𝒢 is a partition of X into at least two subsets (called groups), and 𝒜 is a family of
subsets of X (called blocks) such that:

• if A ∈ 𝒜, then |A| ∈ K;

• a group and a block contain at most one common point;

• every pair of points from distinct groups occurs in exactly λ blocks.

If λ = 1, a (K, λ)-GDD is often denoted by K-GDD. If K = {k}, a K-GDD is written
k-GDD.

The group-type (or type) of a GDD is the multiset { |G| | G ∈ 𝒢 }. Usually an
“exponential notation” is used to describe the type of a GDD: a GDD of type
t1^u1 t2^u2 ... tk^uk is a GDD where there are u_i groups of size t_i for 1 ≤ i ≤ k.

A transversal design TD(k, n) is a k-GDD of type n^k (that is, one having k groups
of size n and uniform block size k).

Fact: 

1. The existence of a TD(k, n) is equivalent to the existence of k − 2 mutually orthogonal
latin squares of side n. (§12.3.2.)
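The correspondence between transversal designs and mutually orthogonal latin squares can be made concrete for prime side. The sketch below is an illustration under stated assumptions, not the Handbook's construction: it takes p to be prime (this is not checked), uses the squares L_m(i, j) = i + mj (mod p) for m = 1, ..., k − 2, and labels points as pairs (group, element). The function names are hypothetical.

```python
from itertools import combinations


def transversal_design(k, p):
    """Blocks of a TD(k, p); p is assumed prime and 2 <= k <= p + 1."""
    if not 2 <= k <= p + 1:
        raise ValueError("this sketch needs 2 <= k <= p + 1")
    blocks = []
    for i in range(p):
        for j in range(p):
            block = [(0, i), (1, j)]  # row index and column index
            # Group m + 1 records the entry of L_m(i, j) = i + m*j (mod p).
            block += [(m + 1, (i + m * j) % p) for m in range(1, k - 1)]
            blocks.append(frozenset(block))
    return blocks


def is_transversal_design(k, n, blocks):
    """Check the TD(k, n) axioms on points (group, element)."""
    points = [(g, x) for g in range(k) for x in range(n)]
    for a, b in combinations(points, 2):
        together = sum(1 for block in blocks if a in block and b in block)
        expected = 0 if a[0] == b[0] else 1
        if together != expected:
            return False
    return all(len(block) == k for block in blocks)
```

Each cell (i, j) of the superimposed squares contributes one block, so a TD(k, p) built this way has p² blocks.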
Table 2 Closures of some subsets of {3, 4, ..., 8}.

subset K         necessary conditions
3                1, 3 mod 6
3,4              0, 1 mod 3
3,5              1 mod 2
3,6              0, 1 mod 3
3,8              N (natural numbers)
3,4,5            N
3,4,6            0, 1 mod 3
3,4,8            N
3,5,6            N
3,5,8            N
3,6,8            N
3,4,5,6          N
3,4,5,8          N
3,4,6,8          N
3,5,6,8          N
3,4,5,6,8        N
4                1, 4 mod 12
4,5              0, 1 mod 4
4,6              0, 1 mod 3
4,7              1 mod 3
4,8              0, 1 mod 4
4,5,6            N
4,5,7            N
4,5,8            0, 1 mod 4
4,6,7            0, 1 mod 3
4,6,8            N

exceptions


4, 5, 6, 10, 11, 12, 14, 16, 17, 18, 20, 23, 26, 28, 29,
30, 34, 35, 36, 38


5, 6, 11, 14, 17 

4, 8, 10, 12, 14, 20, 22 

4, 6, 10, 12, 14, 16, 18, 20, 26, 28, 30, 34 

4, 5, 10, 11, 12, 14, 17, 20, 23 


5, 11, 14, 17 
4, 10, 14, 20 


8, 9, 12 

7, 9, 10, 12, 15, 18, 19, 22, 24, 27, 33, 34, 39, 45, 46, 
51, 55, 75, 87 

10, 19 

5, 9, 12, 17, 20, 21, 24, 33, 41, 44, 45, 48, 53, 60, 65, 
69, 77, 89, 101, 161, 164, 173 

7, 8, 9, 10, 11, 12, 14, 15, 18, 19, 23, 47 

6, 8, 9, 10, 11, 12, 14, 15, 18, 19, 23, 26, 27, 30, 39, 

42, 50, 51, 54, 62, 63, 66, 74, 78 

9, 12 

5, 9, 10, 12, 15, 19, 24, 27, 33, 45, 75, 87 

5, 7, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20, 22, 23, 24, 

26, 27, 33, 34, 35, 39, 41, 47, 50, 51, 53, 55, 59, 62, 
65, 71, 74, 75, 77, 87, 89, 95, 98, 101, 110, 122, 131, 
161, 170, 182, 194, 242, 266, 290 
5, 6, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20, 21, 23, 24, 
26, 27, 30, 33, 35, 38, 39, 41, 42, 44, 45, 47, 48, 51, 
54, 59, 62, 65, 66, 69, 74, 75, 77, 78, 83, 86, 87, 89, 

90, 93, 101, 102, 107, 110, 111, 114, 122, 123, 126, 

129, 131, 135, 138, 143, 146, 150, 158, 159, 161, 162, 
164, 165, 167, 170, 171, 173, 174, 186, 194, 195, 198 
subset K         necessary conditions    exceptions
4,5,6,7          N                       8, 9, 10, 11, 12, 14, 15, 18, 19, 23
4,5,6,8          N                       7, 9, 10, 11, 12, 14, 15, 18, 19, 23, 47
4,5,7,8          N                       6, 9, 10, 11, 12, 14, 15, 18, 19, 23, 26, 27, 30, 38, 42,
                                         51, 62, 66, 74, 78
4,6,7,8          N                       5, 9, 10, 11, 12, 14, 15, 17, 18, 19, 20, 23, 24, 26,
                                         27, 33, 35, 41, 65, 74, 75, 77, 123, 131, 143
4,5,6,7,8        N                       9, 10, 11, 12, 14, 15, 18, 19, 23
5,6              0, 1 mod 5              10, 11, 15, 16, 20, 35, 40, 50, 51, 80


12.2 SYMMETRIC DESIGNS AND FINITE GEOMETRIES 


12.2.1 FINITE GEOMETRIES 
Definitions: 

A finite incidence structure (V, B, I) consists of a finite set V of points, a finite set B
of lines, and an incidence relation I between them. (Equivalently, a finite incidence
structure is a pair (V, B′), where B′ = { {v | (v, b) ∈ I} | b ∈ B }. In this case, lines are
sets of points.)

The dual incidence structure is obtained by interchanging the roles of points and lines. 

Let F be a finite field, and let S be an (n+1)-dimensional vector space over F. The
set of all subspaces of S is the projective space of projective dimension n over F.
When F is the Galois field GF(q) (see §5.6.3), the projective space of projective
dimension n is denoted PG(n, q).

Subspaces of projective dimensions 0, 1, 2, and n − 1 are points, lines, planes, and
hyperplanes, respectively; in general, subspaces of projective dimension t are t-flats.
PG_t(n, q) denotes the incidence structure of points and t-flats in PG(n, q) (incidence
is just containment as subspaces). Often, PG_1(n, q) is denoted PG(n, q) (taking the
structure of points and lines as the natural geometry of the underlying space).

Let S be an n-dimensional vector space over a finite field F. The set of all cosets of
subspaces of S is the affine space of affine dimension n over F. When F is the
Galois field GF(q), the affine space of affine dimension n is denoted AG(n, q).

Cosets of subspaces of (affine) dimension 0, 1, 2, and n − 1 of an affine space of affine
dimension n are points, lines, planes, and hyperplanes, respectively. In general,
cosets of subspaces of affine dimension t are t-flats. AG_t(n, q) denotes the incidence
structure of points and t-flats in AG(n, q) (incidence is containment). Often, AG_1(n, q)
is denoted AG(n, q) (taking the structure of points and lines as the natural geometry of
the underlying space).

Note: The term finite geometry often just means finite incidence structure. However,
incidence structures are often too unstructured to be of much (geometric) interest.
Hence the term is sometimes reserved to cover only incidence structures satisfying
additional axioms such as those given in §12.2.3 Fact 3 for projective planes.
Facts:

1. Projective geometries: For q a prime power and 1 ≤ t < n, PG_t(n, q) is a

\left( \frac{q^{n+1}-1}{q-1},\ \frac{q^{t+1}-1}{q-1},\ \frac{(q^{n-1}-1)(q^{n-2}-1)\cdots(q^{n-t+1}-1)}{(q^{t-1}-1)(q^{t-2}-1)\cdots(q-1)} \right)

design. (When t = 1 both products are empty and the third parameter equals 1.)

2. Affine geometries: For q a prime power and 1 ≤ t < n, AG_t(n, q) is a

\left( q^{n},\ q^{t},\ \frac{(q^{n-1}-1)(q^{n-2}-1)\cdots(q^{n-t+1}-1)}{(q^{t-1}-1)(q^{t-2}-1)\cdots(q-1)} \right)

design.

12.2.2 SYMMETRIC DESIGNS 

Definitions: 

A (v, b, r, k, λ) block design is symmetric if the number of points equals the number
of blocks, that is, v = b.

A symmetric design with λ = 2 is a biplane. The parameters of a biplane are
v = C(k, 2) + 1 = k(k − 1)/2 + 1, k = k, λ = 2.

Facts: 

1. In a symmetric (v, k, λ) design, r = k.

2. If a symmetric (v, k, λ) design exists, then:

• if v is even, then k − λ is a perfect square;

• if v is odd, then the Diophantine equation x² = (k − λ)y² + (−1)^((v−1)/2) λz²
has a solution in integers, not all of which are zero. (Bruck-Ryser-Chowla.
The theorem is often referred to as BRC.)

3. For every positive integer k there is a symmetric (2^(k+2) − 1, 2^(k+1) − 1, 2^k − 1) block
design.

4. If p is prime and k is a positive integer, there is a symmetric (p^(2k) + p^k + 1, p^k + 1, 1)
block design.

5. In a symmetric design any two blocks intersect in exactly λ points.

6. The dual incidence structure obtained by interchanging the roles of points and blocks
is also a BIBD with the same parameters (hence the term symmetric).

7. The dual of a symmetric design need not be isomorphic to the original design.

8. Given a symmetric (v, k, λ) design, and a block A of this design, if the points not
in A are deleted from all blocks which intersect A, the design obtained is the derived
design. Its parameters (v, b, r, k, λ) are (k, v − 1, k − 1, λ, λ − 1).

9. Given a symmetric (v, k, λ) design, and given a block A of this design, delete the
block A, and delete all points in A from all other blocks. The resulting design is the
residual design, and has parameters (v − k, v − 1, k, k − λ, λ). (See Example 1.)

10. Any (v − k, v − 1, k, k − λ, λ) design is a quasi-residual design.

11. Any quasi-residual BIBD with λ = 1 or 2 is residual (Hall-Connor); but for λ = 3,
there are examples of quasi-residual BIBDs that are not residual.

12. If there is a symmetric (v, k, λ) design with n = k − λ, then 4n − 1 ≤ v ≤ n² + n + 1.
When v = 4n − 1 it is a Hadamard design; when v = n² + n + 1 it is a projective plane.

13. The only known biplanes have parameters (7,4,2), (11,5,2), (16,6,2), (37,9,2),
(56,11,2), and (79,13,2).
14. The only known symmetric designs with λ = 3 have parameters (11, 6, 3), (15, 7, 3),
(25, 9, 3), (31, 10, 3), (45, 12, 3), and (71, 15, 3).

15. Although infinitely many symmetric designs with λ = 1 are known, there is no
other value of λ for which this is known to be true.
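Two of the necessary conditions above are cheap to test for a proposed symmetric parameter set: the counting relation λ(v − 1) = k(k − 1) and, for even v, the perfect-square condition of Fact 2. The sketch below is deliberately partial and the function name is hypothetical. For odd v it does not attempt the Diophantine condition of Bruck-Ryser-Chowla and simply returns True once the counting relation holds.

```python
from math import isqrt


def passes_basic_tests(v, k, lam):
    """Counting relation plus the even-v case of Bruck-Ryser-Chowla.

    A True result does NOT imply that a design exists; for odd v the
    Diophantine condition is not examined in this sketch.
    """
    if lam * (v - 1) != k * (k - 1):
        return False
    if v % 2 == 0:
        order = k - lam
        return isqrt(order) ** 2 == order
    return True
```

For example, (22, 7, 2) satisfies the counting relation but is rejected because v is even and k − λ = 5 is not a perfect square.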


Example: 


1. In the symmetric (15,7,3) design in the following table, if the block b0 is removed
and if all the points in b0 are removed from the blocks b1, ..., b14, the resulting design
is the residual design. It has parameters (8,4,3) and its blocks are given on the right.

block   points in b0     remaining points
b0      0 1 2 3 4 5 6
b1      0 1 2            7 8 9 10
b2      0 1 2            11 12 13 14
b3      0 3 4            7 8 11 12
b4      0 3 4            9 10 13 14
b5      0 5 6            7 8 13 14
b6      0 5 6            9 10 11 12
b7      1 3 5            7 9 11 13
b8      1 3 6            7 10 12 14
b9      1 4 5            8 10 11 14
b10     1 4 6            8 9 12 13
b11     2 3 5            8 10 12 13
b12     2 3 6            8 9 11 14
b13     2 4 5            7 9 12 14
b14     2 4 6            7 10 11 13
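The residual and derived designs of this example can be computed and checked with a few set operations. In the sketch below the helper names are illustrative; design_parameters returns the tuple (v, b, r, k, λ) when the blocks form a BIBD and None otherwise.

```python
from itertools import combinations

BLOCKS = [
    {0, 1, 2, 3, 4, 5, 6},       # b0
    {0, 1, 2, 7, 8, 9, 10},      # b1
    {0, 1, 2, 11, 12, 13, 14},   # b2
    {0, 3, 4, 7, 8, 11, 12},     # b3
    {0, 3, 4, 9, 10, 13, 14},    # b4
    {0, 5, 6, 7, 8, 13, 14},     # b5
    {0, 5, 6, 9, 10, 11, 12},    # b6
    {1, 3, 5, 7, 9, 11, 13},     # b7
    {1, 3, 6, 7, 10, 12, 14},    # b8
    {1, 4, 5, 8, 10, 11, 14},    # b9
    {1, 4, 6, 8, 9, 12, 13},     # b10
    {2, 3, 5, 8, 10, 12, 13},    # b11
    {2, 3, 6, 8, 9, 11, 14},     # b12
    {2, 4, 5, 7, 9, 12, 14},     # b13
    {2, 4, 6, 7, 10, 11, 13},    # b14
]


def residual_design(blocks, index=0):
    """Delete one block and remove its points from all other blocks."""
    removed = blocks[index]
    return [b - removed for i, b in enumerate(blocks) if i != index]


def derived_design(blocks, index=0):
    """Intersect every other block with the chosen block."""
    kept = blocks[index]
    return [b & kept for i, b in enumerate(blocks) if i != index]


def design_parameters(blocks):
    """Return (v, b, r, k, lam) if the blocks form a BIBD, else None."""
    points = sorted(set().union(*blocks))
    sizes = {len(b) for b in blocks}
    reps = {sum(1 for b in blocks if p in b) for p in points}
    lams = {sum(1 for b in blocks if x in b and y in b)
            for x, y in combinations(points, 2)}
    if len(sizes) == len(reps) == len(lams) == 1:
        return (len(points), len(blocks), reps.pop(), sizes.pop(), lams.pop())
    return None
```

With v = 15, k = 7, λ = 3 the residual parameters (v − k, v − 1, k, k − λ, λ) are (8, 14, 7, 4, 3) and the derived parameters (k, v − 1, k − 1, λ, λ − 1) are (7, 14, 6, 3, 2).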


12.2.3 PROJECTIVE AND AFFINE PLANES 
Definitions: 

A projective plane is a finite set of points and a set of subsets of points (called lines ) 
such that: 

• every two points lie on exactly one line; 

• every two lines intersect in exactly one point; 

• there are four points with no three collinear. 

An affine plane is a set of points and a set of subsets of points (called lines) such that: 

• every two points lie on exactly one line; 

• if a point does not lie on a line L, there is exactly one line through the point that
does not intersect L;

• there are three points that are not collinear. 

Facts: 

1. A finite projective plane is a symmetric (n² + n + 1, n + 1, 1) design, for some positive
integer n, called the order of the projective plane. The projective plane has n² + n + 1
points and n² + n + 1 lines. Each point lies on n + 1 lines and every line contains
exactly n + 1 points.

2 . Principle of duality : Given any statement about finite projective planes that is a 
theorem, the dual statement (obtained by interchanging “point” and “line” and inter- 
changing “point lying on a line” with “line passing through a point”) is a theorem. 
3. Any symmetric design with λ = 1 is a projective plane.

4. The existence of a projective plane of order n is equivalent to the existence of a set
of n − 1 mutually orthogonal latin squares (MOLS) of side n.

5. Existence of projective planes: Very little is known about the existence of projective
planes:

• There exists a projective plane of order p^k whenever p is prime and k is a positive
integer. (See Fact 10.)

• There is no projective plane known for any order n that is not a power of a prime.
The smallest open order is 12.

• There is no projective plane of order 10 or of any order n ≡ 6 (mod 8).

• There are nondesarguesian planes (Fact 10) known for every order q² and q³ when q
is a prime power. (See Fact 11.)

• There are four nonisomorphic projective planes of order 9, three of which are
nondesarguesian.

• The following table summarizes the known facts about the existence and number
of projective planes of order n, for 2 ≤ n ≤ 12:

order                         2   3   4   5   6   7   8   9   10   11   12
number of projective planes   1   1   1   1   0   1   1   4   0    ≥1   ?


6. The proof by Lam, Thiel, and Swiercz in 1989 that there is no projective plane of 
order 10 involved great amounts of computer power and time. For details, see [CoDi96]. 

7 . The existence of a projective plane of order n is equivalent to the existence of an 
affine plane of order n. 

8. A finite affine plane is an (n², n, 1) design, for some positive integer n. The affine
plane has n² points and n² + n lines. Each point lies on n + 1 lines and every line
contains exactly n points. The integer n is the order of the affine plane.

9. Any affine plane of order n has the property that the lines can be partitioned into
n + 1 parallel classes each containing n lines and hence is a resolvable block design.

10. A direct construction of a projective plane of every order q = p^k, when p is prime
and k a positive integer: Consider the three-dimensional vector space GF(q)³ over GF(q).
This vector space contains (q³ − 1)/(q − 1) = q² + q + 1 1-dimensional subspaces (lines
through the origin (0,0,0)) and an equal number of 2-dimensional subspaces (planes
through the origin). Now construct an incidence structure where the points are the
1-dimensional subspaces, the lines are the 2-dimensional subspaces and a point is on a
line if the 1-dimensional subspace (associated with the point) is contained in the
2-dimensional subspace (associated with the line). This structure satisfies the axioms
and thus is a projective plane (of order q). Projective planes, such as this, coming from
finite fields via this construction are desarguesian planes.

11. A construction of a projective plane of order n from an affine plane of order n:
To construct the projective plane of order n from the affine plane of order n, use the
fact that the lines of the affine plane can be partitioned into n + 1 parallel classes each
containing n lines. To each line in the ith parallel class adjoin the new symbol ∞_i. Add
one new line, namely {∞_1, ∞_2, ..., ∞_(n+1)}. Now each line contains n + 1 points, there
are n² + n + 1 total points, and each pair of points is on a unique line; this is the
projective plane of order n.
12. A construction of an affine plane of order n from a projective plane of order n:
The affine plane of order n is the residual design of the projective plane of order n. See
§12.2.2 Fact 9 for the construction.
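The direct construction from a three-dimensional vector space can be carried out by machine when q is a prime p, since arithmetic in GF(p) is then ordinary arithmetic modulo p. The sketch below makes that simplifying assumption (prime powers p^k with k > 1 would need genuine finite-field arithmetic, and primality is not checked). Each 1-dimensional subspace is represented by its vector whose first nonzero coordinate is 1, and each 2-dimensional subspace by a normal vector. The function name projective_plane is illustrative.

```python
from itertools import product


def projective_plane(p):
    """Points and lines of the desarguesian plane of prime order p."""

    def normalize(vec):
        # Scale so that the first nonzero coordinate becomes 1.
        lead = next(x for x in vec if x)
        inverse = pow(lead, -1, p)  # modular inverse (Python 3.8+)
        return tuple((x * inverse) % p for x in vec)

    # One representative per 1-dimensional subspace.
    points = sorted({normalize(vec)
                     for vec in product(range(p), repeat=3) if any(vec)})
    # A 2-dimensional subspace is the set of vectors orthogonal to a normal;
    # the line collects the points (1-dimensional subspaces) it contains.
    lines = [
        frozenset(pt for pt in points
                  if sum(a * b for a, b in zip(normal, pt)) % p == 0)
        for normal in points
    ]
    return points, lines
```

For p = 2 this produces the seven points and seven lines of the Fano plane.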

Examples: 

1. The Fano plane (§12.1.1 Example 1) is the projective plane of order 2. 

2 . The affine plane of order 2 is given in part (a) of the following figure. The set of 
points is {1,2, 3, 4}. The six lines are: 

{1,2} {3,4} {1,3} {2,4} {1,4} {2,3}. 

The three parallel classes are: 

{{1,2}, {3, 4}} {{1,3}, {2, 4}} {{1,4}, {2, 3}}. 



[Figure: (a) the affine plane of order 2; (b) the affine plane of order 3.]


3. The affine plane of order 3 is given in part (b) of the figure of Example 2. The set
of points is {1, 2, 3, ..., 9}. The twelve lines (listed in order in four parallel classes of
three lines each) are:

{1,2,3}, {4,5,6}, {7,8,9} {1,4,7}, {2,5,8}, {3,6,9}

{1,5,9}, {6,2,7}, {4,8,3} {3,5,7}, {2,4,9}, {8,6,1}.

4. Using the construction in Fact 11 on the affine plane of order 3 (Example 3) yields the
projective plane of order 3 with thirteen points {1, 2, 3, 4, 5, 6, 7, 8, 9, ∞1, ∞2, ∞3, ∞4}
and thirteen lines:

{1,2,3,∞1} {4,5,6,∞1} {7,8,9,∞1}

{1,4,7,∞2} {2,5,8,∞2} {3,6,9,∞2}

{1,5,9,∞3} {6,2,7,∞3} {4,8,3,∞3}

{3,5,7,∞4} {2,4,9,∞4} {8,6,1,∞4}

{∞1, ∞2, ∞3, ∞4}.
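The passage from the affine plane of order 3 to the projective plane of order 3 can be reproduced in a few lines. The sketch below adjoins one new point per parallel class and then adds the line of new points. The name projective_completion and the labels "inf1", ..., "inf4" for the points ∞1, ..., ∞4 are choices made for this illustration.

```python
from itertools import combinations

# The four parallel classes of the affine plane of order 3 (Example 3).
PARALLEL_CLASSES = [
    [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}],
    [{1, 4, 7}, {2, 5, 8}, {3, 6, 9}],
    [{1, 5, 9}, {6, 2, 7}, {4, 8, 3}],
    [{3, 5, 7}, {2, 4, 9}, {8, 6, 1}],
]


def projective_completion(parallel_classes):
    """Adjoin a point at infinity to each class, then add the line at infinity."""
    lines = []
    infinite_points = []
    for i, parallel_class in enumerate(parallel_classes, start=1):
        inf = f"inf{i}"
        infinite_points.append(inf)
        lines.extend(frozenset(line | {inf}) for line in parallel_class)
    lines.append(frozenset(infinite_points))
    return lines


def is_linear_space(lines):
    """True if every pair of points lies on exactly one line."""
    points = sorted(set().union(*lines), key=str)
    return all(sum(1 for line in lines if a in line and b in line) == 1
               for a, b in combinations(points, 2))
```

The result has 13 lines of 4 points each on 13 points, in agreement with the list in Example 4.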
12.2.4 HADAMARD DESIGNS AND MATRICES


Definitions: 


A Hadamard matrix H of order n is a square n × n matrix all of whose entries are ±1
that satisfies the property that H^T H = nI, where I is the n × n identity matrix and H^T
is the transpose of H.

If M = (m_ij) is an m × p matrix and N = (n_ij) is an n × q matrix, the Kronecker
product is the mn × pq matrix M × N given by

M \times N = \begin{pmatrix} m_{11}N & m_{12}N & \cdots & m_{1p}N \\ m_{21}N & m_{22}N & \cdots & m_{2p}N \\ \vdots & \vdots & & \vdots \\ m_{m1}N & m_{m2}N & \cdots & m_{mp}N \end{pmatrix}

A Hadamard design of order 4n is a symmetric (4n − 1, 2n − 1, n − 1) design. The
dimension of the Hadamard design is n.


Facts: 

1. A necessary condition for the existence of a Hadamard matrix of order n is that
n = 1, n = 2, or n ≡ 0 (mod 4).

2. HH^T = nI = H^T H.

3. The rows [columns] of a Hadamard matrix are pairwise orthogonal [when considered
as vectors of length n].

4. If a row of a Hadamard matrix is multiplied by −1, the result is a Hadamard
matrix. Similarly, if a column of a Hadamard matrix is multiplied by −1, the result is
a Hadamard matrix.

5. By multiplying rows and columns of a Hadamard matrix by −1, a Hadamard matrix
can be obtained where the first row and column consist entirely of +1s. A Hadamard
matrix of this type is normalized.

6. In a normalized Hadamard matrix of order 4n, every row and column (except the
first) contains +1 and −1 exactly 2n times each.

7. Of all n × n matrices with entries from {−1, +1}, a Hadamard matrix has the
maximal determinant: |det H| = n^(n/2).

8. The Kronecker product of two Hadamard matrices is a Hadamard matrix. Thus, if
there are Hadamard matrices of order m and n, then there is a Hadamard matrix of
order mn.

9. If there exist Hadamard matrices of orders 4m, 4n, 4p, 4q, then there exists a
Hadamard matrix of order 16mnpq. (Craigen, Seberry, and Zhang)

10. Let q be a positive integer. Then there exists a Hadamard matrix of order 2^s q for
every s ≥ ⌊2 log_2(q − 3)⌋. (Seberry)
11. A Hadamard design of order 4n exists if and only if a Hadamard matrix of order 4n
exists. See the construction in Fact 15.

12. Hadamard conjecture: The fundamental question concerning Hadamard matrices
remains the existence question. The Hadamard conjecture is that there exist Hadamard
matrices of order 4n for all n ≥ 1. This remains unproved.

13. Currently the two smallest orders for which the existence of a Hadamard matrix
is open are 428 and 668. Because of Fact 8 and the existence of a Hadamard matrix of
order 2, if q is an odd number and there exists a Hadamard matrix of order 2^s q, then
there exists a Hadamard matrix of order 2^t q for all t ≥ s.

Thus, to tabulate known existence results, it is only necessary to give the odd
numbers q and the smallest power s of 2 such that a Hadamard matrix of order 2^s q
exists.

The following table is given in this manner. The number q is obtained by adding
the indices at the top, left, and bottom of the entry s.



000 

100 

200 

300 

400 

500 

600 

700 

800 

900 

00 

2 

222 

2 

2 

22 

3 

2 

2 

22 

22 

2 

222 

2 

222 

22 

2 

22 

2 

2 

222 

22 

2 

22 

2 

2 

2 

222 

2 

2 

223 

2 

10 

2 

222 

2 

2 

22 

2 

2 

2 

22 

22 

3 

222 

2 

222 

23 

2 

23 

2 

2 

222 

22 

2 

22 

2 

4 

2 

222 

2 

2 

223 

3 

20 

2 

222 

2 

2 

22 

2 

2 

2 

32 

22 

2 

222 

2 

222 

22 

2 

32 

2 

2 

222 

22 

3 

22 

2 

2 

2 

322 

2 

2 

222 

2 

30 

2 

222 

2 

2 

22 

2 

2 

2 

22 

24 

2 

222 

2 

222 

22 

2 

22 

3 

2 

322 

22 

2 

22 

2 

5 

2 

222 

4 

2 

422 

2 

40 

2 

222 

2 

2 

22 

2 

2 

2 

22 

22 

2 

223 

2 

232 

22 

2 

22 

2 

2 

232 

32 

2 

22 

2 

4 

2 

222 

2 

2 

223 

2 

50 

2 

222 

2 

2 

22 

2 

2 

3 

22 

22 

2 

222 

4 

222 

22 

2 

22 

2 

2 

232 

24 

3 

22 

2 

2 

2 

322 

3 

2 

232 

2 

60 

2 

222 

2 

2 

22 

3 

2 

2 

22 

22 

2 

222 

2 
22

2 

22 

2 

2 
23

2 

22 

2 

2 

2 
2

2 
2

70 

2 

222 

2 

2 

22 

2 

3 

2 

22 

22 

2 
2
24

3 

32 

2 

2 

222 

22 

2 

22 

2 

2 

2 

222 

2 

5 

222 

2 

80 

2 

222 

2 

2 

22 

2 

2 

2 

32 

22 

2 

222 

2 

222 

32 

2 

22 

2 

2 

222 

22 

2 

22 

3 

3 

2 

322 

2 

2 

222 

2 

90 

2 

222 

2 

3 

22 

2 

2 

2 

22 

22 

2 

222 

2 

522 

22 

2 

22 

2 

5 

222 

22 

2 

22 

2 

2 

2 

222 

2 

3 

222 

2 


1 

357 

9 

1 

35 

7 

9 

1 

35 

79 

1 

357 

9 

135 

79 

1 

35 

7 

9 

135 

79 

1 

35 

7 

9 

1 

357 

9 

1 

357 

9 


14. A general construction for Hadamard matrices of order q + 1 when q is an odd
prime power and q ≡ 3 (mod 4):

• construct a q × q matrix C = (c_ij) indexed by the elements of the field GF(q) by
letting

c_{ij} = \begin{cases} 1, & \text{if } i - j \text{ is a square in } GF(q) \\ -1, & \text{if } i - j \text{ is not a square in } GF(q); \end{cases}

• construct a Hadamard matrix H of order q + 1 from C by adding a first column
of all −1s and then a top row of all 1s.

This method is used in Example 1 to construct the Hadamard matrix of order 12.

15. Constructing Hadamard designs from Hadamard matrices, and vice versa: Assume
that there exists a Hadamard matrix of order 4n. Let H be a normalized Hadamard
matrix of this order. Remove the first row and column of H and replace every −1 in
the resulting matrix by a 0. The final (4n − 1) × (4n − 1) matrix can be shown to be
the incidence matrix of a (4n − 1, 2n − 1, n − 1) Hadamard design.

This process can be reversed to construct a Hadamard matrix from a Hadamard
design.
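Facts 14 and 15 can be combined into a small computational experiment. The sketch below assumes q is a prime with q ≡ 3 (mod 4), so that GF(q) is simply the integers modulo q (prime powers would need true field arithmetic, and neither condition is checked). It builds the matrix of Fact 14, normalizes it, and extracts the blocks of the associated Hadamard design. All function names are illustrative.

```python
def paley_hadamard(q):
    """Matrix of order q + 1 from Fact 14; q assumed prime, q % 4 == 3."""
    squares = {(x * x) % q for x in range(q)}  # 0 counts as a square here
    core = [[1 if (i - j) % q in squares else -1 for j in range(q)]
            for i in range(q)]
    # Prepend a first column of -1s, then a top row of +1s.
    return [[1] * (q + 1)] + [[-1] + row for row in core]


def is_hadamard(h):
    """Check that distinct rows are orthogonal and each row has norm n."""
    n = len(h)
    return all(
        sum(a * b for a, b in zip(h[i], h[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )


def normalize(h):
    """Negate columns, then rows, so the first row and column are all +1."""
    h = [row[:] for row in h]
    for j in range(len(h)):
        if h[0][j] == -1:
            for row in h:
                row[j] = -row[j]
    return [row if row[0] == 1 else [-x for x in row] for row in h]


def hadamard_design(h):
    """Blocks of the design: drop first row and column, keep the +1 positions."""
    rows = normalize(h)[1:]
    return [frozenset(j for j, x in enumerate(row[1:]) if x == 1)
            for row in rows]
```

With q = 11 the matrix has order 12 = 4n with n = 3, and the extracted blocks form a symmetric (11, 5, 2) design.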
Example:

1. The smallest examples of Hadamard matrices are the following: 
12.2.5 DIFFERENCE SETS

Note : In this section only difference sets in abelian groups are considered. 

Definitions: 

Let G be an additively written group of order v. A k-subset D of G is a (v, k, λ; n)
difference set of order n = k − λ if every nonzero element of G has exactly λ
representations as a difference d − d′ (d, d′ ∈ D). The difference set is abelian, cyclic,
etc., if the group G has the respective property.

The development of a difference set D is the incidence structure dev(D) whose points
are the elements of the group G and whose blocks are the translates D + g = {d + g |
d ∈ D}, g ∈ G.

A multiplier of a difference set D in a group G is an automorphism φ of G such that
φ(D) = D + g for some g ∈ G. If φ is a multiplier and φ(h) = th for all h ∈ G, then t
is a numerical multiplier.

Facts: 

1. Both the group G itself and G − {g} (for an arbitrary g ∈ G) are (v, v, v; 0) and
(v, v − 1, v − 2; 1) difference sets. In Table 1 these trivial difference sets are excluded.

2. The complement of a (v, k, λ; n) difference set is again a difference set with
parameters (v, v − k, v − 2k + λ; n). Therefore only k ≤ v/2 (the case k = v/2 is actually
impossible) is considered.
3. The existence of a (v, k, λ; n) difference set is equivalent to the existence of a
symmetric (v, k, λ) design 𝒟 admitting G as a point regular automorphism group; that is,
for any two points p and q, there is a unique group element g which maps p to q. The
design 𝒟 is isomorphic with dev(D).

4. There are many symmetric designs which do not have difference set representations.

5. Since difference sets can yield symmetric designs, the parameters v, k, and λ must
satisfy the trivial necessary conditions for the existence of a symmetric design
(λ(v − 1) = k(k − 1)) and must also satisfy the Bruck-Ryser-Chowla condition
(§12.2.2, Fact 2).

6. If φ is a multiplier of the difference set D, then there is at least one translate D + g
of D which is fixed by φ.

7. The multiplier theorem: Let D be an abelian (v, k, λ; n) difference set. If p is a
prime that satisfies (p, v) = 1, p | n, and p > λ, then p is a numerical multiplier.

8. The multiplier conjecture: Every prime divisor p of n that is relatively prime to v
is a multiplier of a (v, k, λ; n) difference set; that is, the condition p > λ in Fact 7 is
unnecessary.

Examples: 

1. An (11, 5, 2; 3) difference set in the group Z_11 is {1, 3, 4, 5, 9}.

2. Table 1 lists abelian difference sets of order n ≤ 15. (See Fact 1.) One difference set
for each abelian group is listed. In general, there will be many more examples. There
are no other groups or parameters with n ≤ 15 for which the existence of a difference
set is undecided.

In the column “group” the decomposition of the group as a product of cyclic sub- 
groups is given. If the group is cyclic, the integers modulo the group order are used to 
describe the difference set. 
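For cyclic groups the definitions of this section translate directly into modular arithmetic. The sketch below, restricted to the group Z_v and using hypothetical function names, checks the difference-set property, builds the development, and tests whether multiplication by t maps D to one of its translates. The caller is assumed to supply t relatively prime to v, so that multiplication by t is an automorphism of Z_v.

```python
from collections import Counter


def is_difference_set(D, v, lam):
    """True if every nonzero g in Z_v has exactly lam representations d - d'."""
    counts = Counter((a - b) % v for a in D for b in D if a != b)
    return all(counts[g] == lam for g in range(1, v))


def development(D, v):
    """All translates D + g, g in Z_v (the blocks of dev(D))."""
    return [frozenset((d + g) % v for d in D) for g in range(v)]


def is_numerical_multiplier(t, D, v):
    """True if t*D is a translate of D; gcd(t, v) = 1 is assumed."""
    image = frozenset((t * d) % v for d in D)
    return image in development(D, v)
```

For D = {1, 3, 4, 5, 9} in Z_11, the prime p = 3 divides n = 3, is relatively prime to v = 11, and exceeds λ = 2, so Fact 7 predicts that 3 is a numerical multiplier; indeed 3D = D.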


12.3 LATIN SQUARES AND ORTHOGONAL ARRAYS


12.3.1 LATIN SQUARES 
Definitions: 

A latin square of side n is an n × n array in which each entry contains a single element
from an n-set S, such that each element of S occurs exactly once in each row and exactly
once in each column.

A latin square of side n (on the set {1, 2, ..., n} or on the set {0, 1, ..., n − 1}) is reduced
or in standard form if in the first row and column the elements occur in increasing
order.

Let L be an n × n latin square on symbol set E_3, with rows indexed by the elements of the
n-set E_1 and columns indexed by the elements of the n-set E_2. Let T = { (x_1, x_2, x_3) |
L(x_1, x_2) = x_3 } and {a, b, c} = {1, 2, 3}. The (a, b, c)-conjugate of L, L_(a,b,c), has rows
indexed by E_a, columns by E_b, and symbols by E_c, and is defined by L_(a,b,c)(x_a, x_b) = x_c
for each (x_1, x_2, x_3) ∈ T.
Table 1 Abelian difference sets of order n ≤ 15.

n    v    k    λ    group        difference set
2    7    3    1    (7)          1 2 4
3    13   4    1    (13)         0 1 3 9
3    11   5    2    (11)         1 3 4 5 9
4    21   5    1    (21)         3 6 7 12 14
4    16   6    2    (8)(2)       (00) (10) (11) (20) (40) (61)
                    (4)^2        (00) (01) (10) (12) (20) (23)
                    (4)(2)^2     (000) (010) (100) (101) (200) (211)
                    (2)^4        (0000) (0010) (1000) (1001) (1100) (1111)
4    15   7    3    (3)(5)       0 1 2 4 5 8 10
5    31   6    1    (31)         1 5 11 24 25 27
5    19   9    4    (19)         1 4 5 6 7 9 11 16 17
6    23   11   5    (23)         1 2 3 4 6 8 9 12 13 16 18
7    57   8    1    (57)         1 6 7 9 19 38 42 49
7    27   13   6    (3)^3        (001) (011) (021) (111) (020) (100) (112) (120) (121)
                                 (122) (201) (202) (220)
8    73   9    1    (73)         1 2 4 8 16 32 37 55 64
8    31   15   7    (31)         1 2 3 4 6 8 12 15 16 17 23 24 27 29 30
9    91   10   1    (91)         0 1 3 9 27 49 56 61 77 81
9    45   12   3    (3)^2(5)     (000) (001) (002) (003) (010) (020) (101) (112) (123)
                                 (201) (213) (222)
9    40   13   4    (40)         1 2 3 5 6 9 14 15 18 20 25 27 35
9    36   15   6    (4)(3)^2     (010) (011) (012) (020) (021) (022) (100) (110) (120)
                                 (200) (211) (222) (300) (312) (321)
                    (2)^2(3)^2   (0010) (0011) (0012) (0020) (0021) (0022) (0100) (0110)
                                 (0120) (1000) (1011) (1022) (1100) (1112) (1121)
9    35   17   8    (35)         0 1 3 4 7 9 11 12 13 14 16 17 21 27 28 29 33
11   133  12   1    (133)        1 11 16 40 41 43 52 60 74 78 121 128
11   43   21   10   (43)         1 4 6 9 10 11 13 14 15 16 17 21 23 24 25 31 35 36 38 40 41
12   47   23   11   (47)         1 2 3 4 6 7 8 9 12 14 16 17 18 21 24 25 27 28 32 34 36 37 42
13   183  14   1    (183)        0 2 3 10 26 39 43 61 109 121 130 136 141 155
15   59   29   14   (59)         1 3 4 5 7 9 12 15 16 17 19 20 21 22 25 26 27 28 29 35 36 41
                                 45 46 48 49 51 53 57


The transpose of a latin square L, denoted L^T, is the latin square which results from L
when the roles of rows and columns are exchanged; that is, L^T(i, j) = L(j, i).

A latin square L of side n is symmetric if L(i, j) = L(j, i) for all 1 ≤ i ≤ n, 1 ≤ j ≤ n.
A latin square L of side n is idempotent if L(i, i) = i for all 1 ≤ i ≤ n.

A transversal in a latin square of side n is a set of n cells, one from each row and 
column, containing each of the n symbols exactly once. 

A partial transversal of length k in a latin square of side n is a set of k cells, each 
from a different row and each from a different column, such that no two contain the 
same symbol. 
Two latin squares L and L' of side n are equivalent (or isotopic) if there are three
bijections, from the rows, columns, and symbols of L to the rows, columns, and symbols,
respectively, of L', that map L to L'.

Two latin squares L and L' of side n are main class isotopic if L is isotopic to some
conjugate of L'.

Let k < n. If in a latin square L of side n the k^2 cells defined by k rows and k columns
form a latin square of side k, then the cells are a latin subsquare of L.

An n by n array L with cells that are either empty or contain exactly one symbol is a
partial latin square if no symbol occurs more than once in any row or column.

A partial latin square is symmetric (or commutative) if whenever cell (i, j) is occupied
by x, cell (j, i) is also occupied by x, for every 1 ≤ i ≤ n, 1 ≤ j ≤ n.

A partial latin square is idempotent if cell (i, i) is occupied by i, for all i.

A latin rectangle is a k × n (k < n) array in which each cell contains a single element
from an n-set such that each element occurs exactly once in each row and at most once
in each column.

An n × n partial latin square P is said to be imbedded in a latin square L if the upper
left n × n corner of L agrees with P.
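
The definitions above can be checked mechanically on small squares. The following is a minimal Python sketch, not part of the handbook: the function names is_latin, conjugate, and is_transversal are illustrative assumptions, squares are lists of rows, and conjugate assumes the symbols are 1, ..., n so that symbols can serve as row or column indices.

```python
def is_latin(square):
    """True if every row and every column contains each symbol of row 0 exactly once."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def conjugate(square, a, b, c):
    """(a, b, c)-conjugate of a latin square on symbols 1..n (assumed labeling)."""
    n = len(square)
    result = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            triple = (i + 1, j + 1, square[i][j])      # (x_1, x_2, x_3) in T
            xa, xb, xc = triple[a - 1], triple[b - 1], triple[c - 1]
            result[xa - 1][xb - 1] = xc                # L_(a,b,c)(x_a, x_b) = x_c
    return result


def is_transversal(square, cells):
    """cells is a list of 0-based (row, column) pairs."""
    n = len(square)
    rows = {r for r, _ in cells}
    cols = {c for _, c in cells}
    syms = {square[r][c] for r, c in cells}
    return len(cells) == n and len(rows) == len(cols) == len(syms) == n


L = [[1, 4, 2, 3],
     [2, 3, 1, 4],
     [4, 1, 3, 2],
     [3, 2, 4, 1]]
Q = [[1, 3, 2],
     [3, 2, 1],
     [2, 1, 3]]          # idempotent, so the main diagonal is a transversal
```

For instance, conjugate(L, 2, 1, 3) is the transpose of L, and conjugate(L, 1, 3, 2) exchanges the roles of columns and symbols.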


Facts: 

1. The multiplication table of any (multiplicative) group is a latin square. 


2. For each positive integer k a reduced latin square can be constructed using the
following format:

     1     2     3    ...   k−1    k
     2     3     4    ...    k     1
     3     4     5    ...    1     2
     .     .     .           .     .
    k−1    k     1    ...   k−3   k−2
     k     1     2    ...   k−2   k−1


3. Every latin square has 1, 2, 3, or 6 distinct conjugates.

4. A symmetric latin square of even side can never be idempotent.

5. A symmetric idempotent latin square of side n is equivalent to a 1-factorization of
the complete graph on n + 1 points, K_{n+1}. A latin square of side n is equivalent to a
1-factorization of the complete bipartite graph K_{n,n}.

6. Every idempotent latin square has a transversal (the main diagonal).

7. Some latin squares have no transversals. One such class of latin squares is composed
of the addition tables of Z_{2n} for every n ≥ 1, or in general the addition table of any
group that has a unique element of order 2.

8. Every latin square of side n has a partial transversal of length k where

k ≥ max{n − √n, n − 15(log n)^2}. (P. W. Shor)

9. A latin square of side n with a proper subsquare of side k exists if and only if

k ≤ ⌊n/2⌋.

10. There exists a latin square of side n with no proper subsquares if n ≠ 2^a 3^b or if
n = 3, 9, 12, 16, 18, 27, 81, 243.

11. A partial latin square of side n with at most n − 1 filled cells can always be
completed to a latin square of side n.

12. A k × n (k < n) latin rectangle can always be completed to a latin square of side n.

13. Let L be a partial latin square of order n in which cell (i, j) is filled if and only if
i ≤ r and j ≤ s. Then L can be completed to a latin square of order n if and only if
N(i) ≥ r + s − n for i = 1, 2, ..., n, where N(i) denotes the number of elements in L
that are equal to i. (H. Ryser)

14. A partial n × n latin square can be imbedded in a t × t latin square for every t ≥ 2n.

15. An n × n partial symmetric latin square can be imbedded in a t × t symmetric latin
square for every even t ≥ 2n.

16. The number of distinct latin squares, the number of main classes, and the number
of equivalence classes of latin squares of side n go to infinity as n → ∞. The number
of main classes and equivalence classes of latin squares of side n ≤ 8 is given in the
following table:

n                     1   2   3   4   5    6     7           8
main classes          1   1   1   2   2   12   147     283,657
equivalence classes   1   1   1   2   2   22   563   1,676,257
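
Facts 2 and 7 lend themselves to a quick computational check. The sketch below is illustrative only: the function names are assumptions, and has_transversal is a brute-force search over all n! cell selections, so it is practical only for very small sides.

```python
from itertools import permutations


def cyclic_latin_square(k):
    """Reduced latin square of side k on {1, ..., k} in the cyclic format of Fact 2."""
    return [[(i + j) % k + 1 for j in range(k)] for i in range(k)]


def addition_table(m):
    """Addition table of Z_m, a latin square on {0, ..., m - 1}."""
    return [[(i + j) % m for j in range(m)] for i in range(m)]


def has_transversal(square):
    """Try every way of choosing one cell per row and per column."""
    n = len(square)
    return any(len({square[r][p[r]] for r in range(n)}) == n
               for p in permutations(range(n)))
```

Under these assumptions, has_transversal(addition_table(m)) is false for m = 2, 4, 6, in line with Fact 7, and true for the odd values m = 3 and m = 5.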


Examples: 

1. A 4 x 4 latin square on {1, 2, 3, 4}, where 1, 2, 3, 4 represent four brands of tires, gives 
a way to test each brand of tire on each of the four wheel positions on each of four cars: 
the i, j-entry of the latin square is the brand of tire to be tested on wheel position i of 
car j. 

2. A latin square of side 8 on the symbols 0, 1, . . . , 7: 

01234567 

10345672 

23506741 

34071256 

45617023 

56720314 

67452130 

72163405 


3. A latin square of side 4 and its six conjugates:

1 4 2 3
2 3 1 4
4 1 3 2
3 2 4 1
(1, 2, 3)-conjugate

1 2 4 3
4 3 1 2
2 1 3 4
3 4 2 1
(2, 1, 3)-conjugate

1 2 4 3
3 4 2 1
2 1 3 4
4 3 1 2
(2, 3, 1)-conjugate

1 3 4 2
3 1 2 4
2 4 3 1
4 2 1 3
(1, 3, 2)-conjugate

1 3 2 4
2 4 1 3
4 2 3 1
3 1 4 2
(3, 2, 1)-conjugate

1 3 2 4
3 1 4 2
4 2 3 1
2 4 1 3
(3, 1, 2)-conjugate

4. The following gives the main classes of latin squares of sides 4, 5, and 6. No two
latin squares listed are main class isotopic.

n = 4:

0 1 2 3     0 1 2 3
1 0 3 2     1 0 3 2
2 3 0 1     2 3 1 0
3 2 1 0     3 2 0 1

n = 5:

0 1 2 3 4     0 1 2 3 4
1 2 3 4 0     1 0 3 4 2
2 3 4 0 1     2 3 4 0 1
3 4 0 1 2     3 4 1 2 0
4 0 1 2 3     4 2 0 1 3

n = 6:

012345 012345 012345
103254 103254 103452
234501 234501 230514
325410 325410 345021
450123 450132 451203
541032 541023 524130

012345 012345 012345
103452 103452 103452
230514 231504 234501
345021 345120 345210
451230 450231 450123
524103 524013 521034

012345 012345 012345
103254 103254 103254
240513 240513 240513
351402 351402 351420
425031 425130 435102
534120 534021 524031

012345 012345 012345
103254 103452 120453
240531 231504 201534
354012 354120 354102
425103 425013 435210
531420 540231 543021

5. The following are a latin square of side 7 with a subsquare of side 3 (3 × 3 square in
upper left corner) and a latin square of order 12 with no proper subsquares.

1 2 3 4 5 6 7
2 3 1 6 4 7 5
3 1 2 7 6 5 4
4 7 5 1 3 2 6
7 5 6 3 2 4 1
6 4 7 5 1 3 2
5 6 4 2 7 1 3

1 2 3 4 5 6 7 8 9 a b c
2 3 4 5 6 1 8 9 a b c 7
3 1 5 2 7 8 4 a 6 c 9 b
4 5 6 7 1 9 b c 8 3 2 a
5 6 2 8 a 7 9 b c 4 1 3
6 c 8 1 3 a 2 7 b 9 4 5
7 8 1 a c b 5 4 2 6 3 9
8 9 b 3 4 c a 6 5 1 7 2
9 b 7 c 2 5 1 3 4 8 a 6
a 7 c b 9 4 6 1 3 2 5 8
b 4 a 9 8 3 c 2 7 5 6 1
c a 9 6 b 2 3 5 1 7 8 4
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1 

6. The partial latin square • 

4 


3 

1 


is imbedded in the latin square

3 2 4 1 


12.3.2 MUTUALLY ORTHOGONAL LATIN SQUARES 
Definitions: 

Two latin squares A = (a_ij) and B = (b_ij) of order n are orthogonal if the n^2 ordered
pairs (a_ij, b_ij) (1 ≤ i, j ≤ n) are distinct. (The relation of orthogonality is symmetric.)

A set of latin squares {A_1, ..., A_k} is a set of mutually orthogonal latin squares
(MOLS) if A_i and A_j are orthogonal for all i, j ∈ {1, ..., k} (i ≠ j). The maximum
number of MOLS of order n is written N(n). It is customary to define N(0) = N(1) = ∞.

A set of n − 1 MOLS of side n is a complete set of MOLS.
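
Orthogonality as defined above amounts to counting distinct ordered pairs. A minimal Python sketch follows; the names are illustrative assumptions, and the functions assume their inputs are already latin squares of the same side.

```python
from itertools import combinations


def are_orthogonal(a, b):
    """True if the n^2 ordered pairs (a_ij, b_ij) are all distinct."""
    n = len(a)
    pairs = {(a[i][j], b[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n


def is_mols(squares):
    """True if every two distinct squares in the list are orthogonal."""
    return all(are_orthogonal(a, b) for a, b in combinations(squares, 2))


A = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
B = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
```

Here A and B form a pair of MOLS of side 3, whereas no latin square of side n ≥ 2 is orthogonal to itself.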

Facts: 

1. If n ≥ 2, then N(n) ≤ n − 1.

2. If n is a prime power, then N(n) = n − 1.

3. N(n) ≥ 2 for all n ≥ 3, except n = 6. (Bose-Parker-Shrikhande)

4. N(n) ≥ 3 for all n ≥ 4 except for n = 6 and possibly n = 10. The following table
gives the best known lower bounds for N(n) for 0 ≤ n ≤ 499. Add the row and column
indices to obtain the order.



       0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19
  0    ∞   ∞   1   2   3   4   1   6   7   8   2  10   5  12   3   4  15  16   3  18
 20    4   5   3  22   5  24   4  26   5  28   4  30  31   5   4   5   5  36   4   4
 40    7  40   5  42   5   6   4  46   6  48   6   5   5  52   5   5   7   7   5  58
 60    4  60   4   6  63   7   5  66   5   6   6  70   7  72   5   5   6   6   6  78
 80    9  80   8  82   6   6   6   6   7  88   6   7   6   6   6   6   7  96   6   8
100    8 100   6 102   7   7   6 106   6 108   6   6  13 112   6   7   6   8   6   6
120    7 120   6   6   6 124   6 126 127   7   6 130   6   7   6   7   7 136   6 138
140    6   7   6  10  10   7   6   7   6 148   6 150   7   8   8   7   6 156   7   6
160    9   7   6 162   6   7   6 166   7 168   6   8   6 172   6   6  14   9   6 178
180    6 180   6   6   7   8   6  10   6   8   6 190   7 192   6   7   6 196   6 198
200    7   8   6   7   6   8   6   8  14  11  10 210   6   7   6   7   7   8   6  10
220    6  12   6 222  13   8   6 226   6 228   6   7   7 232   6   7   6   7   6 238
240    7 240   6 242   6   7   6  12   7   7   6 250   6  12   9   7 255 256   6  12
260    6   8   8 262   7   8   6  10   6 268   6 270  15  16   6  13  10 276   6   9
280    7 280   6 282   6  12   6   7  15 288   6   6   6 292   6   6   7  10  10  12
300    6   7   6   6  15  15   6 306   6   7   6 310   7 312   6  10   7 316   6  10
320   15  15   6  16   6  12   6   7   7   9   6 330   6   8   6   6   8 336   6   7
340    6  10  10 342   7   7   6 346   6 348   8  12  18 352   6   9   6   9   6 358
360    8 360   6   7   6  10   6 366  15  15   6  15   6 372   6  15   7  13   6 378
380    6  12   6 382  15  15   6  15   6 388   6  16   7   8   6   7   6 396   6   7
400   15 400   7  15  11   7   6  15   8 408   6  13   8  12  10   9  18  15   6 418
420    6 420   6  15   7  16   6   7   6  10   6 430  15 432   6  15   6  18   6 438
440    7  15   6 442   6  13   6  11  15 448   6  15   6   7   6  15   7 456   6  16
460    6 460   6 462  15  15   6 466   6   7   6  15   7  15  10  18   6  15   6 478
480   15  15   6  15   6   7   6 486   7  15   6 490   6  16   6   7  15  15   6 498

5. N(n) → ∞ as n → ∞. (Chowla, Erdős, Straus)

6. N(n × m) ≥ min{N(n), N(m)}. (MacNeish)

7. The existence of n − 1 MOLS of order n is equivalent to the existence of a projective
plane of order n (an (n^2 + n + 1, n + 1, 1) design) and an affine plane of order n (an
(n^2, n, 1) design). (§12.2.3)

8. The existence of a set of k − 2 mutually orthogonal latin squares of order n is
equivalent to the existence of a transversal design TD(k, n). (§12.1.7)

9. A set of k − 2 MOLS of order n is equivalent to an OA(k, n) (§12.3.3).

10. Constructing a complete set of MOLS of order q for q a prime power: A complete
set of MOLS of order q for q a prime power can be constructed as follows:

• for each a ∈ GF(q) − {0}, define the latin square L_a(i, j) = i + aj, where
i, j ∈ GF(q) and the algebra is performed in GF(q).

The set of latin squares { L_a | a ∈ GF(q) − {0} } is a set of q − 1 MOLS of side q.

11. Let n_k be the largest order for which the existence of k MOLS is unknown. So if
n > n_k, then there exist at least k MOLS of order n. See the following table:

 k   n_k      k   n_k      k     n_k      k     n_k      k     n_k
 2     6      5    62      8   2,774     11   7,222     14   7,874
 3    10      6    75      9   3,678     12   7,286     15   8,360
 4    42      7   780     10   5,804     13   7,288




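12. Constructing a set of r MOLS of size mn × mn from a set of r MOLS of size
m × m and a set of r MOLS of size n × n: Let A_1, ..., A_r and B_1, ..., B_r be two sets
of MOLS, where each A_k = (a^{(k)}_{ij}) is of size m × m and each B_k = (b^{(k)}_{xy}) is of size n × n.
Construct a set C_1, ..., C_r of mn × mn MOLS as follows: for each k = 1, ..., r, let

C_k = \begin{pmatrix}
D^{(k)}_{11} & D^{(k)}_{12} & \cdots & D^{(k)}_{1m} \\
D^{(k)}_{21} & D^{(k)}_{22} & \cdots & D^{(k)}_{2m} \\
\vdots & \vdots & & \vdots \\
D^{(k)}_{m1} & D^{(k)}_{m2} & \cdots & D^{(k)}_{mm}
\end{pmatrix}

where

D^{(k)}_{ij} = \begin{pmatrix}
(a^{(k)}_{ij}, b^{(k)}_{11}) & \cdots & (a^{(k)}_{ij}, b^{(k)}_{1n}) \\
\vdots & & \vdots \\
(a^{(k)}_{ij}, b^{(k)}_{n1}) & \cdots & (a^{(k)}_{ij}, b^{(k)}_{nn})
\end{pmatrix}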
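
The constructions in Facts 10 and 12 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the handbook's own procedure: mols_prime handles only prime orders p, where GF(p) is arithmetic modulo p (a general prime power would need finite field arithmetic), and all function names are hypothetical.

```python
def mols_prime(p):
    """Complete set of p - 1 MOLS of prime order p: L_a(i, j) = i + a*j (mod p)."""
    return [[[(i + a * j) % p for j in range(p)] for i in range(p)]
            for a in range(1, p)]


def product_square(a, b):
    """mn x mn square whose (i, j) block holds the pairs (a[i][j], b[x][y])."""
    m, n = len(a), len(b)
    return [[(a[i][j], b[x][y]) for j in range(m) for y in range(n)]
            for i in range(m) for x in range(n)]


def is_latin(square):
    n = len(square)
    symbols = set(square[0])
    return (len(symbols) == n
            and all(set(row) == symbols for row in square)
            and all({square[i][j] for i in range(n)} == symbols for j in range(n)))


def are_orthogonal(a, b):
    n = len(a)
    return len({(a[i][j], b[i][j]) for i in range(n) for j in range(n)}) == n * n


A = mols_prime(3)          # 2 MOLS of order 3
B = mols_prime(5)[:2]      # 2 of the 4 MOLS of order 5
C = [product_square(a, b) for a, b in zip(A, B)]   # 2 MOLS of order 15
```

The pair C illustrates MacNeish's bound (Fact 6) for n = 3 and m = 5: N(15) ≥ min{N(3), N(5)} = 2.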


Note: 

1. In 1782 Leonhard Euler considered the following problem: 

A very curious question, which has exercised for some time the ingenuity of 
many people, has involved me in the following studies, which seem to open a 
new field of analysis, in particular in the study of combinations. The question 
revolves around arranging 36 officers to be drawn from 6 different ranks and 
at the same time from 6 different regiments so that they are also arranged in 
a square so that in each line (both horizontal and vertical) there are 6 officers 
of different ranks and different regiments. 






A solution to Euler’s problem would be equivalent to a pair of orthogonal latin squares 
of order 6, the symbol set of the first consisting of the 6 ranks and the symbol set of 
the second consisting of the 6 regiments. Euler convinced himself that his problem was 
incapable of solution and goes even further: 

I have examined a very great number of tables . . . and I do not hesitate to 
conclude that one cannot produce an orthogonal pair of order 6 and that the 
same impossibility extends to 10, 14, . . . and in general to all the orders which 
are unevenly even. 


Euler was proven correct in his claim that an orthogonal pair of order 6 does not exist 
[G. Tarry, 1900]; however in 1960 Euler was shown to be wrong for all orders greater 
than 6. (See the Bose-Parker-Shrikhande theorem, Fact 3.) 


Examples: 

1. Two mutually orthogonal latin squares of side 3:

1 2 3     1 2 3
2 3 1     3 1 2
3 1 2     2 3 1

2. Three mutually orthogonal latin squares of side 4:

1 2 3 4     1 2 3 4     1 2 3 4
4 3 2 1     3 4 1 2     2 1 4 3
2 1 4 3     4 3 2 1     3 4 1 2
3 4 1 2     2 1 4 3     4 3 2 1

3. Two MOLS of order 10:

0417298365     0786935412
8152739406     6178094523
9826374510     5027819634
5983047621     9613782045
7698415032     3902478156
6709852143     8491357260
3071986254     7859246301
1234560789     4560123789
2345601897     1234560978
4560123978     2345601897


12.3.3 ORTHOGONAL ARRAYS 
Definition: 

An orthogonal array of size N, with k constraints (or of degree k), s levels (or of
order s), and strength t, denoted OA(N, k, s, t), is a k × N array with entries from a
set of s ≥ 2 symbols, having the property that in every t × N submatrix, every t × 1
column vector appears the same number λ = N/s^t of times. The parameter λ is the index
of the orthogonal array.

Note: An OA(N, k, s, t) is also denoted by OA_λ(t, k, s); in this notation, if t is omitted
it is understood to be 2, and if λ is omitted it is understood to be 1.

Facts:

1. An OA_λ(k, v) is equivalent to a transversal design TD_λ(k, v).

2. OA(t, k, s) are known as MDS codes in coding theory.

3. An OA_λ(k, n) exists only if k ≤ ⌊(λn^2 − 1)/(n − 1)⌋ (Bose-Bush bound). Generally one is
interested in finding the largest k for which there exists an OA_λ(k, n) (for a given λ
and n).

The following table gives the best known upper bounds and lower bounds for the
largest k for which there exists an OA_λ(k, n). Entries for which the upper and lower
bounds match are shown in boldface.


 n  \  λ    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18
 2   ub     3   7  11  15  19  23  27  31  35  39  43  47  51  55  59  63  67  71
     lb     =   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =   =
 3   ub     4   7  13  16  22  25  31  34  40  43  49  52  58  61  67  70  76  79
     lb     =   =   =  13  10   =  13  25   =  31  13  49  13  25  37  49  25   =
 4   ub     5   9  14  21  25  30  37  41  46  53  57  62  69  73  78  85  89  94
     lb     =   =  13   =  10  13  13   =  37  21  13  61  21  57  21   =  37  37
 5   ub     6  11  17  23  31  36  42  48  56  61  67  73  81  86  92  98 106 111
     lb     =   =   8  21   =  16  18  21  21   =  18  26  21  21  43  81  36  91
 6   ub     3  13  20  27  34  43  49  56  63  70  79  85  92  99 106 115 121 128
     lb     =   7   8   8   8  12   9  12  11  12   8  19   9  17  13  23  11  18
 7   ub     8  15  23  30  38  46  57  64  72  79  87  95 106 113 121 128 136 144
     lb     =   =   9  29  12  16   =  29  29  16  29  37  29   =  36  38  29  64
 8   ub     9  17  26  34  43  52  61  73  81  90  98 107 116 125 137 145 154 162
     lb     =   =   9  33   9  17  57   =  22  41  33  33  22  57  57   =  41  73
 9   ub    10  19  29  38  48  58  68  78  91 100 110 119 129 139 149 159 172 181
     lb     =   =  28  37  19  55  28  73   =  37  28 109  37  55  55  73  73   =
10   ub     9  21  32  42  53  64  75  86  97 111 121 132 142 153 164 175 186 197
     lb     4  10  12  10  10  20  12  11  12  12  11  28  12  12  12  19  12  30

ub: best known upper bound; lb: best known lower bound; "=" marks an entry whose lower
bound equals the upper bound (boldface in the printed table).
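
The correspondence between MOLS and orthogonal arrays (Fact 9 of §12.3.2) and the defining property of an OA can be sketched as follows. The function names and the row layout (row index, column index, then one row per square) are assumptions made for illustration; the strength check is brute force over all choices of t rows.

```python
from itertools import combinations


def oa_from_mols(squares):
    """Build a (len(squares) + 2) x n^2 array with one column per cell (i, j)."""
    n = len(squares[0])
    cells = [(i, j) for i in range(n) for j in range(n)]
    rows = [[i for i, _ in cells], [j for _, j in cells]]
    rows += [[sq[i][j] for i, j in cells] for sq in squares]
    return rows


def is_orthogonal_array(array, s, t):
    """Check that every t x N subarray contains each t-tuple exactly N / s^t times."""
    size = len(array[0])
    if size % s ** t:
        return False
    index = size // s ** t
    for chosen in combinations(array, t):
        counts = {}
        for column in zip(*chosen):
            counts[column] = counts.get(column, 0) + 1
        if len(counts) != s ** t or set(counts.values()) != {index}:
            return False
    return True


# Two MOLS of order 3 on symbols {0, 1, 2}: L_a(i, j) = i + a*j (mod 3).
squares = [[[(i + a * j) % 3 for j in range(3)] for i in range(3)] for a in (1, 2)]
oa = oa_from_mols(squares)
```

With these two squares, oa is a 4 × 9 array of strength 2 and index 1 on 3 symbols, that is, an OA(4, 3) in the abbreviated notation.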


12.4 MATROIDS 


Linearly independent sets of columns in a matrix and acyclic sets of edges in a graph 
share many similar properties. Hassler Whitney (1907-1989) aimed to capture these 
similarities when he defined matroids in 1935. These structures arise naturally in a 
variety of combinatorial contexts. Moreover, they are precisely the hereditary families 
of sets for which a greedy strategy produces an optimal set. 
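
The greedy strategy mentioned above can be written against an independence oracle. The sketch below is a minimal illustration, not a definitive implementation: greedy_basis and is_forest are hypothetical names, the oracle shown is that of a cycle matroid (acyclic edge sets), and the weighted graph is invented for the example.

```python
def greedy_basis(ground_set, weight, is_independent):
    """Scan elements by decreasing weight, keeping each one that preserves independence."""
    chosen = []
    for e in sorted(ground_set, key=weight, reverse=True):
        if is_independent(chosen + [e]):
            chosen.append(e)
    return chosen


def is_forest(edges):
    """Independence oracle of a cycle matroid: True if the edge set contains no cycle."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    for u, v, _ in edges:
        ru, rv = find(u), find(v)
        if ru == rv:
            return False
        parent[ru] = rv
    return True


# Hypothetical weighted graph: edges are (u, v, weight) triples.
edges = [("a", "b", 4), ("b", "c", 3), ("a", "c", 5), ("c", "d", 2), ("b", "d", 1)]
best = greedy_basis(edges, weight=lambda e: e[2], is_independent=is_forest)
```

For this graph the procedure returns a maximum-weight spanning tree; with a cycle-matroid oracle it coincides with Kruskal's algorithm run on decreasing weights.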
12.4.1 BASIC DEFINITIONS AND EXAMPLES


Definitions:

A matroid M (also written (E, ℐ) or (E(M), ℐ(M))) is a finite set E (the ground
set of M) and a collection ℐ of subsets of E (independent sets) such that:

• the empty set is independent;

• every subset of an independent set is independent (ℐ is hereditary);

• if X and Y are independent and |X| < |Y|, then there is e ∈ Y − X such that
X ∪ {e} is independent.

Subsets of E that are not in ℐ are dependent.

A basis of a matroid is a maximal independent set. The collection of bases of M is
denoted B(M).

A circuit of a matroid is a minimal dependent set. The collection of circuits of M is
denoted C(M).

Matroids M_1 and M_2 are isomorphic (M_1 ≅ M_2) if there is a one-to-one function φ
from E(M_1) onto E(M_2) that preserves independence; that is, a subset X of E(M_1) is
in ℐ(M_1) if and only if φ(X) is in ℐ(M_2).

For a matroid M with ground set E and A ⊆ E, all maximal independent subsets of A
have the same cardinality, called the rank of A, written r(A) or r_M(A). The rank r(M)
of M is r(E).

A spanning set of a matroid M is a subset of the ground set E of rank r(M).

A hyperplane of a matroid M is a maximal nonspanning set.

The closure cl(X) (or σ(X)) of X is { x ∈ E | r(X ∪ {x}) = r(X) }.

A set X is a closed set or flat if cl(X) = X.

A loop of M is an element e such that {e} is a circuit.

If {f, g} is a circuit, then f and g are parallel elements.

Matroid M is a simple matroid (or combinatorial geometry) if it has no loops and
no parallel elements.
A paving matroid is a matroid M in which all circuits have at least r(M) elements. 
Various classes of matroids are defined in Table 1.

Matroid M is in the specified class if M satisfies the indicated condition:

• graphic: M ≅ M(G) for some graph G;

• planar: M ≅ M(G) for some planar graph G;

• representable over F: M ≅ M[A] for some matrix A over the field F;

• binary: representable over GF(2), the 2-element field;

• ternary: representable over GF(3);

• regular: representable over all fields.

Table 1 Classes of matroids.

matroid M: uniform matroid, U_{m,n} (0 ≤ m ≤ n)
    ground set E(M):        {1, 2, ..., n}
    independent sets ℐ(M):  { I ⊆ E : |I| ≤ m }
    bases B(M):             { B ⊆ E : |B| = m }
    circuits C(M):          { C ⊆ E : |C| = m + 1 }

matroid M: M(G), cycle matroid of graph G
    ground set E(M):        E(G), edge-set of G
    independent sets ℐ(M):  { I ⊆ E(G) | I contains no cycle }
    bases B(M):             for connected G: edge-sets of spanning trees
    circuits C(M):          edge-sets of cycles

matroid M: M[A], vector matroid of matrix A over field F
    ground set E(M):        column labels of A
    independent sets ℐ(M):  { I ⊆ E | I labels a linearly independent multiset of columns }
    bases B(M):             labels of maximal linearly independent sets of columns
    circuits C(M):          labels of minimal linearly dependent multisets of columns

matroid M: transversal matroid, M(𝒜), of family 𝒜 = (A_1, A_2, ..., A_m) where A_j ⊆ E
    ground set E(M):        E
    independent sets ℐ(M):  partial transversals of 𝒜: sets {x_{i_1}, ..., x_{i_k}},
                            i_1 < ... < i_k and x_{i_j} ∈ A_{i_j}
    bases B(M):             maximal partial transversals of 𝒜
    circuits C(M):          minimal sets that are not partial transversals

Facts: 

1. If a matroid M is graphic, then M ≅ M(G) for some connected graph G.

2. Whitney’s 2-isomorphism theorem: Two graphs have isomorphic cycle matroids if
and only if one can be obtained from the other by performing a sequence of the following
operations:

• choose one vertex from each of two components and identify the chosen vertices;

• produce a new graph from which the original can be recovered by applying the
previous operation;

• in a graph that can be obtained from the disjoint union of two graphs G_1 and G_2
by identifying vertices u_1 and v_1 of G_1 with vertices u_2 and v_2 of G_2, twist the
graph by identifying, instead, u_1 with v_2 and u_2 with v_1.

3. If A' is obtained from the matrix A over the field F by elementary row operations,
deleting or adjoining zero rows, permuting columns, and multiplying columns by nonzero
scalars, then M[A'] = M[A]. The converse of this holds if and only if F is GF(2)
or GF(3).

4. If a matroid M is representable over F and r(M) ≥ 1, then M ≅ M[I_{r(M)} | D],
where I_{r(M)} | D consists of an r(M) × r(M) identity matrix followed by some other
matrix D over F.

5. A matroid M is regular if and only if M can be represented over the real numbers by
a totally unimodular matrix (a matrix for which all subdeterminants are 0, 1, or −1).

6. A matroid M is regular if and only if M is both binary and ternary.

7. The smallest matroids not representable over any field have 8 elements.

8. Conjecture: For all n, more than half of all matroids on {1, 2, ..., n} are paving.
9. The following table lists the numbers of nonisomorphic matroids, simple matroids,
and binary matroids with up to 8 elements:

|E(M)|      0   1   2   3    4    5    6     7       8
matroids    1   2   4   8   17   38   98   306   1,724
simple      1   1   1   2    4    9   26   101     950
binary      1   2   4   8   16   32   68   148     342


Examples: 

1. Let M be the matroid with E(M) = {1, 2, ..., 6} and C(M) = {{1}, {5, 6}, {3, 4, 5},
{3, 4, 6}}. Then B(M) = {{2, 3, 4}, {2, 3, 5}, {2, 3, 6}, {2, 4, 5}, {2, 4, 6}}. The following
figure shows that M is graphic and binary since M = M(G_1) = M(G_2) and M =
M[A] with A being interpreted over GF(2). M is regular since M = M[A] when A
is interpreted over any field F. Also M is transversal since M = M(𝒜) where 𝒜 =
({2}, {3, 4}, {4, 5, 6}).



2. Fano and non-Fano matroids: Given a finite set E of points in the plane and a
collection of lines (subsets of E with at least three elements), no two of which share
more than one common point, there is a matroid with ground set E whose circuits are
all sets of three collinear points and all sets of four points no three of which are collinear.
Two such matroids are shown in the following figure. Each has ground set {1, 2, ..., 7}.
On the right is the non-Fano matroid, F_7^-. It differs from the Fano matroid, F_7, on the
left by the collinearity of 4, 5, and 6 in the latter.

The matrix in this figure represents F_7 over all fields of characteristic 2, and
represents F_7^- over all other fields.

F_7 is binary but non-ternary; F_7^- is ternary but non-binary. Both are non-uniform,
non-regular, non-graphic, and non-transversal.
12.4.2 ALTERNATIVE AXIOM SYSTEMS


Matroids can be characterized by many different axiom systems. Some examples of
these systems follow. Throughout, E is assumed to be a finite set and 2^E stands for the
set of subsets of E.

Definitions:

Circuit axioms: A subset C of 2^E is the set of circuits of a matroid on E if and only
if C satisfies:

• ∅ ∉ C;

• no member of C is a proper subset of another;

• circuit elimination: if C_1, C_2 are distinct members of C and e ∈ C_1 ∩ C_2, then C
has a member C_3 such that C_3 ⊆ (C_1 ∪ C_2) − {e}.

Note: The circuit elimination axiom can be strengthened to the following:

• strong circuit elimination: if C_1, C_2 ∈ C, f ∈ C_1 − C_2, and e ∈ C_1 ∩ C_2, then C
has a member C_3 such that f ∈ C_3 ⊆ (C_1 ∪ C_2) − {e}.

Basis axioms: A subset B of 2^E is the set of bases of a matroid on E if and only if:

• B is nonempty;

• if B_1, B_2 ∈ B and x ∈ B_1 − B_2, then there is an element y ∈ B_2 − B_1 such that
(B_1 − {x}) ∪ {y} ∈ B.

Rank axioms: A function, r, from 2^E into the nonnegative integers is the rank
function of a matroid on E if and only if, for all subsets X, Y, Z of E:

• 0 ≤ r(X) ≤ |X|;

• if Y ⊆ Z, then r(Y) ≤ r(Z);

• submodularity: r(X ∪ Y) + r(X ∩ Y) ≤ r(X) + r(Y).

Closure axioms: A function, cl, from 2^E into 2^E is the closure operator of a matroid
on E if and only if, for all subsets X and Y of E:

• X ⊆ cl(X);

• if X ⊆ Y, then cl(X) ⊆ cl(Y);

• cl(cl(X)) = cl(X);

• MacLane-Steinitz exchange: if x ∈ E and y ∈ cl(X ∪ {x}) − cl(X), then x ∈
cl(X ∪ {y}).

Fact:

1. If M is a matroid with ground set E and I ⊆ E, the following statements are
equivalent:

• I is an independent set of M;

• I contains no circuit of M;

• some basis of M contains I;

• r(I) = |I|;

• for every element e of I, e ∉ cl(I − {e}).
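
The rank and closure axioms can be tested exhaustively on a small example. The sketch below uses the uniform matroid U_{2,4} as an assumed test case and computes the rank from the bases as r(X) = max |X ∩ B|; the names rank, closure, and satisfies_rank_axioms are illustrative.

```python
from itertools import combinations

E = frozenset(range(1, 5))
BASES = [frozenset(b) for b in combinations(sorted(E), 2)]    # bases of U_{2,4}


def subsets(ground):
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1) for c in combinations(items, k)]


def rank(x):
    """Size of a largest independent subset of X, computed as max |X ∩ B| over bases B."""
    return max(len(x & b) for b in BASES)


def closure(x):
    """cl(X) = { e in E : r(X ∪ {e}) = r(X) }."""
    return frozenset(e for e in E if rank(x | {e}) == rank(x))


def satisfies_rank_axioms():
    all_sets = subsets(E)
    for x in all_sets:
        if not 0 <= rank(x) <= len(x):
            return False
        for y in all_sets:
            if y <= x and rank(y) > rank(x):
                return False                                   # monotonicity fails
            if rank(x | y) + rank(x & y) > rank(x) + rank(y):
                return False                                   # submodularity fails
    return True
```

In U_{2,4} every single element is closed, while any two elements already span the whole ground set.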
12.4.3 DUALITY


Definitions:

For a matroid M, let B*(M) = { E(M) − B | B ∈ B(M) }. Then B*(M) is the set of
bases of a matroid M*, called the dual of M, whose ground set is also E(M).

Bases, circuits, loops, and independent sets of M* are called cobases, cocircuits,
coloops, and coindependent sets of M.

For a graph G, the cocycle matroid (or bond matroid) of G is the dual of M(G)
and is denoted by M*(G).

A matroid M is cographic if M ≅ M*(G) for some graph G.

A class of matroids is closed under duality if the dual of every member of the class
is also in the class.

Facts:

1. For all matroids M, (M*)* = M.

2. For all matroids M, the rank function of M* is given by r*(X) = |X| − r(M) +
r(E − X).

3. The cocircuits of every matroid M are the minimal sets having nonempty intersection
with every basis of M.

4. The cocircuits of every matroid M are the minimal nonempty sets C* such that
|C* ∩ C| ≠ 1 for every circuit C of M.

5. For every graph G, the circuits of M*(G) are the minimal edge cuts of G.

6. A graphic matroid is cographic if and only if it is planar.

7. The following classes of matroids are closed under duality: uniform matroids,
matroids representable over a fixed field F, planar matroids, and regular matroids. The
classes of graphic and transversal matroids are not closed under duality.

8. The following are special sets and their complements in a matroid M and M*:

X        basis of M     independent set of M    circuit of M
E − X    basis of M*    spanning set of M*      hyperplane of M*

Example:

1. The following are duals of some basic examples:

matroid                                    dual
U_{m,n}                                    U_{n−m,n}
M(G) (G plane)                             M(G*), where G* is the dual of G
M[I_r | D] ([I_r | D] an r × n matrix)     M[−D^T | I_{n−r}], same order of column
                                           labels as [I_r | D]
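
The definition of the dual and the rank formula of Fact 2 can be checked numerically. This is a minimal sketch assuming the uniform matroid U_{2,5} as the test case; rank_from_bases and dual_bases are hypothetical helper names.

```python
from itertools import combinations


def rank_from_bases(bases):
    """Return a rank function r(X) = max |X ∩ B| over the given bases."""
    return lambda x: max(len(frozenset(x) & b) for b in bases)


def dual_bases(ground, bases):
    """B*(M) = { E(M) - B : B in B(M) }."""
    return {ground - b for b in bases}


E = frozenset(range(1, 6))
B = {frozenset(c) for c in combinations(sorted(E), 2)}     # bases of U_{2,5}
B_star = dual_bases(E, B)                                   # bases of the dual

r, r_star = rank_from_bases(B), rank_from_bases(B_star)

# Fact 2: r*(X) = |X| - r(M) + r(E - X) for every subset X of E.
formula_holds = all(
    r_star(x) == len(x) - r(E) + r(E - frozenset(x))
    for k in range(len(E) + 1) for x in combinations(sorted(E), k)
)
```

The computed dual has all 3-element subsets as bases, in agreement with the example entry that the dual of U_{m,n} is U_{n−m,n}.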
12.4.4 FUNDAMENTAL OPERATIONS


Definitions:

Three basic constructions for matroids M, M_1, and M_2 are defined in the following
table. M\T and M/T are also written as M|(E − T) and M.(E − T) and are called the
restriction and contraction of M to E − T. M\{e} and M/{e} are written as M\e
and M/e.

matroid: M\T (deletion of T from M)
    ℐ:      { I ⊆ E(M) − T | I ∈ ℐ(M) }
    C:      { C ⊆ E(M) − T | C ∈ C(M) }
    rank:   r_{M\T}(X) = r_M(X)

matroid: M/T (contraction of T from M)
    ℐ:      { I ⊆ E(M) − T | I ∪ B_T ∈ ℐ(M) for some B_T in B(M|T) }
    C:      minimal nonempty members of { C − T | C ∈ C(M) }
    rank:   r_{M/T}(X) = r_M(X ∪ T) − r_M(T)

matroid: M_1 ⊕ M_2 (direct sum of M_1 and M_2)
    ℐ:      { I_1 ∪ I_2 | I_j ∈ ℐ(M_j) }
    C:      C(M_1) ∪ C(M_2)
    rank:   r_{M_1 ⊕ M_2}(X) = r_1(X ∩ E(M_1)) + r_2(X ∩ E(M_2))


Matroid N is a minor of M if N can be obtained from M by a sequence of deletions
and contractions. The minor N is proper if N ≠ M.

A matroid is connected if it cannot be written as the direct sum of two nonempty
matroids (matroids with nonempty ground sets).


Facts: 

In each of the following, M, M_1, and M_2 are matroids.

1. M\X\Y = M\(X ∪ Y) = M\Y\X; M/X/Y = M/(X ∪ Y) = M/Y/X; and
M\X/Y = M/Y\X.

2. M_1 ⊕ M_2 = M_2 ⊕ M_1.

3. (M/T)* = M*\T; and (M\T)* = M*/T. (Deletion and contraction are dual
operations.)

4. The scum theorem: Every minor of M can be written as M\X/Y for some
independent set Y and coindependent set X. (The name derives from the fact that an
isomorphic copy of every simple minor of a matroid occurs at (that is, floats to) the top
of the lattice.) (D. A. Higgs) [CrRo70]

5. The following are equivalent:

• M is connected;

• M* is connected;

• every two distinct elements of M are in a circuit;

• there is no proper nonempty subset T of E(M) such that M\T = M/T;

• there is no proper nonempty subset T of E(M) such that r(T) + r(E(M) − T) =
r(M);

• there is no proper nonempty subset T of E(M) such that r(T) + r*(T) = |T|.

6. If M is connected, then M is uniquely determined by the set of circuits containing
some fixed element of E(M).

7. If M is connected and e ∈ E(M), then M\e or M/e is connected.

8. F_7 ⊕ F_7^- is not representable over any field.

Examples:

1. U_{m,n}\e = U_{m,n−1} unless m = n, when U_{m,n}\e = U_{m−1,n−1}.

2. U_{m,n}/e = U_{m−1,n−1} unless m = 0, when U_{m,n}/e = U_{m,n−1}.

3. M(G)\e = M(G\e) where G\e is obtained from G by deleting the edge e.

4. M(G)/e = M(G/e) where G/e is obtained from G by contracting the edge e.

5. M[A]\e is the vector matroid of the matrix obtained by deleting column e from A.

6. If e corresponds to a standard basis vector in A, then M[A]/e is the vector matroid
of the matrix obtained by deleting both the column e and the row containing the one
of e.
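
Examples 1 and 2 can be reproduced directly from the definitions of deletion and contraction in terms of independent sets. The following Python sketch is an illustration with hypothetical function names; matroids are represented by their families of independent sets, and contract picks one maximal independent subset of T, which is enough because the result does not depend on the choice.

```python
from itertools import combinations


def uniform(m, elements):
    """Independent sets of the uniform matroid of rank m on the given elements."""
    items = sorted(elements)
    return {frozenset(c) for k in range(min(m, len(items)) + 1)
            for c in combinations(items, k)}


def delete(independent, t):
    """M \\ T: the independent sets of M that avoid T."""
    t = frozenset(t)
    return {i for i in independent if not i & t}


def contract(independent, t):
    """M / T: sets I disjoint from T such that I ∪ B_T is independent in M."""
    t = frozenset(t)
    inside = [i for i in independent if i <= t]
    b_t = max(inside, key=len)            # a basis of the restriction of M to T
    return {i for i in independent if not i & t and (i | b_t) in independent}


U24 = uniform(2, {1, 2, 3, 4})
```

Deleting an element from U_{2,4} gives U_{2,3}, and contracting it gives U_{1,3}; the boundary cases m = n and m = 0 behave as stated in the examples.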


12.4.5 CHARACTERIZATIONS 

Many matroid results characterize various classes of matroids. Some examples of such 
results appear below. The Venn diagram in the following figure indicates the relationship 
between certain matroid classes. 


[Figure: Venn diagram showing the relationships among classes of matroids.]



Definition: 

Let M_1 and M_2 be two binary matroids such that E(M_1) ∩ E(M_2) = T, M_1|T = M_2|T,
and no cocircuit of M_1 or M_2 is contained in T. The 2-sum and 3-sum of M_1 and M_2
are matroids on (E(M_1) ∪ E(M_2)) − T whose flats are those sets F − T such that
F ∩ E(M_i) is a flat of M_i for i = 1, 2. The 2-sum occurs when |T| = 1, |E(M_i)| ≥ 3,
and T is not a loop of M_i, and the 3-sum occurs when |E(M_i)| ≥ 7 and T is a 3-element
circuit of M_i.

Facts:

1. The following are equivalent for a matroid M: 

• M is uniform; 

• every circuit of M has r(M) + 1 elements; 

• every circuit of M meets every cocircuit of M. 

2. The following are equivalent for a matroid M: 

• M is binary; 

• for every circuit C and every cocircuit C*, |C ∩ C*| is even;

• for every circuit C and every cocircuit C*, |C ∩ C*| ≠ 3;

• for all circuits C_1 and C_2 of M, (C_1 − C_2) ∪ (C_2 − C_1) is a disjoint union of circuits.

3. The class of regular matroids is the class of matroids that can be constructed by
direct sums, 2-sums, and 3-sums from graphic matroids, cographic matroids, and copies
of R_10 (the matroid that is represented over GF(2) by the ten 5-tuples with exactly
three ones). (This last fact is the basis of a polynomial-time algorithm to determine
whether a real matrix is totally unimodular.)

4 . Excluded-minor theorems: Many classes of matroids are minor-closed; that is, every 
minor of a member of the class is also in the class. Such classes can be characterized by 
listing their excluded minors (those matroids that are not in the class but have all their 
proper minors in the class). Some important examples of such results are given in the 
following table. The class of transversal matroids is not minor-closed since a contraction 
of a transversal matroid need not be transversal. 


class     excluded minors        class     excluded minors
binary    U_{2,4}                ternary   U_{2,5}, U_{3,5}, F_7, F_7*
uniform   U_{0,1} ⊕ U_{1,1}      graphic   U_{2,4}, F_7, F_7*, M*(K_5), M*(K_{3,3})
paving    U_{0,1} ⊕ U_{2,2}      regular   U_{2,4}, F_7, F_7*
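
The parity characterization of binary matroids in Fact 2 explains why U_{2,4} appears as the excluded minor for that class. The following sketch is illustrative only: it assumes a matroid is given by its list of bases, uses hypothetical helper names, and enumerates circuits and cocircuits by brute force. U_{2,4} has a circuit and a cocircuit meeting in three elements, whereas the graphic matroid U_{2,3} = M(K_3) passes the test.

```python
from itertools import combinations

def circuits(ground, bases):
    """Minimal dependent sets of the matroid with the given bases."""
    bases = [frozenset(B) for B in bases]

    def independent(S):
        return any(S <= B for B in bases)

    found = []
    for k in range(1, len(ground) + 1):
        for S in map(frozenset, combinations(sorted(ground), k)):
            # Sets are visited by increasing size, so S is a circuit when it
            # is dependent and contains no circuit found earlier.
            if not independent(S) and not any(C <= S for C in found):
                found.append(S)
    return found

def cocircuits(ground, bases):
    """Circuits of the dual matroid, whose bases are complements of bases."""
    ground = frozenset(ground)
    return circuits(ground, [ground - frozenset(B) for B in bases])

def passes_parity_test(ground, bases):
    """True when |C ∩ C*| is even for every circuit C and cocircuit C*."""
    cs, ds = circuits(ground, bases), cocircuits(ground, bases)
    return all(len(C & D) % 2 == 0 for C in cs for D in ds)

U24 = [set(p) for p in combinations(range(4), 2)]
U23 = [set(p) for p in combinations(range(3), 2)]
print(passes_parity_test(range(4), U24))  # False
print(passes_parity_test(range(3), U23))  # True
```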


12.4.6 THE GREEDY ALGORITHM 

For a finite set E, let ℐ be a subset of 2^E satisfying the first two axioms for independent
sets in the definition of matroid (§12.4.1). Let w be a real-valued function on E. For
X ⊆ E, let w(X), the weight of X, be Σ_{x∈X} w(x), and let w(∅) = 0.

Facts:

1. Matroids are closely tied to the greedy algorithm (Algorithm 1), which accounts for
their importance in optimization problems.

2. ℐ (a subset of 2^E satisfying the first two axioms for independent sets in the definition
of matroid) is the set of independent sets of a matroid on E if and only if, for all
real-valued weight functions w on E, the set B_G produced by the greedy algorithm is a
maximal member of ℐ of maximum weight.

Algorithm 1: The greedy algorithm for (ℐ, w).

X_0 := ∅; j := 0

while E − X_j contains an element e such that X_j ∪ {e} ∈ ℐ

    e_{j+1} := an element e of maximum weight such that X_j ∪ {e} ∈ ℐ
    X_{j+1} := X_j ∪ {e_{j+1}}
    j := j + 1

B_G := X_j


Example:

1. Let G be a connected graph with each edge e having a cost c(e). Define w(e) = −c(e).
Then the greedy algorithm is just Kruskal's algorithm (§10.1.2) and B_G is the edge-set
of a spanning tree of minimum cost.
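
Algorithm 1 and the Kruskal example can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: it assumes the family of independent sets is available through an independence oracle, and the names `greedy` and `forest_oracle`, as well as the sample graph and its costs, are invented for the example.

```python
def greedy(ground, is_independent, weight):
    """Algorithm 1: repeatedly add a heaviest element that keeps X independent."""
    X = set()
    remaining = set(ground)
    while True:
        candidates = [e for e in remaining if is_independent(X | {e})]
        if not candidates:
            return X
        best = max(candidates, key=weight)
        X.add(best)
        remaining.remove(best)

def forest_oracle(edges):
    """Independence oracle of a graphic matroid: edge sets containing no cycle."""
    def is_independent(subset):
        parent = {}

        def find(v):
            parent.setdefault(v, v)
            while parent[v] != v:
                parent[v] = parent[parent[v]]
                v = parent[v]
            return v

        for e in subset:
            a, b = find(edges[e][0]), find(edges[e][1])
            if a == b:
                return False      # edge e would close a cycle
            parent[a] = b
        return True
    return is_independent

# Hypothetical graph: the 4-cycle 1-2-3-4 with the chord 1-3.
edges = {"a": (1, 2), "b": (2, 3), "c": (3, 4), "d": (4, 1), "e": (1, 3)}
cost = {"a": 1, "b": 4, "c": 2, "d": 5, "e": 3}
tree = greedy(edges, forest_oracle(edges), lambda e: -cost[e])
print(sorted(tree), sum(cost[e] for e in tree))  # ['a', 'c', 'e'] 6
```

Maximizing the weight w(e) = −c(e) selects the cheapest admissible edge at each step, which is why the output is a minimum-cost spanning tree.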


REFERENCES 

Printed Resources: 

[An93] I. Anderson, Combinatorial Designs: Construction Methods , Ellis Horwood, 
1993. 

[BeJuLe86] T. Beth, D. Jungnickel, and H. Lenz, Design Theory, Cambridge University 
Press, 1986. 

[CoDi96] C. J. Colbourn and J. H. Dinitz, eds., The CRC Handbook of Combinatorial 
Designs, CRC Press, 1996. (A comprehensive source of information on combinatorial 
designs.) 

[CrRo70] H. H. Crapo and G.-C. Rota, On the Foundations of Combinatorial Theory: 
Combinatorial Geometries, MIT Press, 1970. 

[DeKe74] J. Denes and A. D. Keedwell, Latin Squares and Their Applications, Academic 
Press, 1974. 

[DiSt92] J. H. Dinitz and D. R. Stinson, eds., Contemporary Design Theory: A Collection 
of Surveys, John Wiley & Sons, 1992. 

[Ox92] J. G. Oxley, Matroid Theory, Oxford University Press, 1992. (The main source
for §12.4; contains references for all stated results.)

[Re89] A. Recski, Matroid Theory and Applications in Electric Network Theory and
Statics, Springer-Verlag, 1989.

[StSt87] A. P. Street and D. J. Street, Combinatorics of Experimental Design, Clarendon 
Press, 1987. 

[Wa88] W. D. Wallis, Combinatorial Designs, Marcel Dekker, 1988.

[We76] D. J. A. Welsh, Matroid Theory, Academic Press, 1976.

[Wh86] N. White, ed., Theory of Matroids, Cambridge University Press, 1986.


© 2000 by CRC Press LLC 




Web Resources: 

http://cacr.math.uwaterloo.ca/~dstinson/papers/designnotes.ps (Contains a
complete set of notes (approximately 100 pages, 615K, in PostScript) from Doug
Stinson for a “Combinatorial Designs with Selected Applications” course.)

http://gams.nist.gov/ (Guide to Available Mathematical Software at the National
Institute of Standards and Technology: a cross-index and virtual repository of
mathematical and statistical software components of use in computational science and
engineering; contains a program for finding t-designs.)

http://kapis.www.wkap.nl/kapis/CGI-BIN/WORLD/journalhome.htm?0925-1022
(Designs, Codes and Cryptography has a site with its table of contents.)

http://lib.stat.cmu.edu/designs/ (The Designs Archive at StatLib: has some very
useful programs for making orthogonal arrays.)

http://members.aol.com/matroids (Contains information on matroids, has a software
package for binary matroids, and has links to several other sites on matroid
theory.)

http://sdcc12.ucsd.edu/~xm3dg/cover.html (La Jolla Covering Repository: contains
coverings C(v, k, t) with v ≤ 32, k ≤ 16, t ≤ 8, and fewer than 5,000 blocks.)

http://winnie.math.tu-berlin.de/~ziegler (Source of information on oriented
matroids.)

http://www.cecm.sfu.ca/organics/papers/lam/paper/html/paper.html (Information
on the search for a finite projective plane of order 10.)

http://www.cse.cuhk.edu.hk/~luk036/mdc/ (Contains tables of minimal difference
covers.)

http://www.emba.uvm.edu/~jcd/ (Journal of Combinatorial Designs: has information
including its contents back to volume 1.)

http://www.emba.uvm.edu/~dinitz/hcd.html (The Handbook of Combinatorial Designs
has a website that lists new results in design theory that have been discovered
since its 1996 publication.)

http://www.netlib.org/ (Netlib Repository: a collection of mathematical software,
papers, and databases.)

http://www.research.att.com/~njas/gosset/index.html (GOSSET: A general
purpose program for designing experiments.)

http://www.research.att.com/~njas/hadamard/ (Contains an extensive collection
of Hadamard matrices.)

http://www.research.att.com/~njas/oadir/index.html (Contains an extensive
collection of orthogonal arrays.)

http://www.utu.fi/~honkala/designs.ps (Notes from Ian Anderson and Iiro Honkala
for a “Short Course in Combinatorial Designs”; about 40 pages, 272K, in
PostScript.)





DISCRETE AND COMPUTATIONAL 
GEOMETRY 


13.1 Arrangements of Geometric Objects Ileana Streinu

13.1.1 Point Configurations

13.1.2 Line and Hyperplane Arrangements

13.1.3 Pseudoline Arrangements

13.1.4 Oriented Matroids

13.2 Space Filling Karoly Bezdek 

13.2.1 Packing 

13.2.2 Covering 

13.2.3 Tiling 

13.3 Combinatorial Geometry Janos Pach 

13.3.1 Convexity 

13.3.2 Incidences 

13.3.3 Distances 

13.3.4 Coloring 

13.4 Polyhedra Tamal K. Dey 

13.4.1 Geometric Properties of Polyhedra

13.4.2 Triangulations 

13.4.3 Face Numbers 

13.5 Algorithms and Complexity in Computational Geometry Jianer Chen 

13.5.1 Convex Hulls 

13.5.2 Triangulation Algorithms 

13.5.3 Voronoi Diagrams and Delaunay Triangulations 

13.5.4 Arrangements 

13.5.5 Visibility 

13.6 Geometric Data Structures and Searching Dina Kravets 

13.6.1 Point Location 

13.6.2 Range Searching 

13.6.3 Ray Shooting and Lines in Space 

13.7 Computational Techniques Nancy M. Amato 

13.7.1 Parallel Algorithms 

13.7.2 Randomized Algorithms 

13.7.3 Parametric Search 

13.7.4 Finite Precision 

13.7.5 Degeneracy Avoidance

13.8 Applications of Geometry W. Randolph Franklin

13.8.4 Motion Planning in Robotics

13.8.5 Convex Hull Applications


INTRODUCTION

This chapter outlines the theory and applications of various concepts arising in two 
rapidly growing, interrelated areas of geometry: discrete geometry (which deals with 
topics such as space filling, arrangements of geometric objects, and related combinatorial 
problems) and computational geometry (which deals with the many aspects of the design 
and analysis of geometric algorithms). A more extensive treatment of discrete and 
computational geometry can be found in [GoO’R97]. 


GLOSSARY 

anti-aliasing: the filtering out of high-frequency spatial components of a signal, to 
prevent artifacts, or aliases, from appearing in the output image. 

aperiodic (prototile) : a prototile in d-dimensional Euclidean space such that the pro- 
totile admits a tiling of the space, yet all such tilings are nonperiodic. 

arrangement (of lines in the plane) : the planar straight-line graph whose vertices are 
the intersection points of the lines and whose edges connect consecutive intersection 
points on each line (it is assumed that all lines intersect at a common point at 
infinity) . 

arrangement graph: a graph associated with a Euclidean or projective line arrange- 
ment, or a big circle arrangement. 

aspect ratio (of a simplex) : the ratio of the radius of the circumscribing sphere to the 
radius of the inscribing sphere of the simplex; for a triangulation, the largest aspect 
ratio of a simplex in the triangulation.

basis (of a point configuration in R^d): a subset of the point configuration that is a
simplex of the ambient space R^d.

basis (of a vector configuration in R^d): a subset of the vector configuration that is a
basis of the ambient space R^d.

big-circle arrangement: the intersection of a central plane arrangement with the
unit sphere in R^3.

boundary (of a polyhedron): the vertices, edges, and higher dimensional facets of the 
polyhedron. 

cell (of a line arrangement): a connected component of the complement in R^2 of the
union of the points on the lines.

centerpoint (of a point configuration P of size n): a point q, not necessarily in P,
such that for any hyperplane containing q there are at least ⌈n/(d+1)⌉ points in each
semi-space induced by the hyperplane.

central hyperplane arrangement (in R^d): a finite set of central hyperplanes, not
all of them going through the same point.

central plane arrangement (in R^3): a finite set of central planes.

chain: a planar straight-line graph with vertices v_1, ..., v_n and edges {v_1, v_2}, {v_2, v_3},
..., {v_{n−1}, v_n}.

chirotope: for an ordered point configuration, the set of all signed bases of the con- 
figuration; for an ordered vector configuration, the set of all signed bases of the 
configuration. 

circuit (of a set of labeled vectors V = {v_1, ..., v_n}): the signed set C = (C^+, C^−),
where C^+ = { j | α_j > 0 } and C^− = { j | α_j < 0 }, of indices of the non-null
coefficients α_j in a minimal linear dependency V' = {v_{i_1}, ..., v_{i_k}} of V with
Σ_{j=1}^{k} α_{i_j} v_{i_j} = 0.

class library: in an object-oriented computer language, a set of new data types and 
operations on them, activated by sending a data item a message. 

closed halfspace: the set of all points on a hyperplane and the points on one side of 
the same hyperplane. 

cluster of rank 3 hyperline sequences (associated with a vector configuration): 
the ordered set of stars, one for each point in the configuration. 

cluster of stars (associated with a point configuration): the ordered set of stars, one 
for each point in the configuration. 

cocircuit (of a labeled vector configuration V): a signed set C = (C^+, C^−) of the set
{1, 2, ..., n}, induced by a subset of d − 1 vectors spanning a central hyperplane h.
For an arbitrary orientation of h, C^+ is the set of indices of elements in V lying
in h^+ and C^− is the set of indices of elements in V lying in h^−.

computational convexity: the study of high-dimensional convex bodies.

contraction (on element i in a rank d central plane arrangement): the arrangement
obtained by identifying h_i with R^{d−1} and intersecting it with all the other hyperplanes
to obtain a rank d−1 arrangement with one element less.

convex: property of a subset of a Euclidean space that for every pair of points in the 
set the linear segment joining them is contained in the set. 

convex body: a closed and bounded convex set with nonempty interior.

convex d-polyhedron: the intersection of a finite number of closed halfspaces in R^d.

convex decomposition (of a polyhedron): its partition into interior disjoint convex 
pieces. 

convex hull (of a set of points): the smallest convex set containing the given set of 
points. 

convex polygon: a polytope in the plane. 

convex polyhedron : the intersection of a finite number of half-spaces. 

convex polytope: a bounded convex polyhedron. 

convex position: property of a set of points that it is the vertex set of a polytope. 

convex set: a subset of d-dimensional Euclidean space such that for every pair of 
distinct points in the set, the segment with these two points as endpoints is contained 
in the set. 

covering: a family of convex bodies in d-dimensional Euclidean space such that each 
point belongs to at least one of the convex bodies. 

cyclic d-polytope: the convex hull of a set of n ≥ d + 1 points on the moment curve
in R^d. The moment curve in R^d is defined parametrically by x(t) = (t, t^2, ..., t^d).

Davenport-Schinzel sequence (of order s): a sequence of characters over an alphabet
of size n such that no two consecutive characters are the same, and for any pair
of characters, a and b, there is no alternating subsequence of length s + 2 of the form
a ... b ... a ... b ... .

deletion: the removal of a point (vector, line, etc.) from a configuration and recording 
the oriented matroid (chirotope, circuits, cluster of stars, etc.) only for the remaining 
points. 

density (of a covering): the common value (if it exists) of the lower density and upper 
density of the covering. 

density (of a packing) : the common value (if it exists) of the lower density and upper 
density of the packing. 

dual polytopes: two polytopes P and Q such that there exists a one-to-one
correspondence δ between the set of faces of P and Q where two faces f_1, f_2 of P satisfy
f_1 ⊂ f_2 if and only if δ(f_1) ⊃ δ(f_2) in Q.

duality transformation: a mapping of points to lines and lines to points that pre- 
serves incidences. 

Euclidean hyperplane arrangement (in R^d): a finite set of affine hyperplanes, not
all of them going through the same point. 

Euclidean line arrangement: a finite set of planar lines, not all of them going 
through the same point. 

Euclidean pseudoconfiguration of points: a pair consisting of a planar set of points 
and a pseudoline arrangement, such that for every pair of distinct points there exists 
a unique pseudoline incident with them. 

k-face: an open set of dimension k that is part of the boundary of a polyhedron.
(0-faces, 1-faces, and (d−1)-faces of a d-polyhedron are called vertices, edges, and
facets.)

face vector (of a d-polyhedron): the d-dimensional vector (f_0, f_1, ..., f_{d−1}), where f_i
is the number of i-dimensional faces of the d-polyhedron.

face-to-face (tiling): a tiling of d-dimensional Euclidean space by convex d-polytopes
such that the intersection of any two tiles is a face of each tile, possibly the (improper) 
empty face. 

general position: property of a set of vectors that every subset of d elements is a 
basis; property of a set of points that every subset of d elements is a simplex. 

genus (of a manifold 3-polyhedron): the genus number of its boundary, if the boundary 
is a 2-manifold. 

geographic information system ( GIS) : an information system designed to capture, 
store, manipulate, analyze, and display spatial or geographically-referenced data. 

geometric constraint solving: the problem of locating a set of geometric elements 
given a set of constraints between them. 

graphical user interface (GUI): a mechanism that allows a user to interactively 
control a computer program with a bitmapped display by using a mouse or pointer 
to select menu items, move sliders or valuators, etc. 

Grassmann-Plücker relations (rank 3): the identities [123][145] − [124][135] +
[125][134] = 0 satisfied by the determinants [ijk] = det(v_i, v_j, v_k), for any five vectors
v_i, 1 ≤ i ≤ 5.

half-space: one of the two connected components of the complement of a hyperplane. 

ham-sandwich cut: a hyperplane that simultaneously bisects d point configurations 
in d-dimensional Euclidean space. 

hyperplane: in d dimensions the set of all points on a (d−1)-dimensional plane.

hyperplane arrangement: the partitioning of the Euclidean space R^d into connected
regions of different dimensions (vertices, edges, etc.) by a finite set of hyperplanes. 

isogonal (tiling): a tiling such that the group of symmetries acts transitively on the 
vertices of the tiles. 

isomorphic (vector or point configurations): configurations having the same order 
type, after possibly relabeling their elements. 

isotoxal (tiling): a tiling such that the group of symmetries acts transitively on the 
edges of the tiles. 

lattice (tiling): a tiling of d-dimensional Euclidean space by translates of a tile such 
that the corresponding translation vectors form a d-dimensional lattice. 

k-level (in a nonvertical arrangement of n lines): the lower boundary of the set of
points in R^2 having exactly k lines above and n−k below.

line arrangement: the partitioning of the plane into connected regions (cells, edges 
and vertices) induced by a finite set of lines. 

lower density (of a covering C): ϑ̲(C) = liminf_{R→+∞} ( Σ_{K_i ∩ B_R ≠ ∅} Vol(K_i) ) / Vol(B_R),
where each K_i is a convex body in the covering C of d-dimensional Euclidean space
and B_R is the closed ball of radius R centered at the origin.

lower density (of a packing P): δ̲(P) = liminf_{R→+∞} ( Σ_{K_i ⊂ B_R} Vol(K_i) ) / Vol(B_R),
where each K_i is a convex body in the packing P of d-dimensional Euclidean space
and B_R is the closed ball of radius R centered at the origin.

lower envelope (of a nonvertical line arrangement) : the half-plane intersection of the 
half-planes below the lines of the arrangement.

manifold d-polyhedron: a polyhedron whose boundary is topologically the same as
a (d−1)-manifold; i.e., every point on the boundary of a manifold d-polyhedron has
a small neighborhood on the boundary that looks like an open (d−1)-ball.

mathematical programming : the large-scale-optimization of an objective function 
of many variables subject to constraints. 

minor (of an oriented matroid given by hyperline sequences): an oriented matroid
obtained by a sequence of deletions and/or contractions.

monohedral (tiling): a tiling 𝒯 of d-dimensional Euclidean space in which all tiles are
congruent to one fixed set T, the (metrical) prototile of 𝒯.

nonconvex polyhedron : the union of a set of convex polyhedra such that the under- 
lying space is connected and nonconvex. 

non-manifold d-polyhedron: a d-polyhedron that does not have manifold boundary. 

nonperiodic (tiling): a tiling such that its group of symmetries contains no translation 
other than the identity. 

normal (tiling): a tiling of d-dimensional Euclidean space by convex polytopes such 
that there exist positive real numbers r and R such that each tile contains a Euclidean 
ball of radius r and is contained in a Euclidean ball of radius R. 

oracle: an algorithm that gives information about a convex body.

order type (of a vector or point configuration) : the collection of all semi-spaces of the 
configuration. 

oriented matroid: a pair ℳ = (n, ℒ), where ℒ, the set of covectors of ℳ, is a subset
of {+, −, 0}^n and satisfies the properties: 0 ∈ ℒ; if X ∈ ℒ, then −X ∈ ℒ; if X, Y ∈ ℒ,
then X ∘ Y ∈ ℒ; if X, Y ∈ ℒ and i ∈ S(X, Y) = { i | X_i = −Y_i ≠ 0 }, then there is
Z ∈ ℒ such that Z_i = 0 and, for each j ∉ S(X, Y), Z_j = (X ∘ Y)_j = (Y ∘ X)_j.

oriented matroid given by a chirotope: an abstract set of points labeled {1, ..., n},
together with a function satisfying the chirotope axioms.

packing: a family of convex bodies in d-dimensional Euclidean space such that no two 
have an interior point in common. 

parallel algorithm: an algorithm that concurrently uses more than one processing 
element during its execution. 

parallel random access machine (PRAM): a synchronous machine in which each 
processor is a sequential RAM, and processors communicate using a shared memory. 

parametric search: an algorithmic technique for solving optimization problems. 

periodic (tiling): a tiling of d-dimensional Euclidean space such that the group of all 
symmetries of the tiling contains translations in d linearly independent directions. 

planar straight-line graph: a planar graph such that each edge is a straight line. 

point configuration (of dimension d): a finite set of points affinely spanning R^d.

point location problem: the problem of determining which region of a given
subdivision of R^d contains a given point.

polar-duality of vectors and central planes (in R^3): a mapping associating with
a vector v in R^3 an oriented central plane h having v as its normal vector, and vice
versa.

polyhedron: the intersection of a finite number of closed half-spaces in d-dimensional
Euclidean space.

polytope: a bounded polyhedron.

d-polytope: a convex d-polyhedron for which there exists a d-dimensional cube con- 
taining it inside; that is, a bounded convex d-polyhedron. 

H-polytope: a polytope defined as the intersection of a finite number of half-spaces in
d-dimensional Euclidean space.

V-polytope: a polytope defined as the convex hull of a finite set of points in
d-dimensional Euclidean space.

projective line arrangement: a finite set of projective lines in the projective plane. 

prototile: the single tile used repeatedly in a monohedral tiling. 

pseudoline arrangement: a finite collection of simple planar curves that intersect 
pairwise in exactly one point, where they cross. 

Radon partition (of a set of labeled points P): a signed set C = (C^+, C^−) of points
of P such that the convex hull of the points in C^+ intersects the convex hull of the
points in C^−.

randomized algorithm: an algorithm that makes random choices during its execu- 
tion. 

range counting problem: the problem of counting the number of points of a given 
set of points that lie in a query range. 

range emptiness problem: the problem of determining if a query range contains any 
points of a given set of points. 

range reporting problem : the problem of determining all the points of a given set 
of points that lie in a query range. 

rank 3 hyperline sequence (associated with a vector v ∈ V ⊂ R^3): an alternating
circular sequence of subsets of indices in E_n obtained by rotating an oriented central
plane in counterclockwise order around the line through v.

ray: a half-line that is directed away from its endpoint. 

ray shooting problem : the problem of determining the first object in a set of geo- 
metric objects that is hit by a query ray. 

real random access machine (real RAM): a model of computation in which values
can be arbitrarily long real numbers, and all standard operations such as +, −, ×,
and ÷ can be performed in unit time regardless of operand length.

realizable (pseudoline arrangement): a pseudoline arrangement isomorphic to a line 
arrangement. 

reflex edges: edges of a nonconvex 3-polyhedron that subtend an inner dihedral angle 
greater than 180°. 

regular (polygon): a polygon with all sides congruent and all interior angles equal. 

regular (polytope): a d-polytope (d > 0) with all its facets regular (d−1)-polytopes
that are combinatorially equivalent; a regular 0-polytope is a vertex.

regular (tiling): a monohedral tiling of the plane with a regular polygon as prototile. 

reorientation (of a vector configuration V = {v_1, ..., v_n}): a vector configuration
V' = {v'_1, ..., v'_n} such that each v'_i is equal to v_i or −v_i.

semiregular (polyhedron): a convex polyhedron with each face a regular polygon, but
where more than one regular polygon can be used as a face.

semiregular (tiling): a tiling of the plane using n prototiles with the same numbers
of polygons around each vertex.

semi-space (of a configuration induced by a hyperplane): the set of indices of the 
configuration lying on one side of the hyperplane. 

semi-space (of a vector or point configuration): a semi-space induced by some hyper- 
plane. 

k-set (of a point configuration): a semi-space of the configuration of size k. 

d-dimensional simplex (or d-simplex ): a d-polytope with d + 1 vertices. 

simplicial complex: a triangulation of a polyhedron such that for any two simplices 
in the triangulation, either the intersection of the simplices is empty or is a face of 
both simplices. 

simplicial polytope: a polytope in which all faces are simplices. 

(standard affine) pseudo polar-duality: the association between an x-monotone
pseudoline arrangement L = {l_1, ..., l_n} given in slope order and a pseudo
configuration of points (P, L'), P = {p_1, ..., p_n}, given in increasing order of the x-coordinates
and with L' being x-monotone, satisfying the property that the clusters of stars
associated with L and with P are the same.

straight line dual: given the Voronoi diagram of a set {p_1, ..., p_n} of points in the
plane, the planar straight-line graph whose vertices are the points in the set, with
two vertices p_i and p_j adjacent if and only if the regions V(p_i) and V(p_j) share a
common edge.

strictly convex: the property of a convex set that its boundary contains no line 
segment. 

symmetry (of a tiling): a Euclidean motion that maps each tile of the tiling onto a 
tile of the tiling. 

tile : an element of a tiling. 

tiling (of Euclidean d-space): a countable family 𝒯 of closed topological d-cells of R^d
that cover R^d without gaps and overlaps.

triangulation (of a d-polyhedron) : a convex decomposition where each convex piece 
of the decomposition is a d-simplex. 

triangulation (of a simple polygon): an augmentation of the polygon with non-inter- 
secting diagonal edges connecting vertices of the polygon such that in the resulting 
planar straight-line graph every bounded face is a triangle. 

uniform chirotope : a chirotope function that takes nonzero values on all d-tuples. 

upper density (of a covering C): ϑ̄(C) = limsup_{R→+∞} ( Σ_{K_i ∩ B_R ≠ ∅} Vol(K_i) ) / Vol(B_R),
where each K_i is a convex body in the covering C of d-dimensional Euclidean space
and B_R is the closed ball of radius R centered at the origin.

upper density (of a packing P): δ̄(P) = limsup_{R→+∞} ( Σ_{K_i ⊂ B_R} Vol(K_i) ) / Vol(B_R),
where each K_i is a convex body in the packing P of d-dimensional Euclidean space
and B_R is the closed ball of radius R centered at the origin.

upper envelope (of a nonvertical line arrangement): the half-plane intersection of the 
half-planes above the lines of the arrangement. 

vector configuration (of dimension d): a finite set of vectors spanning R^d.

visibility graph: given n nonintersecting line segments in the plane, the graph whose
vertices are the endpoints of the line segments, with two vertices adjacent if and only 
if they are visible from each other. 

visibility problem: the problem of finding what is visible, given a configuration of 
objects and a viewpoint. 

Voronoi cell (with center c_i): the convex polyhedral set V_i = { x ∈ R^d : |x − c_i| =
min_j |x − c_j| }, where c_1, c_2, ... are centers of unit balls in a packing of d-dimensional
Euclidean space.

Voronoi diagram (of a set of points {p_1, ..., p_n} in d-dimensional Euclidean space):
the partition of d-dimensional Euclidean space into convex polytopes V(p_i) such that
V(p_i) is the locus of points that are closer to p_i than to any other point p_j.

zone (of a line in an arrangement): the set of cells of the arrangement intersected by 
the line. 

zonotope: the vector (Minkowski) sum of a finite number of line segments. 
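
The rank 3 Grassmann-Plücker relation listed in the glossary can be checked numerically. The sketch below is only an illustration: the function names and the five sample vectors are arbitrary choices, and exact integer arithmetic is used so that the left-hand side evaluates to exactly zero.

```python
def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with columns u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - v[0] * (u[1] * w[2] - u[2] * w[1])
            + w[0] * (u[1] * v[2] - u[2] * v[1]))

def grassmann_plucker(vs):
    """Left-hand side [123][145] - [124][135] + [125][134] for five vectors."""
    def b(i, j, k):
        # Bracket [ijk] with 1-based labels, as in the glossary entry.
        return det3(vs[i - 1], vs[j - 1], vs[k - 1])
    return b(1, 2, 3) * b(1, 4, 5) - b(1, 2, 4) * b(1, 3, 5) + b(1, 2, 5) * b(1, 3, 4)

# Five arbitrary integer vectors in R^3.
vectors = [(1, 0, 2), (0, 1, -1), (3, 1, 1), (2, -2, 5), (-1, 4, 0)]
print(grassmann_plucker(vectors))  # 0
```

The signs of the brackets [ijk] are the values of the chirotope of the vector configuration, so the same `det3` helper also yields signed bases.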


13.1 ARRANGEMENTS OF GEOMETRIC OBJECTS 

A wide range of applied fields (statistics, computer graphics, robotics, geographical 
databases) depend on solutions to geometric problems: polygon intersection, visibility 
computations, range searching, shortest paths among obstacles, just to name a few. 

These problems typically start with “consider a finite set of points (or lines, seg- 
ments, curves, hyperplanes, polygons, polyhedra, etc.)”. The combinatorial properties 
of these sets, or arrangements, of objects (incidence, order, partitioning, separation, con- 
vexity) set the foundations for the algorithms developed in the field of computational 
geometry. 

In this chapter attention is focused on the most studied and best understood ar- 
rangements of geometric objects: points, lines and hyperplanes. Introducing the con- 
cepts relies on linear algebra. The combinatorial properties studied belong however 
to a relatively new field, the theory of oriented matroids, which has sometimes been 
described as linear algebra without coordinates. 

Several fundamental types of questions are asked about these arrangements. The 
most basic is the classification problem, whose goal is to find combinatorial parameters 
allowing the partitioning of the (uncountable) set of all possible arrangements of n ob- 
jects into a finite number of equivalence classes. Examples of such structures for point 
and line arrangements include semi-spaces, Radon partitions, chirotopes, hyperline se- 
quences, etc. They satisfy simple properties known as axiomatic systems for oriented 
matroids, which lead to the definition of an abstract class of objects generalizing finite 
point and vector sets. In dimension 2 oriented matroids can be visualized topologically 
as pseudoline arrangements. The numerous definitions needed to introduce arrange- 
ments and oriented matroids will be complemented in this section by the most important 
facts, such as counting the number of finite point, line and pseudoline arrangements, 
deciding when a pseudoline arrangement is equivalent to a line arrangement, and basic 
algorithmic results.


13.1.1 POINT CONFIGURATIONS


The simplest geometric objects are points in some d-dimensional space. Most of the 
other objects of interest for applications of computational geometry (sets of segments, 
polygons, polyhedra) are built on top of, and inherit, geometric structure from sets of 
points. 

The setting for computational geometry problems is the Euclidean (affine) space lZ d 
and most of its fundamental concepts (convexity, proximity) belong here in a natural 
way. However, some standard techniques, such as polarity and duality, as well as the 
abstraction to oriented matroids, are better explained in the context of vector spaces. 

Several categories of concepts are introduced in this section, and developed and 
used in the subsequent subsections: vector and point configurations, hyperplanes and 
half-spaces, convexity, and some combinatorial parameters associated with vector or 
point configurations, relevant to applications (in statistics, pattern recognition, or
computational geometry): signed bases, semi-spaces, k-sets, centerpoints.

Definitions: 

The (standard) real vector space of dimension d is the vector space
$\mathcal{R}^d = \{\, x \mid x = (x_1, \ldots, x_d),\ x_i \in \mathcal{R} \,\}$, with vector addition
$x + y = (x_1 + y_1, \ldots, x_d + y_d)$ and scalar multiplication
$\alpha x = (\alpha x_1, \ldots, \alpha x_d)$. A vector in $\mathcal{R}^d$ is a d-dimensional vector.

A linear combination of a set of vectors $\{v_1, \ldots, v_n\}$ is a vector of the form
$\sum_{i=1}^{n} \alpha_i v_i$, for coefficients $\alpha_1, \ldots, \alpha_n \in \mathcal{R}$.

A linearly independent set of vectors is a set of vectors $\{v_1, \ldots, v_k\}$ such that
a linear combination of them equals the zero vector ($\sum_{i=1}^{k} \alpha_i v_i = 0$) if and only if
$\alpha_i = 0$ for all $i = 1, \ldots, k$.

A basis of $\mathcal{R}^d$ is a maximal set of linearly independent vectors, i.e., one that is no
longer independent if a new element is added.

A basis is an ordered basis if it is given as an ordered set. 

The sign of an ordered basis is the sign of the determinant of the $d \times d$ matrix with
columns given in order by the vectors of the ordered basis.

A linearly dependent set of vectors $V = \{v_1, \ldots, v_k\}$ is a set of vectors for which
there exists a linear combination with at least one nonzero coefficient yielding the zero
vector; i.e., $\sum_{i=1}^{k} \alpha_i v_i = 0$ with some $\alpha_i \neq 0$.

The linear space spanned by a set of vectors $V = \{v_1, \ldots, v_k\}$, $v_i \in \mathcal{R}^d$, is the set of
all linear combinations of vectors of V.

A linear k-dimensional subspace of $\mathcal{R}^d$ ($k \le d$) is the set of all linear combinations
of k linearly independent vectors $v_1, \ldots, v_k$ in $\mathcal{R}^d$.

A line through $v \in \mathcal{R}^d$ is the 1-dimensional linear subspace of $\mathcal{R}^d$ induced by $v \neq 0$.

Euclidean space of dimension d is $\mathcal{R}^d$ seen as an affine space. It is sometimes
identified with the d-dimensional affine hyperplane $x_{d+1} = 1$ in $\mathcal{R}^{d+1}$.

A (d-dimensional) point is an element of $\mathcal{R}^d$ seen as a Euclidean space.

An affine combination of a set of points $\{p_1, \ldots, p_n\}$ is a point of the form $\sum_{i=1}^{n} \alpha_i p_i$,
with $\alpha_i \in \mathcal{R}$ and $\sum_{i=1}^{n} \alpha_i = 1$.

An affinely independent set of points is a set of points $\{p_1, \ldots, p_k\}$ such that no
point is an affine combination of the others.

A simplex of $\mathcal{R}^d$ is a maximal set of affinely independent points. It is an ordered
simplex if it is given as an ordered set.

The extended matrix of an ordered simplex $\{p_1, \ldots, p_{d+1}\}$ is the $(d+1) \times (d+1)$
matrix with its ith column equal to $(p_i, 1)$.

The sign of an ordered simplex is the sign of the determinant of the extended matrix 
of the simplex. 
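
The two definitions above can be illustrated with a short computation. The following is a minimal sketch, not part of the Handbook: the function names `det` and `simplex_sign` are hypothetical, and exact rational elimination is only one convenient way to evaluate the determinant of the extended matrix.

```python
from fractions import Fraction


def det(matrix):
    """Determinant of a square matrix by Gaussian elimination over exact rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    sign = 1
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign  # a row swap flips the sign of the determinant
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result


def simplex_sign(points):
    """Sign (+1, -1, or 0) of an ordered list of d + 1 points in d-dimensional space.

    A result of 0 means the points are affinely dependent, so they are not a simplex.
    """
    d = len(points[0])
    assert len(points) == d + 1
    # Extended matrix: column i is (p_i, 1), assembled here row by row.
    rows = [[p[k] for p in points] for k in range(d)]
    rows.append([1] * (d + 1))
    value = det(rows)
    return (value > 0) - (value < 0)
```

For the planar points (0,0), (-1,1), (3,0), taken in this order, `simplex_sign` returns -1; exchanging the first two points gives +1, and three collinear points such as (0,0), (1,1), (2,2) give 0.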

An affinely dependent set of points $P = \{p_1, \ldots, p_k\}$ is a set of points such that one
of the points is an affine combination of the others.

The affine space spanned by a set of points $P = \{p_1, \ldots, p_k\}$, with $p_i \in \mathcal{R}^d$, is the set
of all affine combinations of points of P. It is an affine subspace of $\mathcal{R}^d$.

An affine k-dimensional subspace of $\mathcal{R}^d$ ($k \le d$) is the set of all affine combinations
of k + 1 affinely independent points $p_1, \ldots, p_{k+1}$ in $\mathcal{R}^d$.

A linear function is a function $h : \mathcal{R}^d \to \mathcal{R}$ such that $h(x_1, \ldots, x_d) = \sum_{i=1}^{d} a_i x_i + a_{d+1}$.

A linear function is homogeneous if $a_{d+1} = 0$.

The affine hyperplane induced by a linear function h is the set
$h^0 = \{\, x \in \mathcal{R}^d \mid h(x) = 0 \,\}$.

An affine hyperplane is a central hyperplane if h is a homogeneous linear function. 

An oriented hyperplane is a hyperplane, together with a choice of a positive side for
the hyperplane. This amounts to choosing a (homogeneous or affine) linear function h
to generate it, together with all those of the form $\alpha h$, $\alpha > 0$.

A reorientation of an oriented hyperplane is a swapping of the negative and positive
sides of the hyperplane (or, changing the generating linear function h to $-h$).

The open half-spaces induced by an oriented hyperplane h are $h^+ = \{\, x \mid h(x) > 0 \,\}$
(the positive side of $h^0$) and $h^- = \{\, x \mid h(x) < 0 \,\}$ (the negative side of $h^0$). The
sets $h^+$, $h^0$, and $h^-$ form a partition of $\mathcal{R}^d$: $\mathcal{R}^d = h^+ \cup h^- \cup h^0$, and $h^+$, $h^-$, and $h^0$
are pairwise disjoint.

The closed half-spaces induced by h are $h^+ \cup h^0$ and $h^- \cup h^0$.

A convex combination of a set of points $\{p_1, \ldots, p_n\}$ is a point of the form $\sum_{i=1}^{n} \alpha_i p_i$
with $\alpha_i \in \mathcal{R}$, $\alpha_i \ge 0$, and $\sum_{i=1}^{n} \alpha_i = 1$.

The segment with endpoints $p_1 \neq p_2$ is the set of all convex combinations of $p_1$ and $p_2$.

A set of points $\{p_1, \ldots, p_k\}$ is convexly independent if no point is a convex
combination of the others. The points are also said to be in convex position.

A convex set in $\mathcal{R}^d$ is a set $S \subseteq \mathcal{R}^d$ such that if $p_1$ and $p_2$ are distinct points in S,
then the segment with endpoints $p_1$ and $p_2$ is contained in S.

The convex hull of a finite set of points P is the set of all convex combinations of
points of P.

A convex polytope is the convex hull of a finite set of points. Its boundary consists
of faces of dimension 0 (vertices), 1 (edges), ..., $d - 1$ (facets).

The face description of a convex polytope is a data structure storing information
about all the faces and their incidences.

A convex polygon is the convex hull of a finite set of points in $\mathcal{R}^2$.

A vector configuration [point configuration] of dimension d is a finite set of n
vectors $\{v_1, \ldots, v_n\}$ ($v_i \in \mathcal{R}^d$) spanning $\mathcal{R}^d$ [points $\{p_1, \ldots, p_n\}$ ($p_i \in \mathcal{R}^d$) affinely
spanning $\mathcal{R}^d$].

A configuration is labeled if its elements are given as an ordered set. (It may be given
as the set of columns of a $d \times n$ matrix with real entries.)

The rank of a vector configuration [point configuration] in $\mathcal{R}^d$ is the number d [d + 1].

A set of vectors [points] in $\mathcal{R}^d$ is in general position if every subset of d [d + 1] elements
is a basis [simplex].

An affine configuration (or acyclic vector configuration ) is a configuration with 
a central hyperplane containing all the vectors of the configuration on one side. 

A reorientation of a vector configuration $V = \{v_1, \ldots, v_n\}$ is a vector configuration
$V' = \{v'_1, \ldots, v'_n\}$ with each $v'_i$ equal to either $v_i$ or $-v_i$.

A reorientation class is the set of all labeled vector configurations which are
reorientation equivalent.

A point configuration $P \subset \mathcal{R}^d$ induced by an acyclic vector configuration
$V \subset \mathcal{R}^{d+1}$ contained in a half-space $h^+$ is the set of all points p obtained as follows:
take the affine plane $h'$ in $h^+$ parallel to h and tangent to the unit sphere $S^d$ in $\mathcal{R}^{d+1}$;
the intersection of the line through vector $v \in V$ with the plane $h'$ is a point $p \in h'$. $\mathcal{R}^d$
is identified with the affine plane $h'$.

A semi-space of a vector configuration [point configuration] V induced by an oriented 
central hyperplane [affine hyperplane] h is the set of indices of the elements in V lying 
on one side of the hyperplane. 

A semi-space of a vector configuration [point configuration] V is a semi-space 
induced by some hyperplane h. 

The order type of a vector or point configuration V is the collection of all semi-spaces 
of V. 

Isomorphic vector or point configurations are configurations having the same order 
type, after possibly relabeling their elements. 

A k-set of a point configuration P is a semi-space of P of size k (0 < k < n). 

A centerpoint of a point configuration P of size n is a point q, not necessarily in P,
such that if h is a hyperplane containing q there are at least $\lceil n/(d+1) \rceil$ points in each
semi-space induced by h.

A ham-sandwich cut is a hyperplane that simultaneously bisects d point configurations
$P_1, P_2, \ldots, P_d$ in $\mathcal{R}^d$.


Facts: 

1. A basis of the vector space $\mathcal{R}^d$ has d elements and a simplex of the affine space $\mathcal{R}^d$
has d + 1 elements.

2. The rank of the $d \times n$ matrix associated with an n-vector configuration in $\mathcal{R}^d$ is d;
the rank of the matrix associated with a point configuration in $\mathcal{R}^d$, extended with a row
of 1s, is d + 1.

3. The determinant of a $d \times d$ matrix whose columns are a basis of $\mathcal{R}^d$ and the
determinant of the $(d+1) \times (d+1)$ extended matrix of a simplex in $\mathcal{R}^d$ are nonzero.

4. If $\{v_1, \ldots, v_d\}$ is a basis, then for any vector $v_{d+1}$, $\{v_1, \ldots, v_d, v_{d+1}\}$ is linearly
dependent.

5. The intersection of linear subspaces [affine subspaces, convex sets] of $\mathcal{R}^d$ is
linear [affine, convex].

6. The intersection of k central hyperplanes [affine hyperplanes] in $\mathcal{R}^d$ ($k \le d$) is a
linear subspace [affine subspace]. Its dimension is at least $d - k$.

7. Every affine subspace of $\mathcal{R}^d$ is a convex set.

8. Carathéodory's theorem: Each point in the convex hull of a set of points $P \subset \mathcal{R}^d$
lies in the convex hull of a subset of P with at most d + 1 points.

9. Radon's theorem: Each set of at least d + 2 points in $\mathcal{R}^d$ can be partitioned into
two disjoint sets whose convex hulls intersect in a nonempty set.

10. Helly's theorem: Let S be a finite family of n convex sets in $\mathcal{R}^d$ ($n \ge d + 1$). If
every subfamily of d + 1 sets in S has a nonempty intersection, then there is a point
common to all the sets in S.

11. Every point configuration admits a centerpoint.

12. For every d configurations of points in $\mathcal{R}^d$, there exists a ham-sandwich cut.

13. Upper bound theorem: The number of facets of the convex hull of an n-point
configuration in $\mathcal{R}^d$ is $O(n^{\lfloor d/2 \rfloor})$. This bound is attained for configurations of points on
the moment curve $P = \{\, p(t) \mid t \in \{t_1, \ldots, t_n\} \subset \mathcal{R} \,\}$, where $p(t) = (t, t^2, \ldots, t^d) \in \mathcal{R}^d$.

14. The number of semi-spaces of a rank d + 1 vector or point configuration of n
elements is $O(n^d)$. The maximum is attained for points in general position.

15. For d = 2, let $e_k(n)$ be the maximum number of k-sets of any planar n-point
configuration. Then $\Omega(n \log k) \le e_k(n) \le O(n k^{1/3})$.

16. Erdős-Szekeres problem: If c(k) is the maximum number of planar points in general
position such that no k are in convex position, then $2^{k-2} \le c(k) \le$

17. The face description of the convex hull of a point set of size n in $\mathcal{R}^d$ can be
computed optimally in $O(n \log n)$ time if d = 2 or d = 3, and $O(n^{\lfloor d/2 \rfloor})$ for d > 3.

18. A ham-sandwich cut in dimension 2 can be found in linear time.
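
For the planar case of Fact 17, the sketch below computes the vertices of a convex hull in $O(n \log n)$ time. It is an illustration only: the Handbook does not name an algorithm here, and Andrew's monotone chain was chosen for this sketch because it is short; the function names `cross` and `convex_hull` are hypothetical.

```python
def cross(o, a, b):
    """Twice the signed area of the triangle (o, a, b); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Vertices of the convex hull in counterclockwise order.

    The list starts at the lexicographically smallest point. Points lying in
    the interior of a hull edge are not reported as vertices.
    """
    pts = sorted(set(points))  # the sort dominates the running time
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:  # lower chain, left to right
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):  # upper chain, right to left
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other one.
    return lower[:-1] + upper[:-1]
```

Applied to the five points (0,0), (-1,1), (3,0), (1,1), (2,2), labeled 1 to 5 in this order, the function returns the points labeled 2, 1, 3, 5, which is the cyclic sequence {1, 3, 5, 2}; point 4 lies inside the hull.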


Examples: 

1. The configuration of points in the following figure is given by the columns of the
matrix $\begin{pmatrix} 0 & -1 & 3 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 \end{pmatrix}$. It is not in general position: the three points 1, 4, and 5 are
collinear. The extended matrix is $\begin{pmatrix} 0 & -1 & 3 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 \\ 1 & 1 & 1 & 1 & 1 \end{pmatrix}$. Because
$\det \begin{pmatrix} 0 & -1 & 3 \\ 0 & 1 & 0 \\ 1 & 1 & 1 \end{pmatrix} < 0$, the simplex 123 is negative. Some semi-spaces are: {3} (1-set),
{2,5} (2-set), {1,3,4} (3-set), etc. The convex hull is {1, 3, 5, 2}.

2. The two configurations of points from the figure of Example 1 and part (a) of the
following figure are isomorphic, but those in parts (a) and (b) of the following figure
are not. This can be seen because, for example, they have different numbers of points
on their convex hulls.

[Figure: point configurations (a) and (b) with labeled points.]


3. The grey point in the following figure is a centerpoint of the point configuration of
black points. Some of the separating lines have been shown: they have at least 1/3 of the
points on each side.



4. The line in the following figure is a ham-sandwich cut: it simultaneously bisects the
black and the white points.

13.1.2 LINE AND HYPERPLANE ARRANGEMENTS

Line arrangements and affine point configurations in the plane are related via polar- 
duality, a transformation which is better understood in terms of 3-dimensional vectors 
and central planes or using the projective and spherical models. Several types of com- 
binatorial data can be directly translated from the primal setting to the polar. As a 
consequence, theorems and algorithms on line arrangements follow directly from their 
counterparts on point configurations, and vice versa. This powerful tool has been used 
successfully in computational geometry for the design of efficient algorithms. It also
generalizes to higher dimensions, where hyperplane arrangements are polar-dual to point
configurations in $\mathcal{R}^d$.


Definitions: 

A (Euclidean) line in $\mathcal{R}^2$ is an affine subspace of dimension 1. A line is induced
by a linear function $l(x, y) = ax + by + c$ and any of its multiples of the form $\alpha l$. A
line is oriented if a direction has been chosen for it. Its induced half-spaces are called
half-planes and denoted by $l^+$ and $l^-$.

A nonvertical line is a line given by an equation of the form $y = ax + b$, where a is
called the slope and b the y-intercept of the line. It is oriented in increasing order of
the x-coordinates of its points and its induced half-planes are above/below it.

A (Euclidean) line arrangement is a set $\mathcal{L} = \{l_1, \ldots, l_n\}$ of planar lines, not all of
them going through the same point; that is, $\bigcap_{i=1}^{n} l_i = \emptyset$. If the lines are oriented, this
is an arrangement of oriented lines. It is labeled if it is given as an ordered set.

A line arrangement is in general position if no three lines have a point in common.

An x-monotone curve is a curve intersecting each vertical line in exactly one point.

A half-plane intersection is the planar region lying in the intersection of a finite set 
of half-planes. It is described by the (circular) list of the lines incident to its boundary. 

An upper envelope [lower envelope] of a nonvertical line arrangement is the half- 
plane intersection of the half-planes above [below] the lines of the arrangement. 

The k-level in a nonvertical arrangement of n lines ($1 \le k \le n$) is the lower boundary
of the set of points in $\mathcal{R}^2$ having exactly k lines above and $n - k$ below.

The cell of a line arrangement is a connected component of the complement in $\mathcal{R}^2$
of the union of the points on the lines.

The zone of a line l in an arrangement $\mathcal{L}$ ($l \notin \mathcal{L}$) is the set of cells of the
arrangement $\mathcal{L}$ intersected by l.

A central plane arrangement in $\mathcal{R}^3$ is a finite set of central planes. The arrangement
is oriented if the planes are oriented.

An acyclic (or affine) central plane arrangement in $\mathcal{R}^3$ is a central plane arrangement
such that there is a point in $\mathcal{R}^3$ that lies on the positive side of all these planes.

The (standard) line arrangement induced by a central plane arrangement is the
arrangement of the lines of intersection of the central planes with an affine plane h
in $\mathcal{R}^3$ that is not parallel with any plane of the arrangement. If the central planes are
oriented, an orientation is induced on the lines by keeping the positive side of a line
within the positive side of the corresponding plane.

A big circle is the intersection of the unit sphere $S^2$ in $\mathcal{R}^3$ with a central plane. If the
plane is oriented, the circle is given an orientation so that the positive side of the plane
lies on the left of the circle.

A big-circle arrangement is the intersection of a central plane arrangement with the
unit sphere $S^2$ in $\mathcal{R}^3$. It is oriented if the planes are oriented.

The big-circle arrangement induced by a central plane arrangement is the
arrangement of the big circles of intersection of the central planes with the sphere $S^2$
in $\mathcal{R}^3$. It is oriented if the planes are oriented.

The projective plane $P^2$ is the sphere $S^2$ in $\mathcal{R}^3$ with the antipodal points identified.

A projective line is the projective curve induced by identifying the antipodal points
of a big circle on $S^2$.

A projective line arrangement is a finite set of projective lines in the projective
plane $P^2$.

The projective line arrangement induced by a central plane arrangement is
the projective arrangement obtained by the antipodal point identification of the
big-circle arrangement on $S^2$ induced by the central plane arrangement.

An arrangement graph is a graph associated with a Euclidean or projective line 
arrangement, or a big circle arrangement. Its vertices correspond to intersection points 
of lines (or circles) and its edges correspond to line (or arc) segments between two 
intersection points. [Note: For the Euclidean case, typically only the bounded line 
segments are considered as edges (but by adding extra dummy vertices “at infinity”, 
the infinite extremities of each line among the edges can be included) . For the Euclidean 
or spherical case, if the lines are oriented, the arrangement graph is directed, with the 
edges oriented to be compatible with the orientation of the lines or circles.] 

Isomorphic arrangements are arrangements having isomorphic arrangement graphs. 
(This applies to Euclidean lines, big-circles (oriented or not), and projective lines.) 

A polar-duality of vectors and central planes in $\mathcal{R}^3$ is a mapping $\mathcal{D}$ associating
a vector $v \in \mathcal{R}^3$ with an oriented central plane having v as its normal vector, and vice
versa.

A polar-duality of points and lines in the affine space $\mathcal{R}^2$ is any mapping $\mathcal{D}$
associating a point $p \in \mathcal{R}^2$ with an oriented line l in $\mathcal{R}^2$ and vice versa, by the following
general procedure: map the points to vectors via some imbedding of $\mathcal{R}^2$ as an affine
plane in $\mathcal{R}^3$, apply the polar-duality of vectors and central planes, and then intersect
the polar central planes with some affine plane (identified with $\mathcal{R}^2$) to get lines.

A (standard) affine polar-duality is a mapping $\mathcal{D}$ between nonvertical lines and
points in $\mathcal{R}^2$, associating the point $(a, -b)$ with the line $y = ax + b$, and vice versa.
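
The standard affine polar-duality is easy to experiment with. The sketch below is a hedged illustration, not the Handbook's code: lines are represented as hypothetical `(slope, intercept)` pairs, all function names are invented for this example, and exact rationals are used so that above/below tests are not affected by rounding.

```python
from fractions import Fraction


def dual_line_of_point(p):
    """The point (a, -b) corresponds to the line y = a*x + b, returned as (a, b)."""
    px, py = p
    return (px, -py)


def dual_point_of_line(line):
    """The line y = a*x + b corresponds to the point (a, -b)."""
    a, b = line
    return (a, -b)


def is_above(point, line):
    """True if the point lies strictly above the nonvertical line y = a*x + b."""
    a, b = line
    return point[1] > a * point[0] + b


def line_through(p, q):
    """The nonvertical line through two points with distinct x-coordinates."""
    a = Fraction(q[1] - p[1], q[0] - p[0])
    return (a, p[1] - a * p[0])


def intersection(l1, l2):
    """Intersection point of two nonvertical lines with distinct slopes."""
    (a1, b1), (a2, b2) = l1, l2
    x = Fraction(b2 - b1, a1 - a2)
    return (x, a1 * x + b1)
```

With the points (0,0), (-1,1), (3,0), (1,1), (2,2), `dual_line_of_point` yields the lines $y = 0$, $y = -x - 1$, $y = 3x$, $y = x - 1$, and $y = 2x - 2$. Point 2 is above the line through points 1 and 3, and correspondingly the intersection point of the dual lines 1 and 3 is above the dual line 2.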

A Euclidean hyperplane [central hyperplane] arrangement in $\mathcal{R}^d$ is a finite set
$\mathcal{H} = \{h_1, \ldots, h_n\}$ of affine hyperplanes [central hyperplanes], not all of them going
through the same point. If the hyperplanes are oriented, the arrangement is oriented.

The following are generalizations to an arbitrary affine space $\mathcal{R}^d$ [vector space $\mathcal{R}^{d+1}$] of
previously defined concepts in affine dimension 2 [vector space $\mathcal{R}^3$]:

• arrangements of big $(d-1)$-spheres on $S^d$ generalize big-circle arrangements
on $S^2$;

• projective arrangements of hyperplanes in $P^d$ generalize projective arrangements
of lines in $P^2$;

• the polar-duality between vectors and central hyperplanes in $\mathcal{R}^{d+1}$ associates
with a vector the hyperplane normal to it;

• the face lattice of a hyperplane (central, affine, projective) or sphere arrangement,
a data structure storing information on faces and their incidences, generalizes
the arrangement graph, and is used to define isomorphism of arrangements;

• the k-level in an affine arrangement of nonvertical hyperplanes is the lower
boundary of the set of points having exactly k hyperplanes above them.


Facts: 

1. A bounded cell in a Euclidean line arrangement is a convex polygon. 

2. The k-level of a nonvertical line arrangement is an x-monotone piecewise linear curve
incident with vertices and lines of the arrangement.

3. The upper envelope is the 0-level of an arrangement of nonvertical lines.

4. In a simple big-circle arrangement, every pair of big circles intersect in exactly two
points, which are antipodal on the sphere.

5. The arrangement graphs of planar line arrangements or spherical big-circle
arrangements are planar imbedded graphs. The arrangement graph of a projective line
arrangement is projective-planar. The faces or cells of these graphs are the connected
components of the complement of the union of lines or circles.

6. The association among central plane arrangements, big-circle arrangements, and
projective arrangements preserves isomorphisms. The standard association of an affine
line arrangement and big-circle arrangement to an acyclic plane arrangement preserves
isomorphisms.

7. In a simple Euclidean arrangement of n lines, the number of vertices is $\binom{n}{2}$, the
number of segments (bounded or unbounded) is $n^2$, and the number of cells (bounded
or unbounded) is $\binom{n}{2} + n + 1$. No nonsimple arrangement exceeds these values.

8. Zone theorem: The total number of edges (bounded or unbounded) in a zone of an
arrangement of n lines is at most 6n.

9. $\mathcal{D}(\mathcal{D}(v)) = v$ and $\mathcal{D}(\mathcal{D}(h)) = h$, for every vector v and hyperplane h.

10. Incidence preserving: If $v \in h$, then $\mathcal{D}(h) \in \mathcal{D}(v)$, for every vector v and
hyperplane h.

11. Orientation preserving: If $v \in h^+$, then $\mathcal{D}(h) \in \mathcal{D}(v)^+$, for every vector v and
hyperplane h.

12. Basic properties of the standard affine polar-dual transformation:

• polar-duality preserves above/below properties: if a point p is above line l, then
the polar line $\mathcal{D}(p)$ is below the polar point $\mathcal{D}(l)$;

• the polar-dual of a configuration of points P is an arrangement of nonvertical
lines $\mathcal{L} = \mathcal{D}(P)$, and vice versa;

• the polar-dual of a set of points given in increasing order of their x-coordinates
is a set of lines given in increasing order of their slopes;

• the polar-dual of the set of points on the convex hull of P is the set of lines
on the upper and lower envelopes of the polar arrangement $\mathcal{D}(P)$: convex hull
computation dualizes to half-plane intersection;

• semi-spaces of P dualize to vertices, edges, and cells of the polar arrangement
$\mathcal{D}(P)$;

• isomorphic arrangements of lines dualize to isomorphic configurations of points;

• the polar-duals of lines $p_i p_j$ inducing $(k-1)$-sets and k-sets in a point configuration
$P = \{p_1, \ldots, p_n\}$ are the vertices $l_i \cap l_j$ on levels k and $n - k$ of the polar-dual
arrangement $\mathcal{L} = \{l_1, \ldots, l_n\}$.

13. The polar-dual of an acyclic vector configuration in $\mathcal{R}^3$ is an acyclic central plane
arrangement.

14. The upper envelope of a line arrangement can be computed optimally in $O(n \log n)$
time.

15. The arrangement graph of a line arrangement can be computed in $O(n^2)$ time and
space.

16. The face incidence lattice of a hyperplane arrangement in $\mathcal{R}^d$ can be computed
in $O(n^d)$ time and space.

17. The standard polar-duality is ubiquitously used in computational geometry. For
example, it is used to derive algorithms for half-plane intersection from convex hull
algorithms, to translate between line slope and point x-coordinate selection, and to
compute the visibility graph in $O(n^2)$ time using the polar-dual arrangement graph.
See [Ed87] (Chapter 12) for a collection of such problems.

18. In computational geometry, k-levels are related to the furthest k-neighbors Voronoi
diagrams via a lifting transformation that reduces the computation of dimension d
Voronoi diagrams to dimension d + 1 arrangements of hyperplanes.

19. k-levels in arrangements, as polar-duals to k-sets, have an abundance of
applications in statistics.
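
The $O(n \log n)$ bound of Fact 14 can be illustrated by a sort followed by a single stack scan. This is a minimal sketch under stated assumptions rather than the method the Handbook has in mind: lines are hypothetical `(slope, intercept)` pairs, the name `upper_envelope` is invented here, and the scan is the dual of a monotone-chain hull computation.

```python
def upper_envelope(lines):
    """Indices of the lines bounding the region above all lines, from left to right.

    lines: list of (slope, intercept) pairs describing y = slope * x + intercept.
    A line that touches the envelope in a single point only is not reported.
    """
    # Among parallel lines only the highest one can appear on the envelope.
    best = {}
    for idx, (a, b) in enumerate(lines):
        if a not in best or b > lines[best[a]][1]:
            best[a] = idx
    order = sorted(best.values(), key=lambda i: lines[i][0])  # increasing slope

    hull = []
    for i in order:
        a3, b3 = lines[i]
        while len(hull) >= 2:
            a1, b1 = lines[hull[-2]]
            a2, b2 = lines[hull[-1]]
            # The middle line is hidden if the new line overtakes line 1 no later
            # than the middle line does (cross-multiplied; slopes are increasing).
            if (b1 - b3) * (a2 - a1) <= (b1 - b2) * (a3 - a1):
                hull.pop()
            else:
                break
        hull.append(i)
    return hull
```

For the lines $y = 0$, $y = -x - 1$, $y = 3x$, $y = x - 1$, $y = 2x - 2$ (in this order), the function returns the indices `[1, 0, 2]`, that is, the second, first, and third lines, which are the polar-duals of the points on the lower part of the convex hull of (0,0), (-1,1), (3,0), (1,1), (2,2).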

Examples: 

1. The following figure shows a line arrangement in general position. The arrangement
graph has 10 vertices corresponding to the black points (which could be labeled with
pairs of indices of lines such as 12, 13, etc.), 15 edges corresponding to the bounded
line segments such as (12, 13), and $2 \times 5 = 10$ unbounded edges. The upper envelope
is bounded from below by the 0-level, whose list of lines is {1, 2, 3, 5}. The dashed
piecewise linear curve is the 2-level.

2. The zone of line 3 in the line arrangement {1, 2, 4, 5} from the figure of Example 1
is depicted in the following figure. It has 5 cells (1 bounded, 4 unbounded), whose
boundaries sum up to 12 segments.



3. The line arrangements in the figure for Example 1 and part (a) of the following figure 
are isomorphic. Those in parts (a) and (b) of the following figure are not isomorphic. 




(a) (b) 

4. The following figure illustrates the standard polar-duality. The arrangement is
polar-dual to the configuration of points in the figure of §13.1.1 Example 1; hence the lines are
given by the equations 1: $y = 0$, 2: $y = -x - 1$, 3: $y = 3x$, 4: $y = x - 1$, and 5: $y = 2x - 2$.
In the primal configuration, point 2 is above line 13. In the polar-dual, line 2 is below
the intersection point of lines 1 and 3.
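
The counts of Fact 7 can be checked by brute force on small arrangements. The sketch below is an illustration with hypothetical names: lines are `(slope, intercept)` pairs, each line is cut into one more segment than it has vertices, and the number of cells is obtained from Euler's relation cells = segments - vertices + 1, which is an assumption brought in here and not stated in the text.

```python
from fractions import Fraction
from itertools import combinations


def arrangement_counts(lines):
    """Return (vertices, segments, cells) of an arrangement of nonvertical lines.

    lines: list of distinct (slope, intercept) pairs with rational entries.
    Segments and cells are counted whether bounded or unbounded.
    """
    on_line = [set() for _ in lines]  # distinct vertices lying on each line
    vertices = set()
    for (i, (a1, b1)), (j, (a2, b2)) in combinations(enumerate(lines), 2):
        if a1 == a2:
            continue  # parallel lines do not meet
        x = Fraction(b2 - b1, a1 - a2)
        v = (x, a1 * x + b1)
        vertices.add(v)
        on_line[i].add(v)
        on_line[j].add(v)
    segments = sum(len(s) + 1 for s in on_line)
    cells = segments - len(vertices) + 1
    return len(vertices), segments, cells
```

The arrangement of Example 4 is not simple, because lines 1, 4, and 5 pass through the point (1, 0); the function returns (8, 22, 15) for it. Replacing line 5 by $y = 2x - 3$ gives a simple arrangement with (10, 25, 16), matching $\binom{5}{2}$, $5^2$, and $\binom{5}{2} + 5 + 1$.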

13.1.3 PSEUDOLINE ARRANGEMENTS

Pseudoline arrangements represent a natural generalization of line arrangements, retain- 
ing incidence and orientation properties, but not straightness. They provide a topolog- 
ical representation for rank 3 oriented matroids (see §13.1.4), which in turn abstract 
combinatorial properties of vector configurations and oriented line arrangements. 

Definitions: 

A pseudoline is a planar curve (which may be given an orientation). It is x-monotone 
if the curve is x-monotone. 

A pseudoline arrangement is a finite collection of simple planar curves that intersect
pairwise in exactly one point, where they cross, and not all of which have a point in
common. It is labeled if the pseudolines are given in a fixed order $\{l_1, \ldots, l_n\}$, and
oriented if the pseudolines are oriented.

A pseudoline arrangement is realizable or stretchable if it is isomorphic to a line 
arrangement. 

Note: The following terms, defined in §13.1.1 and §13.1.2 for line arrangements, have
straightforward generalizations to pseudoline arrangements: open half-planes, general
position, cell, arrangement graph, isomorphism, upper/lower envelope, k-level, zone.

Facts: 

1. Every arrangement of pseudolines is isomorphic to an arrangement of x-monotone 
piecewise linear pseudolines (a wiring diagram as in the figure of Example 2). 

2. The arrangement graph of a pseudoline arrangement in general position has $\binom{n}{2}$
vertices, $n^2$ edges, and $\binom{n}{2} + n + 1$ faces.

3. The number of edges in a zone of a pseudoline in a pseudoline arrangement is at
most 6n.

4. Let $e_k(n)$ be the number of edges on the k-level of a pseudoline arrangement. Then
$\Omega(n \log k) \le e_k(n) \le O(n k^{1/3})$.

5. The logarithm of the number of isomorphism classes of pseudoline arrangements
is $\Theta(n^2)$. The same number for line arrangements is $\Theta(n \log n)$.

6. There exist nonstretchable pseudoline arrangements.

7. It is NP-hard to decide whether a pseudoline arrangement is stretchable.

8. Pseudoline stretchability is decidable in PSPACE.

9 . Assume that a predicate is given for deciding when the intersection point of two 
pseudolines is above or below a third pseudoline. Then the algorithms for computing the 
upper envelopes, half-space intersection, or the arrangement graph of a line arrangement 
can be adapted to work for pseudoline arrangements. 

10. In computational geometry, some algorithmic solutions can be found by reducing
the problem to structures behaving like pseudolines, for example, computing the
boundary of a union or intersection of unit circles in $\mathcal{R}^2$, all having at least one point
in common.

11 . It is an open problem whether better algorithms can be devised by making explicit 
use of the straightness of the lines in geometric problems. So far, the only problem 
where an explicit gap in efficiency between lines and pseudolines has been displayed is 
in the number of comparisons needed to sort the x-coordinates of the vertices of line
versus x-monotone pseudoline arrangements.

Examples:

1. The pseudoline arrangement in this figure is stretchable, because it is isomorphic to 
the line arrangement in the figure of §13.1.2, Example 1. 



2 . The following figure shows a standard way of representing a pseudoline arrangement 
as an x-monotone piecewise linear curve arrangement called a wiring diagram. The 
arrangement is the same as in the figure of Example 1. 



3 . A nonstretchable pseudoline arrangement: The theorem of Pappus in plane geom- 
etry states that if the points 1, 2, 3 and 4, 5, 6 are collinear and in this order on two 
lines, then the three intersection points of the pairs of lines 7 = (15,24), 8 = (16,34), 
and 9 = (26,35) are also collinear. See the following figure. The perturbed arrange- 
ment obtained by replacing the line through 7,8,9 with the dashed pseudoline is not 
stretchable, since it violates the theorem of Pappus. 

13.1.4 ORIENTED MATROIDS


General oriented matroids are abstractions of vector configurations and oriented hy- 
perplane arrangements. Affine oriented matroids model the corresponding situation in 
affine spaces. They capture in various types of data structures (semi-spaces, chirotopes, 
Radon partitions, arrangement graphs, hyperline sequences) combinatorial information 
about n-element configurations. This forms the basis of the classification of all n-point 
sets into a finite number of equivalence classes. Each data structure satisfies a set of 
simple properties, or axioms, which characterize a wider class of objects collectively 
referred to as oriented matroids. 

To simplify the exposition, in some cases only the axiomatization corresponding to 
points in general position will be presented. Not all oriented matroids arise geometrically 
from vector sets, but they do have a topological representation via pseudohyperplane 
arrangements. In rank 3, affine oriented matroids are modeled by pseudoline
arrangements. Many geometric algorithms working with line arrangements or, by polarity,
point configurations, make use of no more than oriented matroid properties and can be
extended to pseudolines.

As potential applications, oriented matroids lay the foundations for a rigorous the- 
ory of geometric program verification and testing. 

Definitions: 

Notation: 

$E_n = \{1, \ldots, n\}$ and $\bar{E}_n = \{1, \ldots, n\} \cup \{\bar{1}, \ldots, \bar{n}\}$.

Triplets $(i, j, k) \in E_n^3$ are denoted ijk.

A signed set $X = (X^+, X^-)$ is a partition of a finite set X into a positive part $X^+$
and a negative part $X^-$. That is, $X = X^+ \cup X^-$ and $X^+ \cap X^- = \emptyset$. In $\bar{E}_n$ a signed
set may be denoted as a signed sequence of indices, such as $1\bar{2}34$ for ({1, 3, 4}, {2}).

The complement of a signed set $X = (X^+, X^-)$ is the set $-X = (X^-, X^+)$.

The support of a signed set $X = (X^+, X^-)$ is the unsigned set X.

The size of a signed set $X = (X^+, X^-)$ is the size of the support of X.

The signed double covering of a finite set X is the set $\bar{X} = X^+ \cup X^-$, where
$X^+ = X$ and $X^- = \{\, \bar{x} \mid x \in X \,\}$ is a signed distinct copy of X (its elements called
negated elements), $X^+ \cap X^- = \emptyset$. If $x \in X^-$, then $\bar{x}$ is the corresponding nonnegated
element in $X^+$.

A basis of a vector configuration [point configuration] $V \subset \mathcal{R}^d$ is a subset of V, identified
by a d-set of indices, which is a basis [simplex] of the ambient space $\mathcal{R}^d$. A signed basis
is an ordered basis together with its sign.

The chirotope of an ordered vector configuration [point configuration] is the set of all
signed bases of V.

An alternating function is a function $f: E_n^d \to \mathcal{R}$ such that the sign of $f(i_1, \ldots, i_d)$
is preserved under an even permutation and negated under an odd permutation of the
d-tuple $(i_1, \ldots, i_d)$.

An antisymmetric function is a function $f: \bar{E}_n^d \to \mathcal{R}$ such that its sign changes when
one of the parameters is negated. [For example, $f(\bar{i}_1, i_2, \ldots, i_d) = -f(i_1, i_2, \ldots, i_d)$.]
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The ( rank 3) Grassmann-Pliicker relations are the identities 
[1 2 3] [1 4 5] - [1 2 4] [1 3 5] + [1 2 5] [1 3 4] = 0 

satisfied by the determinants [ij k] = det(u*, Vj,Vk), for any five vectors Vi, 1 < i < 5. 
The ( rank d) Grassmann-Pliicker relations are the identities 
[h ■ ■ ■ id- 21 2] [ii . . . id- 234] - [h ■ ■ ■ id- 21 3 ] [h . . . id- 2 2 4] + 

[h ■ ■ ■ id- 21 4] [*i . . . id- 2‘2 3] = 0 

satisfied by the determinants [ii . . . id- 2 j k] = det (v^, . . . ,Vi d 2 ,Vj,Vk), for any d + 2 
vectors Vi j (1 < j < d— 2) and Vi (1 < i < 4). 
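
The rank 3 relation can be checked numerically. The following is a minimal sketch, not part of the handbook: the helper names `det3` and `grassmann_plucker_rank3` and the sample vectors are illustrative assumptions. With integer coordinates the left-hand side is computed exactly, so it should evaluate to zero for any choice of five vectors.

```python
def det3(a, b, c):
    """Determinant of the 3x3 matrix whose rows are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - a[1] * (b[0] * c[2] - b[2] * c[0])
            + a[2] * (b[0] * c[1] - b[1] * c[0]))


def grassmann_plucker_rank3(v):
    """Left-hand side of the rank 3 relation.

    v maps the labels 1..5 to vectors in R^3 (hypothetical input format).
    """
    def br(i, j, k):
        # Bracket [i j k] = det(v_i, v_j, v_k).
        return det3(v[i], v[j], v[k])

    return (br(1, 2, 3) * br(1, 4, 5)
            - br(1, 2, 4) * br(1, 3, 5)
            + br(1, 2, 5) * br(1, 3, 4))


# Arbitrary sample vectors with integer coordinates (exact arithmetic).
vectors = {1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1), 4: (1, 1, 1), 5: (1, 2, 3)}
print(grassmann_plucker_rank3(vectors))
```

Integer inputs are used deliberately: with floating-point coordinates the result would only be close to zero, and a tolerance would be needed.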

Chirotope axioms (rank d): A function $\chi: \bar{E}_n^d \to \{-1, 0, +1\}$ is a chirotope of rank d
if it satisfies the following conditions:

• $\chi$ is alternating and antisymmetric;

• for any d + 2 generic points $i_1, \ldots, i_{d-2}, 1, 2, 3, 4$, the signs $\chi(i_1 \ldots i_{d-2}\,j\,k)$ of the six
d-tuples involved in the Grassmann-Plücker relations are such that equality is
possible.

A uniform chirotope is a chirotope function that takes nonzero values on all d-tuples. 

Chirotope axioms (uniform, rank 3): A function $\chi: \bar{E}_n^3 \to \{-1, +1\}$ is a uniform
chirotope of rank 3 if it satisfies the following conditions:

• $\chi$ is alternating and antisymmetric;

• for any 5 generic points $a, b, i, j, k$, if $\chi(a\,b\,i) = \chi(a\,b\,j) = \chi(a\,b\,k) = +1$ and
$\chi(a\,i\,j) = \chi(a\,j\,k) = +1$, then $\chi(a\,i\,k) = +1$.

The chirotope $\chi$ is affine if, in addition, it satisfies the axiom:

• for any four points $i, j, k, l$, if $\chi(i\,j\,k) = \chi(i\,k\,l) = \chi(i\,l\,j) = +1$, then $\chi(j\,k\,l) = +1$.

An oriented matroid given by a chirotope is an abstract set of points labeled
$\{1, \ldots, n\}$, together with a function $\chi$ satisfying the chirotope axioms.

The circuit of a set of labeled vectors $V = \{v_1, \ldots, v_n\}$ is the signed set $C = (C^+, C^-)$,
where $C^+ = \{j \mid \alpha_j > 0\}$ and $C^- = \{j \mid \alpha_j < 0\}$, of indices of the nonnull coefficients
$\alpha_j$ in a minimal linear dependency $V' = \{v_{i_1}, \ldots, v_{i_k}\}$ of $V$ with $\sum_{j=1}^{k} \alpha_j v_{i_j} = 0$.
If $C$ is a circuit, its complement $-C$ is also a circuit.

A Radon partition of a set of labeled points $P$ is a signed set $C = (C^+, C^-)$ of points
of $P$ such that the convex hull of the points in $C^+$ intersects the convex hull of the
points in $C^-$.

A minimal Radon partition (or circuit) is a Radon partition whose support is
minimal with respect to set inclusion.

An oriented matroid of an ordered vector [point] configuration $V$ given by its circuits
is the set of all circuits of $V$.

A set $\mathcal{C}$ of signed subsets of $E_n$ satisfies the circuit axioms if:

• $\emptyset \notin \mathcal{C}$;

• if $C \in \mathcal{C}$ then $-C \in \mathcal{C}$;

• (minimality): if $C = (C^+, C^-)$ is a circuit, then no proper subset of the support of $C$ is
the support of another circuit;

• (exchange): if $C_1$ and $C_2$ are two circuits such that $C_1 \ne -C_2$ and $e \in C_1^+ \cap C_2^-$,
then there exists another circuit $D$ such that $e \notin D$, $D^+ \subset C_1^+ \cup C_2^+$, and
$D^- \subset C_1^- \cup C_2^-$.

The oriented matroid given by its circuits is an abstract set $E_n$ together with a
set of signed sets $\mathcal{C}$ satisfying the circuit axioms.

A cocircuit of a labeled vector configuration $V$ is a signed set $C = (C^+, C^-)$ of $E_n$,
induced by a subset of d - 1 vectors spanning a central hyperplane $h$. For an arbitrary
orientation of $h$, $C^+$ is the set of indices of elements in $V$ lying in $h^+$ and $C^-$ is the set
of indices of elements in $V$ lying in $h^-$.

The cocircuit axioms are the conditions obtained from the circuit axioms by replacing
“circuit” with “cocircuit”.

An oriented matroid given by its cocircuits is an abstract set $E_n$, together with
a set of signed sets $\mathcal{C}$ satisfying the cocircuit axioms.

A circular sequence of period k is a doubly infinite sequence $(q_i)_{i \in Z}$ with $q_i = q_{i+k}$
for all $i \in Z$.

A signed permutation of a set $S$ is a permutation of $S$ whose elements are also
assigned a sign; for example, $1\,\bar{3}\,4\,2$, where 3 is negative and 1, 2, and 4 are positive.

An alternating circular sequence is a circular sequence $(q_i)_{i \in Z}$ with half-period k,
defined with elements from a signed double covering, $q_i \in \bar{X}$, and satisfying the property
$q_i = \bar{q}_{i+k}$ for all $i \in Z$.

A representation of an alternating circular sequence can be obtained by any of its
subsequences of k consecutive elements (half period) $\{q_1, \ldots, q_k\}$.

A star (or rank 3 hyperline sequence) associated with a point $p_i \in P \subset \mathcal{R}^2$ [vector
$v_i \in V \subset \mathcal{R}^3$] is an alternating circular sequence of subsets of indices in $E_n$ obtained
by rotating an oriented line [oriented central plane] in counterclockwise order around $p_i$
[the line through vector $v_i$] and recording the successive positions where it coincides with
lines [central planes] defined by pairs of points $(p_i, p_j)$ with $p_j \in P \setminus \{p_i\}$ [vectors $(v_i, v_j)$,
with $v_j \in V \setminus \{v_i\}$]. If a point $p_j$ is encountered by the rotating line in the positive
direction from $p_i$, it is recorded as a positive index; otherwise it is recorded
as a negative index. When the points are not in general position, several may become
simultaneously collinear with the rotating line, and they are recorded as one subset $L_j^i$.
The sequence is denoted by a half-period $s_i = (L_1^i, L_2^i, \ldots, L_{k_i}^i)$, where $L_j^i \subset \bar{E}_n \setminus \{i, \bar{i}\}$.

A cluster of stars (or rank 3 hyperline sequences) associated with a point (or
vector) configuration $P$ is the ordered set of n stars $s_1, \ldots, s_n$, one for each point
$p_i \in P$.

A uniform cluster of stars is a cluster of stars corresponding to a set of points in 
general position. (Each star is a sequence of individual indices.) 

An oriented matroid of a vector (or point) set V given by its cluster of stars 

is the cluster of stars associated with V. 

A star (or rank 3 hyperline sequence) associated with an element $c_i$ of a
big-circle arrangement $C = \{c_1, \ldots, c_n\}$ on $S^2$ is an alternating circular sequence
of subsets of indices in $E_n$ obtained by traversing the oriented big-circle in its given
direction and recording in order the intersections of $c_i$ with the other big-circles $c_j$
($j \ne i$). Each intersection is recorded as a signed index j: positive if $c_j$ crosses $c_i$ from
left to right, negative otherwise.

The cluster of stars (or rank 3 hyperline sequences) associated with a big-circle
arrangement is the set of n stars $s_1, \ldots, s_n$, one for each circle $c_i \in C$.

The cluster of stars associated with an oriented central plane arrangement
in $\mathcal{R}^3$ [line arrangement in $\mathcal{R}^2$] is the cluster of stars of the big-circle arrangement
associated with the central plane arrangement [with the central plane arrangement induced
by the line arrangement via the imbedding of $\mathcal{R}^2$ as the plane $z = 1$ in $\mathcal{R}^3$].

The cluster of stars associated with a pseudoline arrangement [a pseudocon- 
figuration of points ] is the generalization from straight lines to pseudolines obtained 
by recording the order of the vertices of the arrangement along a pseudoline (positive 
or negative according to whether the line crossing at that vertex comes from right or 
left) [the circular counterclockwise order of the pseudolines incident with a point]. 

A cluster of star permutations is an ordered set of alternating circular sequences
$s_1, \ldots, s_n$ with the property that the representative half-period of sequence $s_i$ is a signed
permutation of the set $E_n \setminus \{i\}$.

A chirotope function associated with a set of cluster of stars permutations
is a function $\chi: \bar{E}_n^3 \to \{-1, +1\}$ defined by $\chi(i\,j\,k) = +1$ if, in the ith sequence
$s_i$ and in a half period of it where both j and k occur positively, j occurs
before k. Otherwise $\chi(i\,j\,k) = -1$.

A set $E_n$, together with an ordered set of alternating circular sequences $s_1, \ldots, s_n$,
satisfies the cluster of stars axioms (uniform, rank 3) if the sequences are
cluster of star permutations whose associated chirotope function is alternating.

A (uniform, rank 3) oriented matroid given by its cluster of stars is a set $E_n$
together with n alternating sequences satisfying the cluster of stars axioms.

An abstract set $E_n$, together with a set of $n^{d-2}$ (uniform) alternating sequences (indexed
by (d-2)-tuples $(i_1, \ldots, i_{d-2})$), is an oriented matroid given by its hyperline
sequences (uniform, rank d) if the chirotope function $\chi: \bar{E}_n^d \to \{-1, +1\}$
associated with it is alternating. [The function $\chi: \bar{E}_n^d \to \{-1, +1\}$ is defined by
$\chi(i_1 \ldots i_{d-2}\,j\,k) = +1$ if in the star indexed by $i_1, \ldots, i_{d-2}$ and in a half period where
both j and k occur positively, j occurs before k. Otherwise $\chi(i_1 \ldots i_{d-2}\,j\,k) = -1$.]

Deletion is the removal of a point (vector, line, etc.) from a configuration and recording 
the oriented matroid (chirotope, circuits, cluster of stars, etc.) only for the remaining 
points. 

In a rank d central plane arrangement, the contraction on element i is obtained by
identifying $h_i$ with $\mathcal{R}^{d-1}$ and intersecting it with all the other hyperplanes to obtain a
rank d-1 arrangement with one element less.

The oriented matroid obtained by a one-element deletion in the hypersequence
representation is the matroid obtained by removing the element from all the hyperline
sequences of the original oriented matroid, and discarding all hyperline sequences whose
labels contain that element.

The oriented matroid obtained by a one-element contraction in the hypersequence
representation is the matroid obtained by retaining only the hyperline sequences
whose labels contain the element, and dropping it from the labels.

A rank 2 contraction of a cluster of stars is one of the stars. 

A minor of an oriented matroid given by hyperline sequences is an oriented matroid
obtained by a sequence of deletions and/or contractions.

A (Euclidean) pseudoconfiguration of points is a pair $(P, L)$ with $P = \{p_1, \ldots, p_n\}$
a planar set of points and $L = \{l_1, \ldots, l_m\}$ a pseudoline arrangement, such that for
every pair of distinct points $(p_i, p_j)$ there exists a unique pseudoline $l_{ij} \in L$ incident
with them.

If a pseudoline arrangement is intersected with a vertical line $l_v$ ($x = -M$), for some
very large constant $M$, and all the vertices of the arrangement lie to the right of $l_v$, then
the order in which the pseudolines in $L$ cross $l_v$ (decreasing by the y-coordinates of the
crossings) is the (increasing) slope order of the pseudolines.

The (standard affine) pseudo polar-duality is the association between an x-monotone
pseudoline arrangement $L = \{l_1, \ldots, l_n\}$ given in slope order and a pseudoconfiguration
of points $(P, L')$, $P = \{p_1, \ldots, p_n\}$, given in increasing order of the x-coordinates
and with $L'$ being x-monotone, satisfying the property that the clusters of stars associated
with $L$ and with $P$ are the same.

Facts: 

1. Cocircuits correspond to semi-spaces, when the defining hyperplane is incident 
with d—1 independent elements of the configuration. 

2. In the rank d uniform oriented matroid associated with a vector configuration in
general position in $\mathcal{R}^d$, all the d-tuples are bases, all the (d+1)-tuples are supports of
circuits, and all the (n-d+1)-tuples are supports of cocircuits.

3. The oriented matroid associated with an affine vector configuration $V$ and the affine
oriented matroid associated with the affine point configuration induced by $V$ are the
same.

4. The chirotope function $\chi$ associated with the cluster of stars of a vector or point
configuration is an alternating and antisymmetric function.

5. The two given systems of chirotope axioms are equivalent (for the uniform case). 

6. The hyperline sequences of a contraction by one element of a central plane arrange- 
ment are the induced rank (d—1) contraction by that element of the set of rank d 
hyperline sequences of the original arrangement. 

7. The induced rank (d—1) contraction of a set of rank d hyperline sequences is a rank 
(d—1) set of hyperline sequences. 

8. A minor of an oriented matroid (given by its hyperline sequences) is an oriented
matroid.

9. For two labeled vector (or point) configurations $V_1$ and $V_2$, the following statements
are equivalent:

• $V_1$ and $V_2$ have the same chirotope;

• $V_1$ and $V_2$ have the same order type;

• $V_1$ and $V_2$ have the same hyperline sequences;

• $V_1$ and $V_2$ have the same minors.

Moreover, for any reorientation of $V_1$ or $V_2$:

• $V_1$ and $V_2$ have the same oriented matroid given by circuits;

• $V_1$ and $V_2$ have the same oriented matroid given by cocircuits.

This justifies the unique name oriented matroid for the equivalence class of vector
configurations with the same chirotope (or clusters, order type, etc.), and for a reorientation
class of an oriented matroid.

10. For a labeled vector configuration $V$ in $\mathcal{R}^{d+1}$, the following statements are
equivalent:

• $V$ is acyclic (affine);

• there is a labeled point configuration $P$ in $\mathcal{R}^d$ whose oriented matroid (or order
type or chirotope) is the same as the oriented matroid of $V$.

11. Multiplying the elements of a vector configuration by a positive scalar yields a
vector configuration with the same oriented matroid. In particular, for any vector
configuration $V$ in $\mathcal{R}^{d+1}$, there exists an equivalent vector configuration on the d-sphere.

12. For any vector configuration, there exists a reorientation of some of its vectors
which makes it affine.

13. The set of circuits of an affine (acyclic) vector configuration does not contain the
positive cycle $(E_n, \emptyset)$.

14. Two projective line arrangements with the same arrangement graph have polar-dual
configurations of points in the same reorientation class (and this can be generalized
to arbitrary dimension d).

15. If a labeled vector configuration in $\mathcal{R}^3$ and a labeled oriented arrangement of central
planes are polar-dual, then they have the same hyperline sequences (and this can be
generalized to arbitrary dimension d).

16. The number of oriented matroids of rank d and size n is $2^{\Theta(n^{d-1})}$.

17. The number of realizable oriented matroids of rank d and size n is $2^{\Theta(n \log n)}$.

18. Folkman-Lawrence topological representation theorem: Every oriented matroid of
rank d can be represented as a (d-1)-pseudosphere arrangement on $S^d$.

19. Every affine oriented matroid of rank 3 can be represented as a pseudoline arrange- 
ment and as a (polar-dual) pseudoconfiguration of points. 

20. There exist nonrealizable oriented matroids of any rank. 

21. Realizable oriented matroids cannot be characterized by a finite set of excluded 
minors. 


Examples: 

1. The function $\chi(i, j, k) = \mathrm{sign}\,\det(v_i, v_j, v_k)$, for $v_i \in V = \{v_1, \ldots, v_n\} \subset \mathcal{R}^3$, is
alternating and antisymmetric.

2. The following figure shows an example of a point configuration in general position,
together with the connecting lines.

Its oriented matroid is given by:

• chirotope: 123—, 124—, 125+, 134+, 135+, 234+, 235+, 345+. The other 

signed triplets are computed by antisymmetry and alternation. 

• minimal Radon partitions: 123 4, 123 5, 1245, 134 5, 2345 and their comple- 

ments. 

• cocircuits: 345, 245, 235, 234, 145, 135, 134, 125, 124, 123 and their 

complements. 

• cluster of stars: 1:34 2 5, 2:15 34, 3:145 2, 4: 5 2 1 3, and 5: 1 2 34. (An arbi- 

trary half-period was chosen for the circular sequences.) 

3. The oriented matroid given as a cluster of stars for the line arrangement in the 
figure of §13.1.2 Example 1 (or for the pseudoline arrangement in the figure of §13.1.3 
Example 1) is: 

1:2354, 2: T3 5 4, 3:1254, 4:5l23, 5:4l23. 

The orientation of the lines is assumed to be in increasing order of the x-coordinates. Its 
minor by the deletion of element 3 is: 1: 2 5 4, 2: 1 5 4, 4: 5 1 2, and 5: 4 1 2. Its contraction 
on point 3 is 12 5 4. 
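
The claim of Example 1 can be spot-checked on concrete vectors. The sketch below is illustrative only: the function names `chi` and `neg` and the three sample vectors are assumptions, not taken from the handbook. It evaluates the sign of the determinant and shows the two properties: swapping two arguments or negating one vector flips the sign, while a cyclic (even) permutation preserves it.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def chi(u, v, w):
    """sign det(u, v, w) for three vectors in R^3."""
    d = (u[0] * (v[1] * w[2] - v[2] * w[1])
         - u[1] * (v[0] * w[2] - v[2] * w[0])
         + u[2] * (v[0] * w[1] - v[1] * w[0]))
    return sign(d)


def neg(u):
    """The negated vector -u."""
    return tuple(-x for x in u)


# Three sample vectors in general position (hypothetical data).
a, b, c = (1, 0, 0), (1, 1, 0), (2, 1, 3)

print(chi(a, b, c))       # reference sign
print(chi(b, a, c))       # odd permutation: opposite sign (alternating)
print(chi(b, c, a))       # even permutation: same sign (alternating)
print(chi(neg(a), b, c))  # one negated vector: opposite sign (antisymmetric)
```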


13.2 SPACEFILLING 


13.2.1 PACKING 

The central notion in the theory of packing is the density of a packing. 


Definitions: 


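A convex body in d-dimensional Euclidean space $\mathcal{R}^d$ is a compact convex subset of $\mathcal{R}^d$
with nonempty interior.

A family $\mathcal{P} = \{K_1, K_2, \ldots\}$ of the convex bodies $K_1, K_2, \ldots$ in $\mathcal{R}^d$ forms a packing
of $\mathcal{R}^d$ if no two of the convex bodies $K_1, K_2, \ldots$ have an interior point in common.

Let $\mathcal{P} = \{K_1, K_2, \ldots\}$ be a packing of $\mathcal{R}^d$ and $B_R$ the closed ball of radius $R$ centered
at the origin in $\mathcal{R}^d$. The lower density and upper density of $\mathcal{P}$ are defined by

$$\underline{\delta}(\mathcal{P}) = \liminf_{R \to +\infty} \frac{\sum_{K_i \subset B_R} \mathrm{Vol}(K_i)}{\mathrm{Vol}(B_R)} \quad \text{and} \quad \overline{\delta}(\mathcal{P}) = \limsup_{R \to +\infty} \frac{\sum_{K_i \subset B_R} \mathrm{Vol}(K_i)}{\mathrm{Vol}(B_R)}.$$

If $\underline{\delta}(\mathcal{P}) = \overline{\delta}(\mathcal{P}) = \delta(\mathcal{P})$, then $\delta(\mathcal{P})$ is called the density of $\mathcal{P}$.

For a convex body $K \subset \mathcal{R}^d$, let $\delta(K)$ denote the largest (upper) density of packings by
congruent copies of $K$ in $\mathcal{R}^d$. In particular:

$\delta_T(K)$ = the largest (upper) density of packings by translates of $K$ in $\mathcal{R}^d$;

$\delta_L(K)$ = the largest (upper) density of packings by lattice translates of $K$ in $\mathcal{R}^d$.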

Let $\mathcal{P} = \{B_1^d, B_2^d, \ldots\}$ be a packing of unit balls in $\mathcal{R}^d$ with centers $c_1, c_2, \ldots$. The
Voronoi cell with center $c_i$ is the convex polyhedral set $V_i = \{x \in \mathcal{R}^d \mid |x - c_i| =
\min_j |x - c_j|\}$.
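
The defining condition of a Voronoi cell translates directly into a membership test. The following sketch is illustrative: the function name `voronoi_cells_containing` and the sample centers are assumptions. It returns the (0-based) indices of all centers at minimum distance from a query point, so a point on a common boundary belongs to several cells. Squared distances are compared so that integer or rational coordinates are handled exactly.

```python
def voronoi_cells_containing(x, centers):
    """Indices i (0-based) with |x - c_i| = min_j |x - c_j|.

    x and every center are coordinate tuples of the same dimension.
    """
    squared = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    smallest = min(squared)
    return [i for i, s in enumerate(squared) if s == smallest]


# Three centers of pairwise nonoverlapping unit disks (hypothetical data).
centers = [(0, 0), (2, 0), (0, 2)]

print(voronoi_cells_containing((0, 0), centers))  # interior of a single cell
print(voronoi_cells_containing((1, 0), centers))  # on the boundary of two cells
print(voronoi_cells_containing((1, 1), centers))  # equidistant from all three
```

With floating-point coordinates the equality test would need a tolerance; exact inputs avoid that issue in this sketch.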
Facts:

1. There are two major problems concerning density: 

• Given a convex body $K \subset \mathcal{R}^d$, find efficient packings with congruent copies of $K$;
i.e., find packings with congruent copies of $K$ in $\mathcal{R}^d$ having “relatively high”
density.

• Find a “good” upper bound for $\delta(K)$.

2. Sphere packing in n-dimensional space, for n large, is important in designing codes
that are efficient and unlikely to contain errors when data is transmitted.

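3. If $K$ is a convex body in $\mathcal{R}^d$, then

$$\frac{2\zeta(d)}{\binom{2d}{d}} \le \delta_L(K) \le \delta_T(K) \le \delta(K),$$

where $\zeta(d) = 1 + \frac{1}{2^d} + \frac{1}{3^d} + \cdots$ denotes Riemann’s zeta function.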


4. If $K$ is a centrally symmetric convex body in $\mathcal{R}^d$, then $\frac{\zeta(d)}{2^{d-1}} \le \delta_L(K) \le \delta_T(K) \le \delta(K)$.

5. Facts 3 and 4 have been improved slightly for different classes of convex bodies
and subclasses of centrally symmetric convex bodies. (See [DaRo47], [ElOdRu91], and
[Sc63].)

6. For each d, $\frac{(d-1)\zeta(d)}{2^{d-1}} \le \delta_L(B^d) \le \delta_T(B^d) = \delta(B^d)$.

7. For every convex domain $D$, $\delta_L(D) = \delta_T(D)$.

8. For every centrally symmetric convex domain $D$, $\delta_L(D) = \delta_T(D) = \delta(D)$.

9. There exists an ellipsoid $C$ in $\mathcal{R}^3$ for which $\delta_L(C) < \delta(C)$.

10. The class of convex bodies $C \subset \mathcal{R}^d$ for which $\delta_L(C)$, $\delta_T(C)$, and $\delta(C)$ can be
determined (for a given d) is very small.


11. It is possible to extend some of the above theorems to spherical as well as hyperbolic
spaces. (In short, a space of constant curvature means the corresponding Euclidean
space or spherical space or hyperbolic space.) For example, see Fact 16.

12. Let $\mathcal{P} = \{B_1^d, B_2^d, \ldots\}$ be a packing of $\mathcal{R}^d$ by unit balls with centers $c_1, c_2, \ldots$. For
a regular simplex of edge length 2 in $\mathcal{R}^d$ with a unit ball drawn around each vertex,
let $\sigma_d$ be the ratio of the volume of the portion of the simplex covered by balls to the
volume of the whole simplex. For each Voronoi cell $V_i$ ($i = 1, 2, \ldots$) with center $c_i$, let
$\widehat{V}_i = \{x \in V_i \mid |x - c_i| \le \sqrt{2d/(d+1)}\}$. Then $\mathrm{Vol}(B_i^d)/\mathrm{Vol}(\widehat{V}_i) \le \sigma_d$ for all $i = 1, 2, \ldots$,
and hence $\delta(\mathcal{P}) \le \sigma_d$.

Note: This result is sharp for d = 1 and 2, but has been improved for sufficiently large d
and for d = 3 in Facts 13 and 14.

13. $\delta(B^d) \le 2^{-(0.599 + o(1))d}$ as $d \to \infty$.

14. $\delta(B^3) \le 0.7731$.

15. The densest lattice packing of unit balls is determined up to dimension 8. The
following table lists the optimal lattices. (See [CoSl93].)

dimension                  1      2      3      4      5      6      7      8
densest lattice packing    $Z$    $A_2$  $A_3$  $D_4$  $D_5$  $E_6$  $E_7$  $E_8$

16. Given a set of n ($n \ge 3$) nonoverlapping circles of radius r in a plane of constant
curvature, the density of the circles with respect to the outer parallel domain of the
convex hull of their centers at distance r is less than $\pi/\sqrt{12} = 0.90689\ldots$.

17. The densest packing of circles of radius r in the plane is a hexagonal arrangement,
with each circle tangent to its six neighbors.

18. No packing of spheres in $\mathcal{R}^3$ can fill more than 78 percent of the possible volume.

19. A face-centering packing of spheres in $\mathcal{R}^3$ fills slightly more than 74 percent of the
total possible volume. (A face-centering packing of spheres consists of first arranging a 
layer of spheres so each is tangent to six neighbors. A second layer is added by placing 
spheres in the depressions that occur in any triangle formed by three adjacent spheres 
in the first layer. This pattern continues, giving a configuration much like a pyramid of 
oranges seen in a supermarket produce display.) 

20. For numerous other important concepts and results in the field, see [FeKu93]. 
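
The numerical constants in Facts 16, 17, and 19 can be reproduced from the density of a lattice packing, which is the volume of one ball divided by the volume of a fundamental cell of the lattice. The sketch below is illustrative: the function names, the choice of lattice bases, and the assumption that the supplied radius is half the minimal distance between lattice points are all assumptions of this example, not statements from the handbook.

```python
import math


def ball_volume(d, r):
    """Volume of a d-dimensional ball of radius r."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1) * r ** d


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [[float(x) for x in row] for row in matrix]
    n, result = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def lattice_packing_density(basis, radius):
    """Density of balls of the given radius centered at the lattice points.

    Assumes radius is at most half the shortest lattice vector length.
    """
    return ball_volume(len(basis), radius) / abs(det(basis))


s = math.sqrt(2)
hexagonal = [[2, 0], [1, math.sqrt(3)]]               # unit circles, six neighbors each
face_centered = [[s, s, 0], [s, 0, s], [0, s, s]]     # unit balls, face-centered cubic

print(lattice_packing_density(hexagonal, 1.0))        # compare with pi / sqrt(12)
print(lattice_packing_density(face_centered, 1.0))    # compare with pi / sqrt(18)
```

Both bases are scaled so that nearest lattice points are at distance 2, which is what makes radius 1 admissible.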

Open Questions and Conjectures: 

1. The major outstanding question is to estimate $\delta_L(B^d)$ and $\delta_T(B^d) = \delta(B^d)$ for the
d-dimensional unit ball $B^d$ in $\mathcal{R}^d$.

2. Kepler’s conjecture: $\delta(B^3) \le \pi/\sqrt{18} = 0.74048\ldots$.

3. The dodecahedral conjecture: The volume of any Voronoi cell in a packing of unit
balls in $E^3$ is at least as large as the volume of a regular dodecahedron with inradius 1.

4. It is widely believed, but not proved, that $\delta_L(B^3) = \delta(B^3)$ and $\delta_L(B^d) < \delta(B^d)$ for
sufficiently large d.


13.2.2 COVERING 


Definitions: 

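A family $\mathcal{C} = \{K_i \mid i \in I\}$ of convex bodies in $\mathcal{R}^d$ forms a covering of $\mathcal{R}^d$ (that is,
covers $\mathcal{R}^d$) if each point of $\mathcal{R}^d$ belongs to at least one convex body of $\mathcal{C}$.

The lower density and upper density of a covering $\mathcal{C}$ are

$$\underline{\nu}(\mathcal{C}) = \liminf_{R \to +\infty} \frac{\sum_{K_i \cap B_R \ne \emptyset} \mathrm{Vol}(K_i)}{\mathrm{Vol}(B_R)} \quad \text{and} \quad \overline{\nu}(\mathcal{C}) = \limsup_{R \to +\infty} \frac{\sum_{K_i \cap B_R \ne \emptyset} \mathrm{Vol}(K_i)}{\mathrm{Vol}(B_R)},$$

where $B_R$ denotes the closed ball of radius $R$ centered at the origin in $\mathcal{R}^d$.

If $\underline{\nu}(\mathcal{C}) = \overline{\nu}(\mathcal{C}) = \nu(\mathcal{C})$, then $\nu(\mathcal{C})$ is called the density of $\mathcal{C}$.

For a convex body $K \subset \mathcal{R}^d$, let

$\nu(K)$ = the smallest (lower) density of coverings of $\mathcal{R}^d$ by congruent copies of $K$;

$\nu_T(K)$ = the smallest (lower) density of coverings of $\mathcal{R}^d$ by translates of $K$;

$\nu_L(K)$ = the smallest (lower) density of coverings of $\mathcal{R}^d$ by lattice translates of $K$.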

Facts: 

1. There are two major problems concerning covering: 

• Given a convex body $K \subset \mathcal{R}^d$, find efficient coverings of $\mathcal{R}^d$ with congruent copies
of $K$; that is, find coverings of $\mathcal{R}^d$ by congruent copies of $K$ having “relatively
small” density.

• Find a “good” lower bound for $\nu(K)$. (This is a highly nontrivial task for most
of the convex bodies $K \subset \mathcal{R}^d$.)


© 2000 by CRC Press LLC 



2. If $K$ is a convex body in $\mathcal{R}^d$, then

$$\nu(K) \le \nu_T(K) \le d \ln d + d \ln \ln d + 4d$$

and

$$\nu_T(K) \le \nu_L(K) \le d^{\log_2 \log_2 d + c}$$

for some constant c.

3. If $B^d$ denotes the d-dimensional closed unit ball in $\mathcal{R}^d$, then

$$\nu(B^d) = \nu_T(B^d) \le \nu_L(B^d) \le c\,d\,(\ln d)^{\frac{1}{2} \log_2 (2\pi e)}$$

for some constant c.

4. Take a regular simplex inscribed in a unit ball in $\mathcal{R}^d$ and draw unit balls around
each vertex. Let $\tau_d$ be the ratio of the sum of the volumes of the portions of these balls
lying in the regular simplex to the volume of the regular simplex. Then $\tau_d \le \nu(B^d)$.

5. $\tau_d \sim \frac{d}{e\sqrt{e}}$. (Thus, Facts 2 and 3 give strong estimates for $\nu(B^d)$. Moreover, for d = 1
and 2, the lower bound $\tau_d$ is sharp.)

6. The thinnest lattice covering of $\mathcal{R}^d$ by unit balls has been determined up to
dimension 5 only. The following table lists the optimal lattices. (See [CoSl93].)

dimension                    1      2        3        4        5
thinnest lattice covering    $Z$    $A_2$    $A_3^*$  $A_4^*$  $A_5^*$
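
For a lattice covering of the plane by equal circles, the density is the area of one circle divided by the area of a fundamental cell, where the circle radius must be at least the covering radius of the lattice (the largest distance from any point of the plane to the nearest lattice point). The sketch below compares the square lattice with the hexagonal lattice $A_2$. The function name is hypothetical, and the covering radii ($\sqrt{2}/2$ for the unit square lattice, $1/\sqrt{3}$ for the unit hexagonal lattice) are supplied as inputs rather than computed, which is an assumption of this example.

```python
import math


def lattice_covering_density_2d(a, b, covering_radius):
    """Density of the covering by circles of the given radius centered at
    the points of the planar lattice spanned by a and b.

    The caller must supply a radius at least as large as the covering
    radius of the lattice; this function does not verify that.
    """
    cell_area = abs(a[0] * b[1] - a[1] * b[0])
    return math.pi * covering_radius ** 2 / cell_area


# Unit square lattice: deepest holes are the centers of the unit squares.
square = lattice_covering_density_2d((1, 0), (0, 1), math.sqrt(2) / 2)

# Unit hexagonal lattice: deepest holes are the centers of the triangles.
hexagonal = lattice_covering_density_2d(
    (1, 0), (0.5, math.sqrt(3) / 2), 1 / math.sqrt(3))

print(square)     # pi / 2
print(hexagonal)  # 2 * pi / (3 * sqrt(3)), thinner than the square lattice
```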



13.2.3 TILING 

Only a “diagonal” view of the basic definitions and theorems of this area will be given. 
See [GrSh86] for additional material. 

Definitions: 

A tiling $\mathcal{T}$ of Euclidean d-space $\mathcal{R}^d$ is a countable family of closed topological d-cells
of $\mathcal{R}^d$, the tiles of $\mathcal{T}$, which cover $\mathcal{R}^d$ without gaps and overlaps.

A monohedral tiling is a tiling $\mathcal{T}$ of $\mathcal{R}^d$ in which all tiles are congruent to one fixed
tile $T$, the (metrical) prototile of $\mathcal{T}$. In this case, $T$ admits the tiling $\mathcal{T}$.

A regular polygon is a polygon with all sides congruent and all interior angles equal.

A regular tiling is a monohedral tiling of the plane ($\mathcal{R}^2$) with a regular polygon as
prototile.

A semiregular tiling is a tiling of the plane using n prototiles with the same numbers 
of polygons around each vertex. 

A semiregular polyhedron is a convex polyhedron with each face a regular polygon, 
but where more than one regular polygon can be used as a face. 

A tiling $\mathcal{T}$ of $\mathcal{R}^d$ by convex polytopes is normal if there exist positive real numbers r
and R such that each tile contains a Euclidean ball of radius r and is contained in a
Euclidean ball of radius R.

A face-to-face tiling is a tiling $\mathcal{T}$ by convex d-polytopes such that the intersection of
any two tiles is a face of each tile, possibly the (improper) empty face. When d = 2,
such a tiling is an edge-to-edge tiling.

A lattice tiling, with lattice $L$, is a tiling $\mathcal{T}$ by translates of a single tile $T$ such that
the corresponding translation vectors form a d-dimensional lattice $L$ in $\mathcal{R}^d$.

A Euclidean motion $\sigma$ of $\mathcal{R}^d$ is a symmetry of a tiling $\mathcal{T}$ if $\sigma$ maps (each tile of) $\mathcal{T}$
onto (a tile of) $\mathcal{T}$. The set of all symmetries of $\mathcal{T}$ (a group under composition) is the
symmetry group $S(\mathcal{T})$ of $\mathcal{T}$.

A periodic tiling is a tiling $\mathcal{T}$ of $\mathcal{R}^d$ such that $S(\mathcal{T})$ contains translations in d linearly
independent directions. A tiling $\mathcal{T}$ is nonperiodic if $S(\mathcal{T})$ contains no translation
other than the identity.

An isohedral tiling is a tiling $\mathcal{T}$ such that $S(\mathcal{T})$ acts transitively on the tiles of $\mathcal{T}$.
An isogonal tiling is a tiling $\mathcal{T}$ such that $S(\mathcal{T})$ acts transitively on the vertices of $\mathcal{T}$.
An isotoxal tiling is a tiling $\mathcal{T}$ such that $S(\mathcal{T})$ acts transitively on the edges of $\mathcal{T}$.

Let $\mathcal{T}$ and $\mathcal{T}'$ be tilings of $\mathcal{R}^d$ with symmetry groups $S(\mathcal{T})$ and $S(\mathcal{T}')$. Let $\Phi: \mathcal{R}^d \to \mathcal{R}^d$
be a homeomorphism that maps $\mathcal{T}$ onto $\mathcal{T}'$. $\Phi$ is compatible with a symmetry $\sigma$ of $\mathcal{T}$
if there exists a symmetry $\sigma'$ of $\mathcal{T}'$ such that $\sigma' \Phi = \Phi \sigma$. $\Phi$ is compatible with $S(\mathcal{T})$
if $\Phi$ is compatible with each $\sigma$ in $S(\mathcal{T})$. The tilings $\mathcal{T}$ and $\mathcal{T}'$ of $\mathcal{R}^d$ are homeomeric,
or of the same homeomeric type, if there exists a homeomorphism $\Phi: \mathcal{R}^d \to \mathcal{R}^d$ that
maps $\mathcal{T}$ onto $\mathcal{T}'$ such that $\Phi$ is compatible with $S(\mathcal{T})$ and $\Phi^{-1}$ is compatible with $S(\mathcal{T}')$.

A prototile $T$ in $\mathcal{R}^d$ is aperiodic if $T$ admits a tiling of $\mathcal{R}^d$, yet all such tilings are
nonperiodic. In general, a set $\mathcal{S}$ of prototiles in $\mathcal{R}^d$ is said to be aperiodic if $\mathcal{S}$ admits
a tiling of $\mathcal{R}^d$, yet all such tilings are nonperiodic.

Facts: 

1. There are three monohedral edge-to-edge tilings of $\mathcal{R}^2$ with regular polygons; the
prototile must be a triangle, a square, or a hexagon. See the following figure.



2. There are eight semiregular tilings of $\mathcal{R}^2$. These tilings use two or three prototiles.

3. Shapes that are not regular polygons [polyhedra] can be used in monohedral tilings
of $\mathcal{R}^2$ [$\mathcal{R}^3$].

4. Any triangle can be used in a monohedral tiling of the plane. (Join two to form
a parallelogram and tile a strip using these parallelograms. Repeat this process with
parallel strips to tile the plane.) See the following figure.

5. Any quadrilateral can be used in a monohedral tiling of the plane. See the following
figure. (Take a second copy of the quadrilateral and rotate it 180 degrees. Join the two
to form a hexagon. Use the hexagons to tile the plane.)




6. Any pentagon with a pair of parallel sides can be used in a monohedral tiling of the 
plane. 

7. There are at least fourteen types of convex pentagons that can be used in a monohedral
tiling of the plane. It is not known if there are more.

8. There are three types of convex hexagons that can be used in a monohedral tiling
of the plane. Assume that the hexagon has vertices a, b, c, d, e, f in clockwise order. See
the following figure. The prototile must be of one of the following forms:

• sum of angles at a, b, c is 360°; length of {a, f} = length of {c, d};

• sum of angles at a, b, e is 360°; length of {a, f} = length of {d, e} and length of
{b, c} = length of {e, f};

• angles at a, c, and e are each equal to 120°; length of {a, b} = length of {a, f},
length of {c, b} = length of {c, d}, and length of {e, d} = length of {e, f}.



9. No convex polygon with more than six sides can be used as prototile in a monohedral
tiling of $\mathcal{R}^2$.

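10. Of the five regular polyhedra (tetrahedron, hexahedron (cube), octahedron,
dodecahedron, icosahedron), only the cube can be used as a prototile in a monohedral
tiling of $\mathcal{R}^3$. (Regular tetrahedra do not tile space: their dihedral angle,
$\arccos(1/3) \approx 70.53°$, does not divide 360°.)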

11. If $\mathcal{T}$ is a tiling of $\mathcal{R}^d$ with convex tiles, then each tile in $\mathcal{T}$ is a convex d-polyhedron.

12. If $\mathcal{T}$ is a tiling of $\mathcal{R}^d$ with compact convex tiles, then each tile in $\mathcal{T}$ is a convex
d-polytope.

13. The following classification results have a long history. (See [GrSh86].)

• There exist precisely 11 distinct edge-to-edge isogonal plane tilings, the tiles of
which are convex regular polygons (called Archimedean tilings).

• There exist precisely 81 homeomeric types of normal isohedral plane tilings. Precisely
47 of these can be realized by a normal isohedral edge-to-edge tiling with
convex polygonal tiles.

• There exist precisely 91 homeomeric types of normal isogonal plane tilings. Precisely
63 types can be realized by normal isogonal edge-to-edge tilings with
convex polygonal tiles.

• There exist precisely 26 homeomeric types of normal isotoxal plane tilings. Precisely
6 types can be realized by a normal isotoxal edge-to-edge tiling with
convex polygonal tiles.

14. Let $T$ be a convex d-polytope. If $T$ tiles $\mathcal{R}^d$ by translation, then $T$ admits
(uniquely) a face-to-face lattice tiling of $\mathcal{R}^d$. Such a tile $T$ is called a parallelotope.
This result is not true for nonconvex polytopes.

15. Several aperiodic sets have been found in the plane. Some of them, such as 
the Wang tiles and Penrose tiles, possess several highly interesting properties. (See 
[GrSh86].) 

16. Very recently, considerable progress has been achieved for aperiodic tilings in higher 
dimensions via dynamical systems. (See [Ra95].) 
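
The restriction in Fact 1 to triangles, squares, and hexagons follows from a short angle count, sketched here as an illustration (the function name and the search bound are assumptions). In an edge-to-edge tiling by copies of one regular n-gon, k copies meet at each vertex, so k interior angles of $180(n-2)/n$ degrees must sum to 360 degrees; that is, $k = 2n/(n-2)$ must be an integer.

```python
def regular_polygons_meeting_at_vertex(max_sides=1000):
    """Values of n (3 <= n <= max_sides) for which copies of a regular
    n-gon fit exactly around a point."""
    # k copies fill 360 degrees iff k = 2n / (n - 2) is an integer.
    return [n for n in range(3, max_sides + 1) if (2 * n) % (n - 2) == 0]


print(regular_polygons_meeting_at_vertex())
```

Since $2n/(n-2) = 2 + 4/(n-2)$, the condition holds exactly when $n - 2$ divides 4, so the finite search bound loses nothing.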

Open Questions: 

1. Extend the classification problems to higher dimensions. (At present, this looks 
hopeless.) 

2. Classify all convex d-polytopes which are prototiles of monohedral tilings of $\mathcal{R}^d$.
(This problem is not even solved for the plane.) However, under suitable restrictions
the complexity of the problem changes. (See Fact 4.)

3. For $d \ge 5$, determine whether each d-parallelotope is a Voronoi cell (see §9.2) for
some d-lattice. (This is known to be true for $1 \le d \le 4$.)


13.3 COMBINATORIAL GEOMETRY 

This section studies geometric results involving combinatorics in the areas of convexity, 
incidences, distances, and colorings. In some cases the problems themselves have a 
combinatorial flavor, while in other cases their solution requires combinatorial tools. 


13.3.1 CONVEXITY 

In this subsection, questions of two different kinds are studied. Most of them belong to 
geometric transversal theory, a subject originating in Helly’s theorem. Another group 
of problems grew out of the Erdos-Szekeres theorem, which turned out to be a starting 
point of Ramsey theory. 

Definitions: 

A subset $C$ of d-dimensional Euclidean space (d-space) $\mathcal{R}^d$ is convex if the following is
true: for any pair of points in $C$, the straight-line segment connecting them is entirely
contained in $C$.

A convex set is strictly convex if its boundary contains no line segment.

A convex body is a compact (i.e., bounded and closed) convex set with nonempty
interior.

A polytope is a bounded convex body that can be obtained as the intersection of
finitely many closed half-spaces. (§13.1.1.)

A convex polygon is a polytope in the plane.

A vertex of a polytope $P$ is a point $q \in P$ for which there is a hyperplane (§13.1.1) $H$
such that $H \cap P = \{q\}$.

A point set is in convex position if it is the vertex set of a polytope. 

The convex hull of a set $S \subset \mathcal{R}^d$ is the smallest convex set containing $S$.

A family $\mathcal{C} = \{C_1, C_2, \ldots\}$ of sets in d-space is said to be intersecting if all members
of $\mathcal{C}$ have a point in common.

A set $T \subset \mathcal{R}^d$ is a transversal of a family $\mathcal{C}$ of sets if $T \cap C_i$ is nonempty for every i.
If $\mathcal{C}$ has a k-element transversal ($|T| = k$), its members can be pierced by k points.

Two sequences $P = \{p_1, \ldots, p_n\}$ and $Q = \{q_1, \ldots, q_n\}$ of points in $\mathcal{R}^k$ have the same
order type if, for all $1 \le i_1 < i_2 < \cdots < i_{k+1} \le n$, the orientations of the simplices
induced by $\{p_{i_1}, \ldots, p_{i_{k+1}}\}$ and $\{q_{i_1}, \ldots, q_{i_{k+1}}\}$ are the same. This order type is
nontrivial if $P$ and $Q$ are not contained in any hyperplane of $\mathcal{R}^k$.

A k-flat (an oriented k-dimensional plane) $F$ intersects a sequence of d-dimensional
convex bodies $\mathcal{C} = \{C_1, \ldots, C_n\}$ consistently with the above order type if there are
$x_i \in F \cap C_i$ such that the sequences $X = \{x_1, \ldots, x_n\}$ and $P$ have the same order type.

Facts: 

1. The convex hull of a set S is the intersection of all convex sets containing S. 

2. For any set $S \subseteq \mathbb{R}^d$ of finitely many points, not all of which lie in the same 
hyperplane, the convex hull of S is a polytope. In particular, if S has d+1 points, then its 
convex hull is a simplex whose vertices are the elements of S. 

3. The convex hull of the vertex set of any convex polytope P is identical with P. 

4. Helly's theorem: If a family $\mathcal{C}$ of at least d+1 convex bodies in $\mathbb{R}^d$ has the property 
that every d+1 of its members have a point in common, then $\mathcal{C}$ is intersecting (i.e., all 
its members have a point in common). [He23] 

5. Carathéodory's theorem: If the convex hull of a set $S \subseteq \mathbb{R}^d$ contains a point p, then 
there exists a subset of S with at most d+1 elements whose convex hull contains p. 

6. Let S be a compact set in $\mathbb{R}^d$ with the property that for every (d+1)-element subset 
$T \subseteq S$, there is a point $s \in S$ such that each segment connecting s to an element of T 
lies in S. Then S has a point such that every segment connecting it to an element of S 
is entirely contained in S. [Kr46] 

7. Any set of $(k-1)(d+1)+1$ points in $\mathbb{R}^d$ can be partitioned into k parts whose 
convex hulls have a point in common. [Ra21], [Tv66] 

8. Let $\mathcal{C}$ be any family of convex bodies in $\mathbb{R}^d$ with the property that the volume of 
the intersection of any 2d of them is at least 1. Then the volume of the intersection of 
all members of $\mathcal{C}$ is at least a positive constant depending only on d. [BaKaPa82] 

9. For any $\varepsilon > 0$ and for any d there is a $\delta > 0$ satisfying the following condition: 
if $\mathcal{C}$ is a family of n (> d+1) convex bodies in $\mathbb{R}^d$ having at least $\varepsilon \binom{n}{d+1}$ intersecting 
(d+1)-tuples, then $\mathcal{C}$ has at least $\delta n$ members with a point in common. [Ka84] 

10. For any $d < q \le p$, there exists k = k(p, q, d) satisfying the following condition: 
if $\mathcal{C}$ is a family of convex bodies in $\mathbb{R}^d$ such that every subfamily of $\mathcal{C}$ of size p contains q 
members with a point in common, then $\mathcal{C}$ can be pierced by k points. [AlKl92] 

11. A sequence $\mathcal{C} = \{C_1, \ldots, C_n\}$ of convex bodies in $\mathbb{R}^d$ has a hyperplane transversal 
if and only if for some $0 \le k \le d-1$, there is a nontrivial k-dimensional order type of n 
points such that every (k+2)-member subfamily of $\mathcal{C}$ can be met by a suitable k-flat 
consistently with that order type. [PoWe90] 

12. If $S_k$ is any family of k-dimensional linear subspaces of $\mathbb{R}^d$ with the property that 
any ( fe ^) of them can be intersected by an l-dimensional subspace, then all members 
of $S_k$ can be intersected by an l-dimensional subspace. (Two subspaces intersect each 
other if they have at least one point in common, different from the origin.) 

13. Any set of five points in the plane, no three of which are on a line, has four elements 
in convex position. 

14. Erdős-Szekeres theorem: For every k > 2, there exists a smallest integer n(k) with 
the property that every set of at least n(k) points in the plane, no three of which are 
on a line, contains k points in convex position. If k = 3, 4, 5, 6, then $n(k) = 2^{k-2} + 1$. 
(These are the only known values.) Furthermore, $2^{k-2} + 1 \le n(k) \le \binom{2k-5}{k-2} + 2$. [ErSz35], 
[ToVa98] 
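
Fact 13 is small enough to check by brute force. The sketch below tests convex position with orientation signs: for planar points with no three on a line, a point lies in the convex hull of the others exactly when it lies inside a triangle spanned by three of them (Fact 5 with d = 2). The function names and the sample coordinates are illustrative assumptions, not part of the handbook.

```python
from itertools import combinations

def orient(p, q, r):
    """Return +1, -1, or 0 for a left turn, a right turn, or a collinear triple."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)

def no_three_collinear(pts):
    return all(orient(p, q, r) != 0 for p, q, r in combinations(pts, 3))

def inside_triangle(x, a, b, c):
    """True if x is strictly inside triangle abc (no three points collinear)."""
    return orient(a, b, x) == orient(b, c, x) == orient(c, a, x)

def in_convex_position(pts):
    """True if no point lies inside a triangle spanned by three other points."""
    for x in pts:
        others = [p for p in pts if p != x]
        if any(inside_triangle(x, a, b, c) for a, b, c in combinations(others, 3)):
            return False
    return True

def has_convex_subset(pts, k):
    """True if some k of the points are in convex position."""
    return any(in_convex_position(sub) for sub in combinations(pts, k))

# Illustrative data: a triangle with one interior point (no convex 4-subset),
# and the same set with a fifth point added (a convex 4-subset must appear).
four = [(0, 0), (10, 0), (0, 10), (3, 3)]
five = four + [(7, 1)]
print(has_convex_subset(four, 4), has_convex_subset(five, 4))
```

The four-point set shows that n(4) > 4, while every five-point set in general position passes the test, in line with n(4) = 5.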

Examples: 

1. Let S = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}. The convex hull of $S \subseteq \mathbb{R}^3$ is a triangular 
region, which is not a convex body in 3-space because its interior is empty. 

2. Let S = {(0, 0), (1, 0), (0, 1)}. The convex hull of $S \subseteq \mathbb{R}^2$ is a triangular region, 
which is a convex body (polygon) in the plane. 

3. Let $S = \{(x, y) \mid 0 \le x \le 2,\ 0 \le y \le 2\}$ and $S' = \{(x, y) \mid 0 \le x \le 3,\ 0 \le y \le 3\}$. 
The family of all axis-parallel unit squares lying in S is intersecting because each of 
them contains the point (1, 1). The family of axis-parallel unit squares in S' can be 
pierced by four points: (1, 1), (1, 2), (2, 1), (2, 2). 

4. In the line, {1, 3, 4, 2} and {0, 4, 25, 3} have the same (1-dimensional) order type. 
The 3-dimensional closed unit balls centered at (0, 1, 5), (0, 0, 9.6), (0, 0, 9.4), (1, 0, 7) are 
met by the z-axis consistently with the above order type, because these balls contain 
the points (0, 0, 5), (0, 0, 9), (0, 0, 10), and (0, 0, 7), respectively, and the order type of 
this sequence along the z-axis is the same as the 1-dimensional order type of {1, 3, 4, 2}. 
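
Example 4 can be verified mechanically. In dimension 1 the orientation of a pair of points is just the sign of their difference, so two sequences have the same order type when all pairwise signs agree. The following sketch (helper names are hypothetical) checks both claims of the example.

```python
from itertools import combinations
import math

def sign(v):
    return (v > 0) - (v < 0)

def same_order_type_1d(p, q):
    """Compare the orientation of every pair (i1 < i2) in two sequences of reals."""
    if len(p) != len(q):
        return False
    return all(sign(p[j] - p[i]) == sign(q[j] - q[i])
               for i, j in combinations(range(len(p)), 2))

def meets_consistently(centers, witnesses, pattern, radius=1.0):
    """Check that witness i lies on the z-axis and in the closed ball i, and that
    the z-coordinates of the witnesses have the same order type as `pattern`."""
    on_axis = all(w[0] == 0 and w[1] == 0 for w in witnesses)
    inside = all(math.dist(c, w) <= radius + 1e-12
                 for c, w in zip(centers, witnesses))
    heights = [w[2] for w in witnesses]
    return on_axis and inside and same_order_type_1d(heights, pattern)

centers = [(0, 1, 5), (0, 0, 9.6), (0, 0, 9.4), (1, 0, 7)]
witnesses = [(0, 0, 5), (0, 0, 9), (0, 0, 10), (0, 0, 7)]
print(same_order_type_1d([1, 3, 4, 2], [0, 4, 25, 3]))
print(meets_consistently(centers, witnesses, [1, 3, 4, 2]))
```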


13.3.2 INCIDENCES 

This subsection studies the structure (and number) of incidences between a set of points 
and a set of lines (or planes, spheres, etc.). The starting point of many investigations 
in this field was the Sylvester-Gallai theorem. 

Definitions: 

Given a point set P and a set L of lines (or k-flats, spheres, etc.) in Euclidean 
d-space $\mathbb{R}^d$, a point $p \in P$ and a line $l \in L$ are incident with each other if $p \in l$. 

Given a set L of lines in the plane, a point incident with precisely two elements of L 
is called an ordinary crossing. Given a set of points $P \subseteq \mathbb{R}^d$, a hyperplane passing 
through precisely d elements of P is called an ordinary hyperplane (for d = 2, an 
ordinary line). 

Given a set of points P, a Motzkin hyperplane is a hyperplane h such that all but 
one element of $h \cap P$ lie in a (d-2)-flat. 

A family $\Gamma$ of curves in the plane has d degrees of freedom if there exists an integer s 
such that: 

• no two curves in $\Gamma$ have more than s points in common; 

• for any d points, there are at most s curves in $\Gamma$ passing through all the points. 

A family of pseudolines is a family of simple curves in the plane with the property 
that every two of them meet precisely once. 

A family of pseudocircles is a family of simple closed curves in the plane with the 
property that every two of them meet in at most two points. 


Facts: 

1. Sylvester-Gallai theorem: Every finite set of points in the plane, not all of which 
are on a line, determines an ordinary line. In dual version: every finite set of straight 
lines in the plane, not all of which pass through the same point, determines an ordinary 
crossing. 

2. For every finite set of points in Euclidean d-space, not all of which lie on a 
hyperplane, there exists a Motzkin hyperplane. [Ha65], [Ha80] 

3. Every set of n points in d-space, not all of which lie on a hyperplane, determines at 
least n distinct hyperplanes. 

4. In 3-space, every set of n non-coplanar points determines at least n? Motzkin hy- 
perplanes. 

5. If n is sufficiently large, then every set of n non-cocircular points in the plane 
determines at least $\binom{n-1}{2}$ distinct circles, and this bound is best possible. [El67] 

6. Every set of n (> 7) noncollinear points in the plane determines at least 6n/13 ordinary 
lines. This bound is sharp for n = 13 and false for n = 7. [CsSa93] 

7. There is a positive constant c such that every set of n points in the plane, not all 
on a line, has an element incident with at least cn connecting lines. Moreover, any set 
of n points in the plane, no more than n-k of which are on the same line, determines 
at least c'kn distinct connecting lines, for a suitable constant c' > 0. According to the 
d = 2 special case of Fact 3, due to de Bruijn and Erdős, for k = 1 the number of distinct 
connecting lines is at least n. For k = 2, the corresponding bound is 2n-4 (for n > 10). 
[Be83], [SzTr83] 

8. Every set of n noncollinear points in the plane determines at least $2\lfloor n/2 \rfloor$ lines 
of different slopes. Furthermore, every set of n points in the plane, not all on a line, 
permits a spanning tree, all of whose n-1 edges have different slopes. [Un82], [Ja87] 

9. The number of incidences between a set P of points and a set L of lines can be 
obtained by summing over all $l \in L$ the number of points of P lying on l, or, 
equivalently, by summing over all $p \in P$ the number of lines in L passing through p. 

10. Let $\Gamma$ be a family of curves in the plane with d degrees of freedom. Then the 
maximum number of incidences between n points in the plane and m elements of $\Gamma$ is 

$O(n^{d/(2d-1)} m^{(2d-2)/(2d-1)} + n + m)$. [PaSh98] 

From the most important special case, when $\Gamma$ is the family of all straight lines (d = 2), 
it follows that for any set P of n points in the plane, the number of distinct straight lines 
containing at least k elements of P is $O(n^2/k^3 + n/k)$ [SzTr83]. This bound is asymptotically 
tight. The same result holds for pseudolines. 

11. The maximum number of incidences between n points and m spheres in $\mathbb{R}^3$ is 

0(ni fi(n, m) + n 2 ), 

where $\beta(n, m) = o(\log(nm))$ is an extremely slowly growing function. 

If no three spheres contain the same circle, then the following better bound is 
obtained: 

0(nimi + n + in). 

Neither of these estimates is known to be asymptotically tight. [ClEtal90] 

12. The maximum number of collinear triples determined by n points in the plane, no 
four of which are on a line, is at least $n^2/6 - O(n)$. This bound is asymptotically tight. 
[BuGrSl79] 

13. If M(n) denotes the minimum number of different midpoints of the $\binom{n}{2}$ line 
segments determined by n points in convex position in the plane, then 

$\binom{n}{2} - \left\lfloor \frac{n(n+1)(1 - e^{-1/2})}{4} \right\rfloor \le M(n) \le \binom{n}{2} - \left\lfloor \frac{n^2 - 2n + 12}{20} \right\rfloor$. 

[ErFiFü91] 

Examples: 

1. Let P be a set of 7 points in the plane, consisting of the vertices, the centroid (the 
point of intersection of the medians), and the midpoints of all sides of an equilateral 
triangle. Then P determines 3 ordinary lines (the lines connecting the midpoints of two 
sides). 

2. Let P be a 4k-element set in the plane that can be obtained from the vertex set 
$\{v_1, v_2, \ldots, v_{2k}\}$ of a regular 2k-gon by adding the intersection of the line at infinity 
with every line $v_i v_j$. Then the set P determines precisely 2k ordinary lines: every 
line connecting some $v_i$ to the intersection point of $v_{i-1} v_{i+1}$ and the line at infinity 
($1 \le i \le 2k$, the indices are taken modulo 2k). (It can be achieved by a suitable 
projective transformation that no point of P is at infinity, and the number of ordinary 
lines remains $|P|/2 = 2k$.) 

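3. Let P be a set of $n \ge 6$ points lying on two noncoplanar lines in 3-space so that 
there are at least three points on each line. Not all points of P are coplanar, but P does 
not determine any ordinary plane: every plane spanned by points of P contains one of 
the two lines, and hence at least four points of P. 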

4. The family of all straight lines in the plane and the family of all unit circles both 
have 2 degrees of freedom. The family of all circles with arbitrary radii has 3 degrees of 
freedom. The family of the graphs of all polynomials of one variable and degree d has 
d+1 degrees of freedom. 

5. Let P be an $n^{1/2} \times n^{1/2}$ part of the integer grid; i.e., 

$P = \{(i, j) \mid 1 \le i \le n^{1/2},\ 1 \le j \le n^{1/2}\}$. 

Let $k = \lfloor (cm/n^{1/2})^{1/3} \rfloor \ge 2$, where c > 0 is a sufficiently small constant. For every 
$1 \le s < r \le k$ and for every $1 \le i \le r$, $1 \le j \le n^{1/2}/2$, consider the line passing through (i, j) 
and (i + r, j + s). If c is sufficiently small, then the number of these lines is at most m. 
There is a constant c' > 0 such that the total number of incidences between these lines 
and the elements of P is at least $c' n^{2/3} m^{2/3}$. (See the case d = 2 of Fact 10.) 
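
Example 1 and the double counting of Fact 9 can be checked with exact integer arithmetic. Since collinearity is preserved by affine maps, the sketch below replaces the equilateral triangle by a right triangle with integer coordinates; this substitution, like the helper names, is an assumption made for illustration only.

```python
from collections import defaultdict
from itertools import combinations
from math import gcd

def line_through(p, q):
    """Canonical integer coefficients (a, b, c) of the line a*x + b*y + c = 0."""
    a, b = q[1] - p[1], p[0] - q[0]
    c = -(a * p[0] + b * p[1])
    g = gcd(gcd(abs(a), abs(b)), abs(c))
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):      # fix the sign so each line has one key
        a, b, c = -a, -b, -c
    return (a, b, c)

def connecting_lines(points):
    """Map every connecting line to the set of given points lying on it."""
    lines = defaultdict(set)
    for p, q in combinations(points, 2):
        lines[line_through(p, q)].update((p, q))
    return lines

def ordinary_lines(lines):
    return [key for key, pts in lines.items() if len(pts) == 2]

def incidences_by_lines(lines):
    return sum(len(pts) for pts in lines.values())

def incidences_by_points(points, lines):
    return sum(sum(1 for pts in lines.values() if p in pts) for p in points)

# Vertices, side midpoints, and centroid of the triangle (0,0), (6,0), (0,6).
P = [(0, 0), (6, 0), (0, 6), (3, 0), (0, 3), (3, 3), (2, 2)]
L = connecting_lines(P)
print(len(L), len(ordinary_lines(L)))
print(incidences_by_lines(L), incidences_by_points(P, L))
```

The seven points span nine lines: three sides and three medians with three points each, and three ordinary lines joining pairs of midpoints.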


13.3.3 DISTANCES 


The systematic study of the distribution of the $\binom{n}{2}$ distances determined by n points 
was initiated by Erdős. Given a set of n points $P = \{p_1, p_2, \ldots, p_n\}$, let g(P) denote 
the number of distinct distances determined by P, and let f(P) denote the number of 
times that the unit distance occurs between two elements of P. That is, f(P) is the 
number of pairs $p_i p_j$, i < j, such that $|p_i - p_j| = 1$. In [Er46], Erdős raised the following 
general questions: What is the minimum of g(P) and what is the maximum of f(P) 
over all n-element subsets of Euclidean d-space or of any other fixed metric space? 

Definitions: 

For any point set P in a metric space, the unit distance graph of P is the graph G(P) 
whose vertex set is P and two points (vertices) are connected by an edge if and only if 
their distance is 1. 

Let P be a finite set of points in a metric space. If the distance between two points 
$p, q \in P$ is minimum, then p and q form a closest pair. 

A point $q \in P$ is a nearest neighbor of $p \in P$ if no other point of P is closer to p than q. 

A set P in a metric space is a separated set if the minimum distance between the 
points of P is at least 1. 

The diameter of a finite set of points in a metric space is the maximum distance 
between two points of the set. 

A point $q \in P$ is a farthest neighbor of $p \in P$ if no point of P is farther from p 
than q. 

A set of points in the plane is said to be in general position if no three are on a line 
and no four on a circle. 

Facts: 

1. f(P) is equal to the number of edges of G(P). 

2. If p and q form a closest pair in P, then q is a nearest neighbor of p and p is a 
nearest neighbor of q. 

3. If the distance between p and q is equal to the diameter of P, then q is a farthest 
neighbor of p and p is a farthest neighbor of q. 

4. The maximum number of times that the unit distance can occur among n points 
in the plane is $O(n^{4/3})$. Conjecture: the asymptotically best bound is $O(n^{1+c/\log\log n})$. 
[SpSzTr84] 

5. The maximum number of times that the unit distance can occur in a separated set 
of $n \ge 3$ points is $\lfloor 3n - \sqrt{12n - 3} \rfloor$. [Ha74] 

6. The maximum number of times that the unit distance can occur in a set of n points 
in the plane with unit diameter is n. [HoPa34] 

7. For any set of n > 3 points in the plane, the total number of farthest neighbors of 
all elements is at most 3n — 3 if n is even, and at most 3n — 4 if n is odd. These bounds 
cannot be improved. [EdSk89] 

8. The maximum number of times that the unit distance can occur among n points in 
convex position in the plane is $O(n \log n)$. For n > 15, the best known lower bound is 
2n-7. [Fü90], [EdHa91] 

9. The minimum number of distinct distances determined by n points in the plane 
is $\Omega(n^{4/5})$. It is conjectured that the best bound is $\Omega(n/\sqrt{\log n})$. [Sz97] 

10. The minimum number of distinct distances determined by $n \ge 3$ points in convex 
position in the plane is $\lfloor n/2 \rfloor$. [Al63] 

11. The minimum number of distinct distances determined by $n \ge 3$ points in the 
plane, no three of which are on a line, is at least $\lceil (n-1)/3 \rceil$. Conjecture: the best possible 
bound is $\lfloor n/2 \rfloor$. 

12. The minimum number of distinct distances determined by n points in general 
position in the plane is $O(n^{1+c/\sqrt{\log n}})$, for some positive constant c. However, it is not 
known whether this function is superlinear in n. [ErEtal93] 


13. There are arbitrarily large noncollinear finite point sets in the plane such that all 
distances determined by them are integers, but there exists no infinite set with this 
property. 

14. In an n-element planar point set, the maximum number of noncollinear triples 
that determine the same angle is $O(n^2 \log n)$, and this bound is asymptotically tight. 
[PaSh90] 

15. Let $f_3(n)$ denote the maximum number of times that the unit distance can occur 
among n points in $\mathbb{R}^3$. Then 

$\Omega(n^{4/3} \log\log n) \le f_3(n) \le n^{3/2} \beta(n)$, 

where $\beta(n) = o(\log\log n)$ is an extremely slowly growing function. [ClEtal90] 


16. The maximum number of times that the unit distance can occur in a set of $n \ge 4$ 
points in $\mathbb{R}^3$ with unit diameter is 2n-2. [Gr56] 

17. If n is sufficiently large, then for any set of n points in $\mathbb{R}^3$, the total number of 
farthest neighbors of all elements is at most $\frac{n^2}{4} + \frac{3n}{2} + 3$ if n is even, at most $\frac{n^2}{4} + \frac{3n}{2} + \frac{9}{4}$ 
if $n \equiv 1 \pmod 4$, and at most $\frac{n^2}{4} + \frac{3n}{2} + \frac{13}{4}$ if $n \equiv 3 \pmod 4$. These bounds cannot be 
improved. [Cs96] 

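18. Let $f_d(n)$ denote the maximum number of times that the unit distance can occur 
among n points in $\mathbb{R}^d$. If $d \ge 4$ is even, then 

$f_d(n) = \frac{n^2}{2}\left(1 - \frac{1}{\lfloor d/2 \rfloor}\right) + n - O(d)$. 

If $d \ge 5$ is odd, then 

$f_d(n) = \frac{n^2}{2}\left(1 - \frac{1}{\lfloor d/2 \rfloor}\right) + \Theta(n^{4/3})$. [Er60], [ErPa90] 

19. Let $\Phi_d(n)$ denote the maximum of the total number of farthest neighbors of all 
points over all n-element sets in $\mathbb{R}^d$. For every $d \ge 4$, 

$\Phi_d(n) = n^2\left(1 - \frac{1}{\lfloor d/2 \rfloor} + o(1)\right)$. [ErPa90] 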


Examples: 

1. Let P be the vertex set of a regular n-gon ($n \ge 3$) in the plane. Then g(P), the 
number of distinct distances determined by P, is equal to $\lfloor n/2 \rfloor$. The number of times 
that the diameter of P is realized is equal to n if n is odd, and n/2 if n is even. 

2. Take a regular hexagon of side length k and partition it into $6k^2$ equilateral triangles 
with unit sides. Let P denote the union of the vertex sets of these triangles. Then P is 
a separated set, and $|P| = n = 3k^2 + 3k + 1$. The number of times that the minimum 
(unit) distance occurs between two elements of P is $9k^2 + 3k = 3n - \sqrt{12n - 3}$. 

3. Let P denote an $n^{1/2} \times n^{1/2}$ part of the integer grid; i.e., let 
$P = \{(x, y) \mid 1 \le x, y \le n^{1/2}\}$. It follows from classical number theoretic results that there exists an 
integer k (jg < k < that can be written as the sum of two squares in 2 n 1 o « lo s ” 
different ways, for a constant c > 0. Thus, for every ( x,y ) G P, the number of points 
(x',y') € P satisfying (x — x 1 ) 2 + (y — y') 2 = k is at least 2 n >°e 1o « n . In other words, 
the distance k 2 occurs n 1+lo « lo s" times among the elements. By proper scaling, an 
n-element point set P is obtained in which the unit distance occurs n +lo « lo « n times. 
That is, f(P') = n l+ 1o « lo « " . It can also be shown that the number of distinct distances 
determined by P' satisfies g{P') = g(P) = f° r a suitable positive constant c'. 

4. Lenz' construction: Let $C_1, \ldots, C_{\lfloor d/2 \rfloor}$ be circles of radius $1/\sqrt{2}$ centered at the origin 
of $\mathbb{R}^d$, and assume that the supporting planes of these circles are mutually orthogonal. 
Choose $n_i$ points on $C_i$, where $n_i = \lfloor n/\lfloor d/2 \rfloor \rfloor$ or $n_i = \lceil n/\lfloor d/2 \rfloor \rceil$, so that $\sum_i n_i = n$. It is 
clear that any pair of points belonging to different circles $C_i$ are at unit distance from 
each other. Hence, this point system determines at least 

$\frac{n^2}{2}\left(1 - \frac{1}{\lfloor d/2 \rfloor}\right) + n - O(d)$ 

unit distances. 

5. Let $p_1, p_2, p_3, p_4$ be the vertices of a regular tetrahedron with side length 1 in $\mathbb{R}^3$. 
The locus of points in 3-space lying at unit distance from both $p_1$ and $p_2$ is a circle 
passing through $p_3$ and $p_4$. Choose distinct points $p_5, p_6, \ldots, p_n$ on the shorter arc of 
this circle between $p_3$ and $p_4$. An n-element point set in $\mathbb{R}^3$ is obtained with diameter 1 
and in which the diameter occurs 2n-2 times. 
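
Examples 1 and 2 lend themselves to a quick numerical check. The sketch below represents the triangular lattice in integer coordinates (a, b), standing for the point a(1, 0) + b(1/2, √3/2), so that squared distances are exact; the regular n-gon is handled in floating point with a tolerance. All function names and the tolerance are illustrative assumptions.

```python
import math
from itertools import combinations

def hexagon_lattice(k):
    """Lattice coordinates of all triangle vertices in a regular hexagon of side k."""
    return [(a, b) for a in range(-k, k + 1) for b in range(-k, k + 1)
            if abs(a + b) <= k]

def squared_distance(p, q):
    """Exact squared Euclidean distance between two lattice points."""
    da, db = p[0] - q[0], p[1] - q[1]
    return da * da + da * db + db * db

def unit_distance_count(points):
    return sum(1 for p, q in combinations(points, 2)
               if squared_distance(p, q) == 1)

def separated_bound(n):
    """floor(3n - sqrt(12n - 3)), evaluated with integer arithmetic only."""
    m = 12 * n - 3
    ceil_sqrt = math.isqrt(m - 1) + 1     # smallest integer >= sqrt(m)
    return 3 * n - ceil_sqrt

def regular_polygon_distances(n, tol=1e-9):
    """Return (number of distinct distances, number of pairs at the diameter)."""
    pts = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
           for i in range(n)]
    dists = sorted(math.dist(p, q) for p, q in combinations(pts, 2))
    distinct = 1 + sum(1 for x, y in zip(dists, dists[1:]) if y - x > tol)
    diameter_pairs = sum(1 for d in dists if dists[-1] - d <= tol)
    return distinct, diameter_pairs

for k in (1, 2, 3):
    pts = hexagon_lattice(k)
    print(k, len(pts), unit_distance_count(pts), separated_bound(len(pts)))
print(regular_polygon_distances(7), regular_polygon_distances(8))
```

For the hexagonal point sets the count of unit distances coincides with the bound of Fact 5, which is how the example shows that the bound is attained.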


13.3.4 COLORING 

One of the oldest problems in graph theory is the Four Color Problem (§8.6.4). This 
problem has attracted much interest among professional and amateur mathematicians, 
and inspired a lot of research about colorings, including Ramsey theory [GrRoSp90] 
and the study of chromatic numbers, polynomials, etc. In this section, some coloring 
problems are discussed in a geometric setting. 

Definitions: 

A coloring of a set with k colors is a partition of the set into k parts. Two points that 
belong to the same part are said to have the same color. 

The chromatic number of a graph G is the minimum number of colors, $\chi(G)$, needed 
to color the vertices of G so that no two adjacent vertices have the same color. 

The chromatic number of a metric space is the chromatic number of the unit 
distance graph of the space; that is, the minimum number of colors needed to color all 
points of the space so that no two points of the same color are at unit distance. 

The polychromatic number of a metric space is the minimum number of colors, $\chi$, 
needed to color all points of the space so that for each color class $C_i$ ($1 \le i \le \chi$) there is 
a distance $d_i$ with the property that no two points of this color are at distance $d_i$ from 
each other. 

A point set P in $\mathbb{R}^d$ is k-Ramsey if for any coloring of $\mathbb{R}^d$ with k colors, at least one 
of the color classes has a subset congruent to P. If for every k, there exists d(k) such 
that P is k-Ramsey in $\mathbb{R}^{d(k)}$, then P is called Ramsey. 

A point set P' is called a homothetic copy (or a homothet) of P, if P and P' are 
similar to each other and they are in parallel position. 

Facts: 

1. The minimum number of colors needed for coloring the plane so that no two points 
at unit distance receive the same color is at least 4 and at most 7. That is, the chromatic 
number of the plane is between 4 and 7. [JeTo95] 

2. The following table contains the best known upper and lower bounds on the 
chromatic numbers of various metric spaces. ($S^{d-1}(r)$ denotes the sphere of radius r in 
d-space, where the distance between two points is the length of the chord connecting 
them.) 

space                                  lower bound            upper bound 
line                                   2                      2 
plane                                  4                      7 
rational points of plane               2                      2 
3-space                                5                      21 
rational points of $\mathbb{R}^3$      2                      2 
$S^2(r)$, l < r <                      3                      4 
$S^2(r)$, %^<r< ^                      3                      5 
$S^2(r)$, r>^ 3                        4                      7 
                                       4                      4 
rational points of $\mathbb{R}^4$      4                      4 
rational points of $\mathbb{R}^5$      6                      $< \infty$ 
$\mathbb{R}^d$                         $(1 + o(1))(1.2)^d$    $(3 + o(1))^d$ 
$S^{d-1}(r)$, $r > 1/2$                d                      $< \infty$ 

3. The polychromatic number of the plane is at least 4 and at most 6. [So94] 

4. For any finite d-dimensional point configuration P and for any coloring of d-space 
with finitely many colors, at least one of the color classes will contain a homothetic copy 
of P. The corresponding statement is false if “homothetic copy of P” is replaced by 
“translate of P”. 

5. A necessary condition for a finite set P to be Ramsey is that it be spherical; i.e., all 
its points lie on a sphere. [GrRoSp90] 

6. The following conditions are sufficient for a finite set P to be Ramsey: 

• P is the vertex set of a right parallelepiped; 

• P is the set of points in d-space with exactly k ($k \le d$) nonzero coordinates having 
values $x_1, \ldots, x_k$ in this order, where $x_1, \ldots, x_k$ is an arbitrary sequence of 
nonzero reals; 

• P is the vertex set of a regular n-gon; 

• P is a subset of a Ramsey set; 

• P is the cartesian product of two Ramsey sets. [FrRo86], [FrRo90] 

7. It follows from the first two and the last two conditions of Fact 6 that all “triangles” 
are Ramsey. Moreover, given any nondegenerate point configuration (“simplex”) S, 
there is a constant c(S) > 1 such that for every $k \le c(S)^d$, S is k-Ramsey in d-space. 

Examples: 

1. Let G be a graph on the vertex set $\{v_1, \ldots, v_7\}$, whose edges are $v_1v_2$, $v_1v_3$, $v_1v_4$, 
$v_1v_5$, $v_2v_3$, $v_2v_6$, $v_3v_6$, $v_4v_5$, $v_4v_7$, $v_5v_7$, and $v_6v_7$. The chromatic number of G is 4. 

2. The graph G of Example 1 can be embedded in the plane so that if two of its 
vertices are connected by an edge, then the corresponding points in the plane are at 
unit distance. In other words, G is a subgraph of the unit distance graph of the plane. 
(In every such embedding, the points corresponding to $\{v_1, v_2, v_3, v_6\}$ and $\{v_1, v_4, v_5, v_7\}$ 
form two rhombi of side length 1 that share a vertex.) Hence, the chromatic number of 
the plane is at least 4. 

3. Let P be a 2-element point set in Euclidean space. For every positive integer k, P 
is k-Ramsey in k-space. (To see this, consider a regular simplex in k-space, whose side 
length is equal to the distance between the elements of P. Any coloring of $\mathbb{R}^k$ with k colors 
induces a coloring of the k+1 vertices of this simplex, and, by the pigeonhole principle, one 
can always find two vertices that get the same color. They form a 2-element set congruent to P. 
Thus, P is Ramsey.) 
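
Examples 1 and 2 can be confirmed by a short computation. The sketch below finds the chromatic number of the graph G of Example 1 by exhaustive search and then builds one unit-distance embedding from two rhombi that share the vertex v_1; the second rhombus is the first one rotated about v_1 by the angle that places v_7 at distance 1 from v_6. The coordinates and helper names are illustrative choices, not taken from the handbook.

```python
import math
from itertools import product

VERTICES = [1, 2, 3, 4, 5, 6, 7]
EDGES = [(1, 2), (1, 3), (1, 4), (1, 5), (2, 3), (2, 6),
         (3, 6), (4, 5), (4, 7), (5, 7), (6, 7)]

def chromatic_number(vertices, edges):
    """Smallest k admitting a proper k-coloring, found by exhaustive search."""
    if not vertices:
        return 0
    for k in range(1, len(vertices) + 1):
        for colors in product(range(k), repeat=len(vertices)):
            col = dict(zip(vertices, colors))
            if all(col[u] != col[v] for u, v in edges):
                return k

def rotate(p, phi):
    c, s = math.cos(phi), math.sin(phi)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def spindle_embedding():
    """One planar embedding of G in which every edge has length 1."""
    v1 = (0.0, 0.0)
    v2 = (1.0, 0.0)
    v3 = (0.5, math.sqrt(3) / 2)
    v6 = (v2[0] + v3[0], v2[1] + v3[1])           # |v1 v6| = sqrt(3)
    phi = 2 * math.asin(1 / (2 * math.sqrt(3)))   # rotation making |v6 v7| = 1
    v4, v5, v7 = rotate(v2, phi), rotate(v3, phi), rotate(v6, phi)
    return {1: v1, 2: v2, 3: v3, 4: v4, 5: v5, 6: v6, 7: v7}

pos = spindle_embedding()
print(chromatic_number(VERTICES, EDGES))
print(max(abs(math.dist(pos[u], pos[v]) - 1) for u, v in EDGES))
```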


13.4 POLYHEDRA 


This section presents basic properties of polyhedra, commonly known as (planar) solids. 
Any application such as geometric modeling that models the three-dimensional world 
of objects must deal with polyhedra. Basic geometric and combinatorial properties of 
polyhedra as well as their convex decompositions and triangulations are discussed. 


13.4.1 GEOMETRIC PROPERTIES OF POLYHEDRA 

Definitions: 

A (d-1)-dimensional plane is the solution set of the linear equation 
$a_1 x_1 + a_2 x_2 + \cdots + a_d x_d = a_{d+1}$, where $a_1, a_2, \ldots, a_{d+1}$ are constants and $x_1, \ldots, x_d$ are d variables. 

A hyperplane in d-dimensional Euclidean space $\mathbb{R}^d$ is the set of all points on a 
(d-1)-dimensional plane. 

A closed halfspace in $\mathbb{R}^d$ is the set of all points on the hyperplane together with the 
points on one side of the same hyperplane. 

A convex d-polyhedron is the intersection of a finite number of closed halfspaces 
in $\mathbb{R}^d$. 

A nonconvex polyhedron is the union of a set of convex polyhedra such that the 
underlying space is connected and nonconvex. 

A k-face , part of the boundary of a polyhedron, lies on at least d — k hyperplanes 
forming the boundary. In particular, 0-faces, 1-faces and (d— l)-faces of a d-polyhedron 
are vertices, edges, and facets, respectively. 

A polytope ( d-polytope ) is a convex d-polyhedron that is contained in the interior 
of some d-dimensional cube; that is, a bounded convex d-polyhedron. 

A d-polytope is regular if all its facets are regular (d-1)-polytopes that are 
combinatorially equivalent. A vertex is a regular 0-polytope. 

Two polytopes P and Q are dual polytopes if there exists a one-to-one correspondence 
$\delta$ between the set of faces of P and Q such that two faces $f_1, f_2 \in P$ satisfy 
$f_1 \subseteq f_2$ if and only if $\delta(f_1) \supseteq \delta(f_2)$ in Q. 

A manifold (d-manifold) is a topological space that is locally homeomorphic to $\mathbb{R}^d$ 
everywhere. 

A manifold d-polyhedron is a polyhedron whose boundary is topologically the same 
as a (d-1)-manifold. That is, every point on the boundary of a manifold d-polyhedron 
has a small neighborhood on the boundary that looks like $\mathbb{R}^{d-1}$. 

A non-manifold d-polyhedron is a d-polyhedron whose boundary is not a manifold. 

A manifold 3-polyhedron has genus g if its boundary is a 2-manifold with genus g. A 
2-manifold surface has genus g if every set of g + 1 circular cuts separates the surface, 
but not all sets of g circular cuts do. 

Edges in a 3-polyhedron are reflex edges if the inner angle subtended by two faces 
meeting at that edge is greater than 180°. 

Facts: 

1. Every polytope has a dual polytope. 

2. Every polytope is the convex hull of its vertices. 

3. A k-face is an open set of dimension k. 

4. Curvature: The curvature $\kappa_v$ of a manifold 3-polyhedron at a vertex v is 

$\kappa_v = 1 - \frac{\sum_i \theta_i}{2\pi}$, 

where $\theta_i$ is the angle between two consecutive edges incident with v. Intuitively, 
curvature at a vertex measures its “sharpness”. 

5. Gauss-Bonnet theorem: $\sum_v \kappa_v = 2 - 2g$. 

6. Angle sums: Let f be a face of a polytope P and p an interior point of f. The angle 
at f is measured as the fraction of a sufficiently small (d-1)-dimensional sphere 
centered at p that lies inside P. If $\alpha_k$ is the sum of angles at all k-dimensional faces, then 

$\sum_{k=0}^{d-1} (-1)^k \alpha_k = (-1)^{d-1}$. (Gram's formula) 

[Gr67] 

7. A 3-polyhedron is convex if and only if it does not have any reflex edges. 

Examples: 

1. Tetrahedra and cubes are manifold 3-polyhedra with genus 0. See the following 
figure. 



2. Two cubes meeting at a single edge, or two tetrahedra meeting at a single vertex, 
form non-manifold 3-polyhedra. See the following figure. 



3. A cube has genus 0, but a cube with a cubical through-hole is a manifold 3- 
polyhedron with genus 1. See the following figure. 



4. A cube and an octahedron (bipyramid) are dual to each other; a tetrahedron is dual 
to itself. 

5. There are five regular polytopes in three dimensions: tetrahedron, cube (hexahe- 
dron), octahedron, dodecahedron, icosahedron. They are also called Platonic solids. 
See the following figure. 



6. There is a circular cut for a toroidal surface that does not separate it, though any 
two circular cuts always separate it. 
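
The Gauss-Bonnet theorem can be checked numerically on the polyhedra of Examples 1 and 3. The sketch below assumes the normalization in which the curvature at a vertex is 1 minus the sum of the face angles at that vertex divided by 2π, so that the curvatures of a genus-g polyhedron add up to 2 - 2g. Face angles are computed from coordinates for the cube and the regular tetrahedron; for the cube with a through-hole they are entered by hand (three right angles at each outer corner, and angles of 270°, 90°, 90° at each corner of the hole). All names and coordinates are illustrative.

```python
import math

def angle(u, w):
    """Angle between two nonzero vectors, in radians."""
    dot = sum(a * b for a, b in zip(u, w))
    nu = math.sqrt(sum(a * a for a in u))
    nw = math.sqrt(sum(a * a for a in w))
    return math.acos(max(-1.0, min(1.0, dot / (nu * nw))))

def vertex_curvatures(vertices, faces):
    """Curvature 1 - (angle sum at v) / (2*pi) for every vertex.

    Each face is a cycle of vertex indices and is assumed to be convex."""
    total = [0.0] * len(vertices)
    for face in faces:
        m = len(face)
        for i, v in enumerate(face):
            here = vertices[v]
            prev_pt = vertices[face[i - 1]]
            next_pt = vertices[face[(i + 1) % m]]
            u = tuple(a - b for a, b in zip(prev_pt, here))
            w = tuple(a - b for a, b in zip(next_pt, here))
            total[v] += angle(u, w)
    return [1 - t / (2 * math.pi) for t in total]

def curvature_from_angles(angles):
    return 1 - sum(angles) / (2 * math.pi)

# Vertex index of the cube corner (x, y, z) is 4x + 2y + z.
CUBE_VERTICES = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
CUBE_FACES = [(0, 1, 3, 2), (4, 5, 7, 6), (0, 1, 5, 4),
              (2, 3, 7, 6), (0, 2, 6, 4), (1, 3, 7, 5)]
TETRA_VERTICES = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
TETRA_FACES = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
HOLED_CUBE_ANGLES = ([[math.pi / 2] * 3] * 8
                     + [[3 * math.pi / 2, math.pi / 2, math.pi / 2]] * 8)

print(sum(vertex_curvatures(CUBE_VERTICES, CUBE_FACES)))
print(sum(vertex_curvatures(TETRA_VERTICES, TETRA_FACES)))
print(sum(curvature_from_angles(a) for a in HOLED_CUBE_ANGLES))
```

The first two sums are 2 (genus 0) up to rounding, and the third is 0 (genus 1): the eight corners of the hole each contribute curvature -1/4, cancelling the eight outer corners.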


13.4.2 TRIANGULATIONS 


A complex domain is decomposed into simple parts for computational simplicity in many 
applications. For example, in finite element methods often a domain is triangulated into 
simplices. 

Definitions: 

A simplex ( d-dimensional simplex or d-simplex ) is a d-polytope with d+ 1 vertices. 

A triangulation of a d-polyhedron is a convex decomposition where each convex piece 
of the decomposition is a d-simplex. 

A polyhedron is triangulated with Steiner points if the vertex set of simplices in 
the triangulation is strictly a superset of the set of vertices of the polyhedron. This 
type of triangulation uses extra points (other than the vertices of the polyhedron) as 
vertices. 

A triangulation of a polyhedron is a simplicial complex if for every two simplices $\sigma_1, \sigma_2$ 
in the triangulation, $\sigma_1 \cap \sigma_2$ is either empty or a face of both simplices. 

A convex decomposition of a polyhedron is its partition into convex pieces that have 
disjoint interiors. 

The aspect ratio of a simplex is the ratio of the radius of the circumscribing sphere 
to the radius of the inscribing sphere of the simplex. 

The aspect ratio of a triangulation is the largest aspect ratio of a simplex in the 
triangulation. 
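
The aspect ratio defined above can be computed directly from vertex coordinates. The following sketch does this for a triangle in the plane and for a tetrahedron in 3-space; it uses the standard formulas R = abc/(4·area) and r = 2·area/(a+b+c) for a triangle, and r = 3·volume/(surface area) for a tetrahedron, with the circumcenter obtained from a 3×3 linear system. The function names are hypothetical, and degenerate (flat) inputs are not handled.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def triangle_aspect_ratio(p, q, r):
    """Circumradius / inradius of a nondegenerate triangle in the plane."""
    a, b, c = math.dist(q, r), math.dist(p, r), math.dist(p, q)
    area = abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2
    circumradius = a * b * c / (4 * area)
    inradius = 2 * area / (a + b + c)
    return circumradius / inradius

def tetrahedron_aspect_ratio(p0, p1, p2, p3):
    """Circumradius / inradius of a nondegenerate tetrahedron in 3-space."""
    e1, e2, e3 = sub(p1, p0), sub(p2, p0), sub(p3, p0)
    det = dot(e1, cross(e2, e3))              # six times the signed volume
    # The circumcenter p0 + x satisfies e_i . x = |e_i|^2 / 2 for i = 1, 2, 3.
    r1, r2, r3 = dot(e1, e1) / 2, dot(e2, e2) / 2, dot(e3, e3) / 2
    c23, c31, c12 = cross(e2, e3), cross(e3, e1), cross(e1, e2)
    x = tuple((r1 * a + r2 * b + r3 * c) / det
              for a, b, c in zip(c23, c31, c12))
    circumradius = norm(x)
    surface = (norm(c12) + norm(c23) + norm(c31)
               + norm(cross(sub(e2, e1), sub(e3, e1)))) / 2
    inradius = abs(det) / (2 * surface)       # 3 * volume / surface area
    return circumradius / inradius

regular = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]
sliver = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.3, 0.3, 0.01)]
print(tetrahedron_aspect_ratio(*regular), tetrahedron_aspect_ratio(*sliver))
```

The regular tetrahedron has ratio 3, while the nearly flat tetrahedron has a very large ratio, which is the sense in which it is "bad" for meshing.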

Facts: 

1. Every d-polytope can be triangulated without Steiner points. 

2. Every d-polytope with n faces can be triangulated into $O(n)$ simplices in $O(n)$ time 
and space. 

3. There are nonconvex 3-polyhedra that cannot be triangulated without Steiner points. 

4. The problem of deciding if a nonconvex 3-polyhedron can be triangulated without 
Steiner points or not is NP-complete. [RuSe92] 

5. The problem of decomposing a polyhedron into the minimum number of convex 
pieces is NP-hard. 

6. Every polyhedron can be decomposed into disjoint convex pieces by repeatedly slic- 
ing the polyhedron through reflex edges. [BaDe92] and [Ch84] 

7. There is a class of polyhedra with n edges, of which r are reflex, that require 
$\Omega(n + r^2)$ convex pieces for their decomposition. These polyhedra have two sets 
of parallel edges which are created as reflex edges. Two such sets are placed on two 
hyperbolic paraboloids with an angle of almost 90° between them. [Ch84] 

8. Every manifold 3-polyhedron can be triangulated into $O(n + r^2)$ tetrahedra in 
$O((n + r^2) \log r)$ time. [ChPa90] 

9. There exists a polynomial time algorithm that produces a triangulation of any 
3-polyhedron with an aspect ratio and size that are within a constant factor of the optimal. 
[MiVa92] 

Examples: 

1. A triangle is a 2-simplex; a tetrahedron is a 3-simplex. 

2. Part (a) of the following figure shows a tetrahedron with a bad aspect ratio; part (b) 
shows a tetrahedron with a good aspect ratio. 



3. The Schönhardt polyhedron is a nonconvex 3-polyhedron that cannot be triangulated 
without Steiner points. This polyhedron can be constructed out of a prism whose 
base and top facets are equilateral triangles. Twist the top triangle, keeping the base 
fixed. This destroys the planarity of vertical facets. To maintain the planarity, 
triangulate these facets appropriately. [RuSe92] 


13.4.3 FACE NUMBERS 

In many cases the complexity of algorithms dealing with polyhedra depends on the number 
of their faces. Therefore, combinatorial bounds on these numbers play a significant role 
in analyzing these algorithms. 

Definitions: 

A cyclic d-polytope is the convex hull of a set of n ($n \ge d+1$) points on the moment 
curve in $\mathbb{R}^d$, $x(t) = (t, t^2, \ldots, t^d)$. 

A face vector of a d-polyhedron P is the d-dimensional vector $(f_0, f_1, \ldots, f_{d-1})$, where 
$f_i = f_i(P)$ is the number of i-dimensional faces of P. 

A simplicial polytope is a polytope in which all faces are simplices. 

Facts: 

1. For 2k ≤ d, every k vertices of a cyclic polytope define a (k−1)-face.

2. Euler’s relation: For any d-polytope, $\sum_{i=0}^{d-1} (-1)^i f_i = 1 - (-1)^d$.

3. For a manifold 3-polyhedron with genus g, $\sum_{i=0}^{2} (-1)^i f_i = 2 - 2g$.

4. The edges on the boundary of a manifold 3-polyhedron with genus 0 form a planar
graph. By the property of planarity, the numbers of vertices, edges, and facets of such
a polyhedron are within a constant factor of each other.

5. Dehn-Sommerville equations: The face vectors of simplicial polytopes satisfy the
following equations for −1 ≤ k ≤ d − 2 with $f_{-1} = 1$:

$E_k^d: \sum_{j=k}^{d-1} (-1)^j \binom{j+1}{k+1} f_j = (-1)^{d-1} f_k$

In particular, $E_{-1}^d$ is Euler’s relation.

6. Upper bound theorem: For any d-polytope P with n vertices, $f_i(P) = O(n^{\lfloor d/2 \rfloor})$ for
1 ≤ i ≤ d − 1.

7. Optimality of cyclic polytopes: Cyclic polytopes achieve the upper bound since
they have $\binom{n}{k} = \Omega(n^k)$ (k−1)-faces for 2k ≤ d. This implies that they have
$\Theta(n^{\lfloor d/2 \rfloor})$ faces of dimension $\lfloor d/2 \rfloor - 1$.


Examples: 

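1. The 3-dimensional cube (d = 3) has $f_0 = 8$, $f_1 = 12$, and $f_2 = 6$; thus, by Fact 2,
$f_0 - f_1 + f_2 = 2$.

2. A 3-dimensional cube with a cubical through-hole (g = 1) has $f_0 = 16$, $f_1 = 32$, and
$f_2 = 16$ when the two ring-shaped faces around the hole are each split into four
quadrilaterals, so that every facet is simply connected; thus, by Fact 3, $f_0 - f_1 + f_2 = 0$.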
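The alternating sums in Facts 2 and 3 are easy to check mechanically. The following is a minimal sketch; the function names are illustrative and not from the Handbook, and the genus-1 face vector (16, 32, 16) assumes that every facet of the cube with a through-hole has been subdivided so that it is simply connected.

```python
def euler_sum(face_vector):
    """Alternating sum f_0 - f_1 + f_2 - ... of a face vector (f_0, ..., f_{d-1})."""
    return sum((-1) ** i * f for i, f in enumerate(face_vector))


def euler_rhs(d):
    """Right-hand side of Euler's relation for a d-polytope: 1 - (-1)^d."""
    return 1 - (-1) ** d


cube = (8, 12, 6)              # 3-dimensional cube
simplex_4 = (5, 10, 10, 5)     # 4-simplex: C(5, i + 1) faces of dimension i
holed_cube = (16, 32, 16)      # genus 1, all facets simply connected (assumption)

print(euler_sum(cube), euler_rhs(3))            # both sides of Fact 2 for d = 3
print(euler_sum(simplex_4), euler_rhs(4))       # both sides of Fact 2 for d = 4
print(euler_sum(holed_cube), 2 - 2 * 1)         # both sides of Fact 3 for g = 1
```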


13.5 ALGORITHMS AND COMPLEXITY IN COMPUTATIONAL 
GEOMETRY 


Computational geometry studies efficient algorithms for solving geometric problems and 
has applications in computer graphics, robotics, VLSI design, computer-aided design, 
pattern recognition, statistics, and other fields. The study of computational geometry 
uses concepts and results from classical geometry, topology, combinatorics, as well as 
standard techniques from the design and analysis of computer algorithms. See [PrSh85] and
[GoO’R97].


13.5.1 CONVEX HULLS 

Finding efficient algorithms for the construction of convex hulls has been a central topic 
in computational geometry. Several efficient algorithms for constructing boundaries of 
convex hulls of sets of points in the plane have been developed. 

Definition: 

The convex hull of a set of points in $\mathcal{R}^d$ is the smallest convex set containing the
points.

Algorithms: 

1 . Finding boundaries of convex hulls by rotational sweeping: 

• GrahamScan: Given a set S of n points in the plane, Algorithm 1 scans the points
rotationally around a fixed point and eliminates those that are not hull vertices.
The remaining points are the vertices of the boundary of the convex hull of S. The
running time of GrahamScan is O(n log n), which is dominated by the sorting of the
points. The remaining steps take only linear time.

• Jarvis’ March: Given a set S of n points in the plane, Jarvis’ March algorithm
constructs the boundary of the convex hull by “marching around” the outer perimeter
of S. This method is also called “gift-wrapping”. Jarvis’ March runs in time O(hn),
where h is the number of vertices of the convex hull, which in the worst case is n.

Algorithm 1: Graham scan.

input: a finite set S of points in the plane

output: the vertices of the boundary of the convex hull of S

p_0 := the point in S with the minimum y-coordinate

sort the remaining points by polar angle around p_0; append the point p_0 to the
end of the sorted list; let the resulting list be (p_1, p_2, ..., p_n), where p_n = p_0

H[1] := p_n; H[2] := p_1; j := 2

for i := 2 to n

    while the path {H[j−1], H[j], p_i} does not form a left turn

        j := j − 1
    j := j + 1
    H[j] := p_i

{H[1], H[2], ..., H[j] is the boundary of the convex hull.}
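The scan of Algorithm 1 can be sketched in Python as follows, with a gift-wrapping routine in the spirit of Jarvis’ March as a cross-check. This is an illustrative sketch, not the Handbook’s code: the function names are assumptions, points are taken to be tuples of exact numbers (integers), ties in the lowest y-coordinate are broken by x, and points collinear with a hull edge are dropped.

```python
from functools import cmp_to_key


def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def dist2(a, b):
    """Squared Euclidean distance between points a and b."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def graham_scan(points):
    """Hull vertices in counterclockwise order, starting at the lowest point."""
    pts = sorted(set(points), key=lambda p: (p[1], p[0]))
    if len(pts) < 3:
        return pts
    p0 = pts[0]  # minimum y-coordinate (leftmost on ties)

    def by_polar_angle(a, b):
        turn = cross(p0, a, b)
        if turn != 0:
            return -1 if turn > 0 else 1
        return dist2(p0, a) - dist2(p0, b)  # same direction: nearer point first

    hull = [p0]
    for p in sorted(pts[1:], key=cmp_to_key(by_polar_angle)):
        # Pop while the last two hull points and p do not form a left turn.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def jarvis_march(points):
    """Gift wrapping: repeatedly step to the point with no other point to its right."""
    pts = sorted(set(points), key=lambda p: (p[1], p[0]))
    if len(pts) < 3:
        return pts
    hull, current = [], pts[0]
    while True:
        hull.append(current)
        candidate = pts[0] if pts[0] != current else pts[1]
        for p in pts:
            if p == current:
                continue
            turn = cross(current, candidate, p)
            farther = dist2(current, p) > dist2(current, candidate)
            if turn < 0 or (turn == 0 and farther):
                candidate = p
        current = candidate
        if current == pts[0]:
            return hull


sample = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0), (2, 1)]
print(graham_scan(sample))
print(jarvis_march(sample))
```

The comparator sorts by cross product rather than by a computed angle, which keeps the ordering exact for integer input; with floating-point coordinates a tolerance would be needed.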


2. Divide-and-conquer algorithms: 

• QuickHull: This algorithm recursively constructs a chain on the boundary of 
the convex hull, connecting two hull vertices u and v. It first finds a hull vertex w 
on the chain (for example, w is the farthest point from the line uv). Then the 
subchains connecting u and w, w and v, respectively, are constructed recursively 
and are concatenated. [PrSh85] 

QuickHull is fast in practice, but in the worst case the running time of QuickHull
is $O(n^2)$.

• MergeHull: This algorithm first partitions the set S of points into two subsets $S_1$
and $S_2$ of equal size and then recursively constructs the boundaries of the convex
hulls $CH(S_1)$ and $CH(S_2)$. Finally, $CH(S_1)$ and $CH(S_2)$ are “merged” into the
convex hull of the set S. [PrSh85]

The boundary of the convex hull for S is the same as the boundary of the convex
hull for the hull vertices of $CH(S_1)$ and $CH(S_2)$. Thus, to construct the boundary
of CH(S), first sort the hull vertices of $CH(S_1)$ and $CH(S_2)$ (this sorting can be
done in linear time), then apply the linear scan of GrahamScan to construct CH(S).
Therefore, the boundary of the convex hull CH(S) can be constructed from $CH(S_1)$
and $CH(S_2)$ in linear time. The running time of MergeHull is O(n log n).

3. Other methods: 

• Incremental method: The incremental method for constructing the boundary of
the convex hull of a set of points in the plane adds one point at a time to an already
constructed boundary of a convex hull. This method has time complexity O(n log n)
[PrSh85]. An advantage of this method is that it can be generalized to construct
boundaries of convex hulls in higher dimensions [Ed87].

• An algorithm by Kirkpatrick and Seidel based on the prune-and-search method:
This algorithm partitions a given set of points in the plane into two linearly separable
subsets of equal size, finds the two edges of the boundary of the convex hull that
“bridge” these two subsets, and recursively constructs the subchains on the boundary
of the convex hull between these two bridges. This method has time complexity
O(n log h), where h is the number of vertices on the boundary of the convex hull
[Ya90].

Facts:

1. The problem of finding convex hulls is at least as hard as sorting. The lower
bound Ω(n log n) of sorting on comparison decision trees also applies to the convex hull
problem. This lower bound can be extended to a more general computation
model, the bounded-degree algebraic decision trees.

2. An O(n log n) time algorithm for constructing the boundary of the convex hull of
a set of points in $\mathcal{R}^3$ has been developed, which is a generalization of the MergeHull
algorithm.

3. For dimension d > 3, the convex hull of n points in $\mathcal{R}^d$ can have up to
$\Theta(n^{\lfloor d/2 \rfloor})$ faces.

An algorithm based on the incremental method has been proposed to construct the
convex hull for a set of n points in $\mathcal{R}^d$ [Ya90]. An optimal algorithm
of time $O(n^{\lfloor d/2 \rfloor})$ has been developed recently by Chazelle. (See the bibliography of
[Mu94] for a reference.)

Examples: 

1. The following figure shows a set S of 7 points in the plane and the convex hull of S.

2. The convex hull of the set {(0,0,0), (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1),
(1,1,0), (1,1,1)} in $\mathcal{R}^3$ is the surface of the unit cube together with its interior.


13.5.2 TRIANGULATION ALGORITHMS 

Triangulation plays an important role in many applications. On a triangulated planar 
straight-line graph, many problems can be solved more efficiently. Triangulation of a 
set of points arises in numerical interpolation of bivariate data and in the finite element 
method. 

Definitions: 

A planar straight-line graph ( PSLG ) is a planar graph such that each edge is a 
straight line. 

A triangulation of a simple polygon P is an augmentation of P with nonintersecting 
diagonal edges connecting vertices of P such that in the resulting PSLG, every bounded 
face is a triangle. 

A triangulation of a PSLG G is an augmentation of G with nonintersecting edges 
connecting vertices of G so that every point in the interior of the convex hull of G is 
contained in a face that is a triangle. In particular, the PSLG to be triangulated can 
be simply n discrete points in the plane. 





A chain is a PSLG with vertices $v_1, \ldots, v_n$ and edges $\{v_1, v_2\}, \{v_2, v_3\}, \ldots, \{v_{n-1}, v_n\}$.

A chain is monotone if there is a straight line L such that every line perpendicular 
to L intersects the chain in at most one point. 

A simple polygon is monotone if its boundary can be decomposed into two monotone 
chains. 

Two vertices v and u in a polygon are visible from each other if the open line segment uv 
is entirely in the interior of the polygon. 
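The definition of a monotone chain translates directly into a test on projections: a chain is monotone with respect to a line L exactly when the projections of its vertices onto L are strictly increasing or strictly decreasing along the chain. The sketch below assumes the chain is given as a list of (x, y) tuples and L by a direction vector; both the function name and this input convention are illustrative choices, not part of the Handbook.

```python
def is_monotone(chain, direction):
    """True if the chain is monotone with respect to a line with the given direction.

    chain: list of (x, y) vertices v_1, ..., v_n in order along the chain.
    direction: nonzero vector (dx, dy) parallel to the line L.
    """
    dx, dy = direction
    # Scalar projection of each vertex onto L (up to a positive scale factor).
    proj = [x * dx + y * dy for x, y in chain]
    steps = list(zip(proj, proj[1:]))
    increasing = all(a < b for a, b in steps)
    decreasing = all(a > b for a, b in steps)
    return increasing or decreasing


zigzag = [(0, 0), (1, 2), (2, 1), (3, 3)]
print(is_monotone(zigzag, (1, 0)))   # test against a horizontal line
print(is_monotone(zigzag, (0, 1)))   # test against a vertical line
```

Strict inequalities are used so that an edge perpendicular to L, which a perpendicular line would meet in more than one point, is rejected.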

Facts: 

1. Every simple polygon can be triangulated. 

2. Every triangulation of a simple polygon with n vertices has n − 2 triangles and n − 3
diagonals.

3. Given a simple polygon with n vertices, there is a diagonal that divides the polygon
into two polygons that have at most $\lceil 2n/3 \rceil + 1$ vertices.

4. For a history of triangulations, see [O’R87].

5. Simple polygons can be triangulated in O(n) time using an algorithm developed by
Chazelle [Ch91].

6. PSLGs can be triangulated in O(n log n) time. (See the triangulation of a general
PSLG algorithm, item 3 in the following list of algorithms.) This is optimal because
a lower bound Ω(n log n) has been derived for the time complexity of triangulation of
a PSLG.
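Facts 1 and 2 can be checked on small inputs with ear clipping, a simple method that is not one of the algorithms described in this section and is far slower than the bounds quoted above. The sketch below is illustrative only: it assumes the polygon is simple, given in counterclockwise order as a list of (x, y) tuples, and in general position; the function names are hypothetical.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); positive means counterclockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def in_triangle(p, a, b, c):
    """True if p lies in the closed counterclockwise triangle (a, b, c)."""
    return cross(a, b, p) >= 0 and cross(b, c, p) >= 0 and cross(c, a, p) >= 0


def ear_clipping(polygon):
    """Triangulate a simple counterclockwise polygon; returns index triples."""
    idx = list(range(len(polygon)))
    triangles = []
    while len(idx) > 3:
        for k in range(len(idx)):
            i, j, l = idx[k - 1], idx[k], idx[(k + 1) % len(idx)]
            a, b, c = polygon[i], polygon[j], polygon[l]
            if cross(a, b, c) <= 0:
                continue  # vertex j is reflex (or degenerate), so it is not an ear
            others = (polygon[m] for m in idx if m not in (i, j, l))
            if any(in_triangle(p, a, b, c) for p in others):
                continue  # another vertex blocks the candidate diagonal (i, l)
            triangles.append((i, j, l))
            del idx[k]    # clip the ear at vertex j
            break
        else:
            raise ValueError("no ear found; polygon may not be simple and counterclockwise")
    triangles.append(tuple(idx))
    return triangles


polygon = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]   # one reflex vertex at (2, 1)
triangles = ear_clipping(polygon)
print(len(triangles), len(polygon) - 2)
```

For the 5-vertex example the result has n − 2 = 3 triangles, and the triangle areas add up to the area of the polygon.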

Algorithms: 

1. Triangulation of a monotone polygon: A monotone polygon P can be triangulated in 
linear time based on the following greedy method. Observe that the monotone polygon P 
is triangulated if nonintersecting edges are added so that no two vertices are visible from 
each other. 

If necessary, rotate the polygon so that it is monotone with respect to the y-axis.
Sort the vertices of P by decreasing y-coordinate. (This sorting can be done in linear 
time by merging the two monotone chains of P.) Move through the sorted list, and for 
a vertex v, examine each vertex u lower than v, in the sorted order, and add an edge 
between vertices v and u as long as u is visible from v. The edge addition process for 
the vertex v stops at a lower vertex that is not visible from v. Then move to the next 
vertex and perform the edge addition process. Note that once an edge is added between 
vertices v and u, then no vertices between v and u in the sorted list are visible from a 
vertex that is lower than u. Therefore, such vertices can be ignored in the later edge 
addition process. The edge addition process for all vertices can be performed in linear 
time if a stack is used to hold the sorted list. 

2. Triangulation of a simple polygon: Given a general simple polygon P, partition P in
time O(n log n) into monotone polygons, then apply the previous linear time algorithm
to triangulate each monotone polygon. This gives a triangulation of a simple polygon
in time O(n log n).

3. Triangulation of a general PSLG: To triangulate a general PSLG G, first add edges
to G so that each face is a simple polygon (no nonconsecutive edges intersect), then
apply Chazelle’s linear time algorithm (Fact 5) to triangulate each face.

The complexity of Chazelle’s algorithm can be avoided since there is an efficient
algorithm that adds edges to a PSLG so that each face is a monotone polygon. To do
this, observe that in a PSLG G every face is a monotone polygon if and only if each
vertex (except the highest one) has a higher neighbor and each vertex (except the lowest
one) has a lower neighbor. Thus, to make each face of G a monotone polygon, check
each vertex of G and for those that do not have the desired neighbors, add proper edges to
them. This process can be accomplished in time O(n log n) using the plane sweeping
method [PrSh85]. Now, the simpler linear time algorithm for triangulating a monotone
polygon (see item 1) is applied to triangulate each face.

Examples: 

1. The following figure illustrates a simple polygon and two of its triangulations. 



2. In part (a) of the following figure the chain is monotone (with respect to any hori- 
zontal line); the chain in part (b) is not monotone. 




13.5.3 VORONOI DIAGRAMS AND DELAUNAY TRIANGULATIONS 

Definitions: 

Given a set $S = \{p_1, \ldots, p_n\}$ in $\mathcal{R}^d$, the Voronoi diagram Vor(S) of S is a partition
of $\mathcal{R}^d$ into n convex polytopes (Voronoi cells or Dirichlet cells) $V(p_1), \ldots, V(p_n)$
such that the region $V(p_i)$ is the locus of points that are closer to $p_i$ than to any other
point in S.

Given the Voronoi diagram Vor(S) of a set $S = \{p_1, \ldots, p_n\}$ of points in the plane, the
straight line dual D(S) of Vor(S) is a PSLG whose vertices are the points in S and
two vertices $p_i$ and $p_j$ in D(S) are connected if and only if the regions $V(p_i)$ and $V(p_j)$
share a common edge.

The PSLG D(S) is a triangulation of the set S, called the Delaunay triangulation
of S.

Algorithm 2: Construction of Voronoi diagrams.

input: a set S of points in the plane
output: the Voronoi diagram of S

if |S| < 4 then construct Vor(S) directly and stop
else
    partition S into two equal size subsets S_L (left subset) and S_R (right subset)
    separated by a vertical line
    construct Vor(S_L) and Vor(S_R) recursively
    merge Vor(S_L) and Vor(S_R) into Vor(S)


Facts: 

1. The Voronoi diagram of a set of n points in the plane can be constructed in time
O(n log n).

2. The Delaunay triangulation of a set S of points in the plane has the property that
the circle with the three vertices of a triangle of the triangulation on its boundary
contains no other point of the set S. This property makes the Delaunay triangulation
useful in interpolation applications.

3. The convex hull problem in the plane can be reduced in linear time to the Voronoi
diagram problem in the plane: a point p in a set S is a hull vertex if and only if V(p) is
unbounded, and two hull vertices $p_i$ and $p_j$ are adjacent if and only if the two unbounded
regions $V(p_i)$ and $V(p_j)$ share a common edge. Thus, the O(n log n) time algorithm for
constructing Voronoi diagrams (Algorithm 2) is optimal.

4. The Voronoi diagram problem for n points in $\mathcal{R}^d$ can be reduced in linear time to
the convex hull problem for n points in $\mathcal{R}^{d+1}$ [Ed87]. Thus, the Voronoi diagram of a
set of n points in $\mathcal{R}^d$ can be constructed in time $O(n^{\lfloor (d+1)/2 \rfloor})$ based on the optimal
algorithm for constructing the convex hull of a set of n points in $\mathcal{R}^{d+1}$.
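The empty-circle property of Fact 2 is usually tested with the standard in-circle determinant. The sketch below is an illustration under stated assumptions: points are (x, y) tuples with exact (integer) coordinates, the triangle (a, b, c) is given in counterclockwise order, and the function names are hypothetical rather than taken from the Handbook.

```python
def in_circle(a, b, c, d):
    """Sign test for d against the circle through a, b, c (counterclockwise).

    Returns a positive value if d is strictly inside the circle, zero if d is
    on the circle, and a negative value if d is outside.
    """
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    # 3 x 3 determinant with rows (x, y, x^2 + y^2), expanded along the last column.
    return ((ax * ax + ay * ay) * (bx * cy - cx * by)
            - (bx * bx + by * by) * (ax * cy - cx * ay)
            + (cx * cx + cy * cy) * (ax * by - bx * ay))


def satisfies_empty_circle(triangle, points):
    """True if no point of the set lies strictly inside the triangle's circumcircle."""
    a, b, c = triangle
    return all(in_circle(a, b, c, p) <= 0 for p in points if p not in triangle)


tri = ((0, 0), (2, 0), (0, 2))            # circumcircle has center (1, 1)
print(in_circle(*tri, (1, 1)))            # positive: inside
print(in_circle(*tri, (3, 3)))            # negative: outside
print(in_circle(*tri, (2, 2)))            # zero: on the circle
```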

Algorithms: 

1. Construction of Voronoi diagrams in the plane: The Voronoi diagram of a set of
points in $\mathcal{R}^2$ can be constructed using the divide-and-conquer method of Algorithm 2.

To efficiently partition a set S into a left subset and a right subset of equal size in
each recursive construction, pre-sort the set S by x-coordinate. To merge the Voronoi
diagrams $Vor(S_L)$ and $Vor(S_R)$ into the Voronoi diagram Vor(S), add the part of Vor(S)
that is missing in $Vor(S_L)$ and $Vor(S_R)$ and delete the part of $Vor(S_L)$ and $Vor(S_R)$
that does not appear in Vor(S).

2. Voronoi diagrams and geometric optimization problems: An optimal O(n log n) time
algorithm can be derived via the Voronoi diagram for the problem of finding, for each
point in a set S of n points in the plane, the nearest point in S. This is so because each
point p in S and its nearest neighbor correspond to two regions in Vor(S) that share a
common edge. This also implies an optimal O(n log n) time algorithm for the problem
of finding the closest pair in a set of n points in the plane.

The Voronoi diagram can be used to design an optimal O(n log n) time algorithm
for constructing a Euclidean minimum spanning tree for a set S of n points in the
plane because edges of any Euclidean minimum spanning tree must be contained in
the Delaunay triangulation D(S) of S. This algorithm implies an O(n log n) time
approximation algorithm for the Euclidean traveling salesman problem which produces a
traveling salesman tour of length at worst twice the optimum.

Example:

1 . The left half of the following figure illustrates a set of 6 points in the plane and the 
Voronoi diagram for the set. The right half shows the Delaunay triangulation of the set. 



13.5.4 ARRANGEMENTS 

Definition: 

Given n lines in the plane, the arrangement of the lines in the plane is the PSLG whose 
vertices are the intersection points of the lines and whose edges connect consecutive 
intersection points on each line (it is assumed that all lines intersect at a common point 
at infinity). 

Facts: 

1. The arrangement of n lines in the plane partitions the plane into a collection of $O(n^2)$
faces, edges, and vertices.

2. The arrangement of n lines can be constructed in $O(n^2)$ time (Algorithm 3), which
is optimal.

3. An arrangement can be represented by a doubly-connected-edge-list in which the
edges incident with a vertex can be traversed in clockwise order in constant time per
edge. [PrSh85]

4. Applications of arrangements include finding the smallest-area triangle among n
points, constructing Voronoi diagrams, and half-plane range query.

5. The arrangement of n hyperplanes in $\mathcal{R}^d$ can be defined similarly; it partitions
$\mathcal{R}^d$ into $O(n^d)$ faces of dimension at most d.

6. Algorithm 3 can be generalized to construct the arrangement of n hyperplanes in $\mathcal{R}^d$
in $O(n^d)$ time, which is optimal. [Ed87]

Algorithm: 

1. Constructing the arrangement of a set of lines: Algorithm 3 constructs the arrangement
A of a set H of n lines $L_1, \ldots, L_n$ in the plane by the incremental method.

To traverse the faces of A that intersect the line $L_i$, start from a face F that has the
point $p_i$ on its boundary, and traverse the boundary of F until an edge e is encountered
such that e intersects $L_i$ at a point q. A new vertex q is introduced in A and the
adjacencies of the two ends of the edge e are updated. Then reverse the traversing
direction on the edge e and start traversing the face that shares the edge e with F, and
so on. The total number of edges traversed in this process in order to insert the line $L_i$
is bounded by O(i). [Mu94]

Algorithm 3: Incremental method for constructing the arrangement of a
set H of n lines.

input: a set H of n lines L_1, ..., L_n in the plane
output: the arrangement A of the set H

A := L_1

for i := 2 to n

    find the intersection point p_i of L_i and L_1

    starting from p_i, traverse the faces of A that intersect L_i and update the vertex
    set and edge set of A


Example: 

1. The following figure shows an arrangement of four lines in the plane. The graph 
has 7 vertices (including the vertex at infinity), 16 edges (of which 8 are unbounded), 
and 11 regions (of which 8 are unbounded). 
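The counts in this example follow closed formulas when the lines are in general position. The sketch below assumes that no two lines are parallel and no three pass through a common point, and it counts the single vertex at infinity as in the definition above; the function name is illustrative.

```python
from math import comb


def arrangement_counts(n):
    """(vertices, edges, faces) of an arrangement of n >= 1 lines in general position.

    Assumes no two lines are parallel and no three lines are concurrent, and
    includes the common vertex at infinity.
    """
    vertices = comb(n, 2) + 1        # pairwise intersections plus the point at infinity
    edges = n * n                    # each line is cut into n edges by its n vertices
    faces = comb(n, 2) + n + 1
    return vertices, edges, faces


v, e, f = arrangement_counts(4)
print(v, e, f)          # counts for four lines
print(v - e + f)        # Euler characteristic of the resulting planar subdivision
```

With the vertex at infinity included, the counts satisfy v − e + f = 2 for every n, which gives a quick consistency check.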



13.5.5 VISIBILITY 


Visibility problems are concerned with determining what can be viewed from a given
point (or points) in the plane or three-dimensional space. [O’R87], [O’R93], [Sh92]

Definitions: 

The visibility problem is the problem of finding what is visible, given a configuration 
of objects and a viewpoint. 

Given n nonintersecting line segments in the plane, the visibility graph is the graph 
whose vertices are the endpoints of the line segments, with two vertices adjacent if and 
only if they are visible from each other; i.e., there is an edge joining a and b if and only 
if the open line segment ab does not intersect any other line segments. 

A star polygon is a polygon with an interior point p such that each point on the 
polygon is visible from p. 

Facts: 

1. Visibility problems have important applications in computer graphics and robotics
and have served as motivation for research in computational geometry.

2. Constructing the visibility graph for a set of n nonintersecting line segments is a
critical component of the shortest path problem in the plane. The visibility graph
problem can be solved in optimal time $O(n^2)$ [Ya90].

3. Art gallery theorems: Given a simple polygon with n vertices,

• there is a set S of $\lfloor n/3 \rfloor$ vertices of the polygon such that each point on or inside
the polygon is visible from a point in S;

• there is a set S of $\lceil n/3 \rceil$ points on the polygon such that each point on or outside
the polygon is visible from a point in S;

• there is a set S of $\lceil n/2 \rceil$ vertices of the polygon such that each point on, inside, or
outside the polygon is visible from a point in S.

In each case the number given is the best possible.
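The definition of the visibility graph can be turned into a brute-force construction that tests every pair of endpoints against every segment. This is only a sketch of the definition, not the optimal $O(n^2)$ method cited in Fact 2: it takes roughly cubic time, assumes exact (integer) coordinates, pairwise disjoint segments, and no three endpoints on a common line, and its function names are hypothetical.

```python
from itertools import combinations


def orient(a, b, c):
    """Positive if a, b, c make a left turn, negative for a right turn, zero if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def properly_cross(p, q, r, s):
    """True if segments pq and rs cross at a single point interior to both."""
    return (orient(p, q, r) * orient(p, q, s) < 0
            and orient(r, s, p) * orient(r, s, q) < 0)


def visibility_graph(segments):
    """Edges (a, b) between segment endpoints that are visible from each other."""
    endpoints = [p for seg in segments for p in seg]
    edges = []
    for a, b in combinations(endpoints, 2):
        # a and b see each other unless the open segment ab crosses an obstacle.
        if not any(properly_cross(a, b, r, s) for r, s in segments):
            edges.append((a, b))
    return edges


segments = [((0, 0), (0, 2)), ((2, -1), (2, 3)), ((4, 0), (4, 2))]
edges = visibility_graph(segments)
print(len(edges))
```

In this example the middle segment hides the left segment from the right one, so four of the fifteen endpoint pairs are not joined.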

Algorithms: 

1. Given n line segments in the plane, compute the sequence of subsegments that are
visible from the point y = −∞, that is, by using parallel rays. The problem can be solved
by a modified version of the plane sweeping algorithm for computing all intersection
points of the line segments. The algorithm has worst case time complexity $O(n^2 \log n)$.

2. An alternative algorithm is based on the divide-and-conquer approach: arbitrarily
partition the set of the n line segments into two equal size halves, solve both subproblems,
and merge the results.

Note that the merging step amounts to computing the minimum of two piecewise
(not necessarily continuous) linear functions, which can be easily done in time linear
in the number of pieces if it is recursively assumed that the pieces are sorted by
x-coordinate. For a set of n arbitrary line segments in the plane, in the worst case, the
number of subsegments visible from the point y = −∞ is bounded by $O(n\alpha(n))$ [Ya90],
where $\alpha(n)$, the inverse of Ackermann’s function (§1.3.2), is a monotonically increasing
function that grows so slowly that for practical purposes it can be treated as a constant.
Therefore, the merging step runs in time $O(n\alpha(n))$. Consequently, the time complexity
of the algorithm based on the divide-and-conquer method is $O(n\alpha(n) \log n)$.

3. Three-dimensional visibility: Given a set of disjoint opaque polyhedra in $\mathcal{R}^3$, find
the part of the set that is visible from the viewpoint z = −∞ (that is, with parallel rays).
The problem can be solved in time $O(n^2 \log n)$ by a modified plane sweeping algorithm
for computing the intersection points of the line segments that are the projections of the
edges of the polyhedra on the xy-plane. Optimal algorithms of time complexity $O(n^2)$
have been developed based on line arrangements. [Do94], [Ya90]

Examples: 

1. The following figure shows three line segments and the visibility graph with six
vertices determined by the line segments. The edges of the visibility graph are shown
as dotted lines.

2. Surveillance problems: A variety of problems require that “guards” be posted at
points of a polygon so that corners and/or edges are visible. (See the art gallery theorems
of Fact 3.)

3. Hidden surface removal: An important problem in computer graphics is the problem
of finding (and removing) the portions of shapes in three-dimensional space that are
hidden from view, when the object is viewed from a given point. (See the
three-dimensional visibility algorithm.)


13.6 GEOMETRIC DATA STRUCTURES AND SEARCHING 

This section describes the use of data structures for searching, or querying, among a 
set S of geometric objects. For each of the following problems, there are algorithms 
that perform a single search in time proportional to n, the total complexity of all the 
geometric objects in S. These single-search algorithms use minimal data structures and 
minimal preprocessing time. When the application searches multiple times among the 
elements of the same set S, it becomes more efficient to preprocess the objects in S into 
a data structure that would allow a faster searching procedure. 

This section presents four fundamental searching problems in computational geometry
for which clever data structures reduce the search time to $O(\log^k n)$, where k
is a small constant (often equal to 1 for problems in the plane and three-dimensional
space). This section covers only the static versions of these problems; that is, S never
changes. The dynamic versions allow deletions from S and/or insertions into S in be- 
tween queries. In the dynamic versions of the problem, in addition to polylogarithmic 
query time, the goal is to keep the update time polylogarithmic. The dynamic versions 
are, as a rule, much more difficult. 


13.6.1 POINT LOCATION 
Definition: 

Let p be a point and S a subdivision of $\mathcal{R}^d$. (S can be a single geometric object, such
as a polytope, or can be a general subdivision of $\mathcal{R}^d$.) The point location problem
is the problem of determining which region of S contains p.

Examples: 

1. Locate a point in the subdivision of the space induced by an arrangement of a set 
of hyperplanes. (See §13.5.4.) 

2. Locate a point in a subdivision all of whose regions are convex (a convex subdivision).
For example, an arrangement of hyperplanes is a convex subdivision.

3. Search for nearest neighbors in Voronoi diagrams. (See §13.5.3.)

4. Range searching. (See §13.6.2.)

5. Ray shooting. (See §13.6.3.)

Algorithms:

1. Point location in a straight-line subdivision using a triangulation hierarchy: Given an
n-vertex triangulation R' = (V', T'), where V' is the set of vertices and T' is the set
of triangles, an (n+3)-vertex enclosed triangulation R = (V, T) is the triangulation R'
together with the triangulation of the region between U and the convex hull of R',
where $U = (V_U, T_U)$ is a triangle that contains R' in its interior. The triangulation
hierarchy of [Ki83] consists of a sequence of triangulations $\mathcal{R} = (R_1, R_2, \ldots, R_{c \log_2 n})$,
where $R_1 = R$, $R_{c \log_2 n} = U$, and $R_i$ is created from $R_{i-1}$ as follows (illustrated in the
following figure):

• remove from $V_{i-1} - V_U$ a set X of independent (that is, nonadjacent) vertices
and remove from $T_{i-1}$ the set Z of all triangles incident with any vertex in X:
$V_i = V_{i-1} - X$, $T_i = T_{i-1} - Z$;

• retriangulate any polygons in $R_i = (V_i, T_i)$.

Part (a) of the following figure shows triangulation $R_1$ (the vertices that are removed
from $R_1$ are circled). Part (b) shows triangulation $R_2$ (dotted lines are edges in the
retriangulation). Part (c) gives a list of the pointers from triangles in $R_2$ to triangles
in $R_1$.



Algorithm 1 produces a hierarchy for a planar subdivision. With minor modifications
(for example, “triangles” become tetrahedra), it can be used for subdivisions in $\mathcal{R}^3$.
It can be proven that $|T_{c \log_2 n}| = 1$ for some constant c, and that |r(t)| is a constant for
every t.

This algorithm runs in O(n) time and produces a triangulation hierarchy that
takes O(n) space. Algorithm 2 takes $O(\log_2 n)$ time.

Algorithm 1: Computing the triangulation hierarchy.

input: planar straight-line subdivision S
output: triangulation hierarchy of S

compute triangulation R' of S; compute enclosed triangulation R of R'; choose a
small constant k; R_1 := R; i := 1
while |T_i| > 1
    i := i + 1; R_i := R_{i-1}; mark all vertices in V_i having degree < k
    while there exists some marked vertex v
        P := (V_P, E_P) (polygon consisting of vertices adjacent to v; that is,
            V_P := {v_j | (v, v_j, v_k) ∈ T_i}; E_P := {(v_j, v_k) | (v, v_j, v_k) ∈ T_i})
        remove v from V_i; remove all the triangles incident with v from T_i; that is,
            T_rem := {(v, v_j, v_k) | v_j ∈ V_P}; T_i := T_i − T_rem
        compute the triangulation R_P of P
        for each triangle t in R_P
            r(t) := the set of triangles in T_rem that overlap with t
            create a pointer from t to every triangle in r(t) {See part (c) of figure.}
        unmark v and any marked vertices in V_P.


Algorithm 2: Performing point location.

input: a point q and a triangulation hierarchy R
output: triangle that contains q

check if R_{c log_2 n} contains q
i := c log_2 n − 1; t := U
while i ≥ 1
    determine the triangle t' in r(t) that contains q using pointers from t to r(t)
    t := t'; i := i − 1.

2. The following table shows the complexity of various point location algorithms. The
number m denotes the number of regions (or cells) in the subdivision S; n denotes the
total combinatorial complexity of S.

dimension  subdivision type                  query time   space               preprocessing time
---------  --------------------------------  -----------  ------------------  ------------------------
2          convex subdivision                O(log_2 n)   O(n)                O(n)
3          simple polytope                   O(log_2 n)   O(n)                O(n)
3          convex subdivision                O(log_2 n)   O(n log_2 n)        O(n log_2 n)
d          arrangement of n hyperplanes      O(log_2 n)   O(n^d)              O(n^d)
d          subdivision of m (d−1)-simplices  O(log_2 m)   O(m^(d−1+ε) + n)    O(m^(d−1+ε) + n log_2 m)
           with a total of n faces, ε > 0


13.6.2 RANGE SEARCHING


Definitions:

The range counting problem is the problem of counting the number of points in a
given set $S \subset \mathcal{R}^d$ that lie in a given query range q.

The range reporting problem is the problem of determining all points in a given
set $S \subset \mathcal{R}^d$ that lie in a given query range q.

The range emptiness problem is the problem of determining if a given query range q
contains any points from a given set $S \subset \mathcal{R}^d$.

Facts: 

1. The following table gives information on various range searching algorithms. The
integer n is the number of points in S; ε is an arbitrarily small positive constant. When
the query is reporting, the query time has an additive term k, which is the size of
the output.

dim  range type        query time                     space               preprocessing time
---  ----------------  -----------------------------  ------------------  --------------------------
2    orthogonal        O(log_2 n)                     O(n log^(2+ε) n)    O(n log_2 n)
2    convex polygon    O(√n log_2 n)                  O(n)                O(n^(1+ε))
3    convex polytope   O(n^(2/3) log_2 n)             O(n log_2 n)        O(n^(1+ε))
d    convex polytope   O(log_2^(d+1) n)               O(n^d)              O(n^d (log_2 n)^ε)
                       O(n^(1−1/d))                   O(n)                O(n^(1+ε))
     for n ≤ m ≤ n^d   O((n/m^(1/d)) log_2^(d+1) n)   O(m^(1+ε))          O(m^(1+ε))
d    half-space        O(log_2 n)                     O(n^d / log_2^d n)  O(n^d / log_2^(d−ε) n)
     for n ≤ m ≤ n^d   O(n/m^(1/d))                   O(m)                O(n^(1+ε) + m (log_2 n)^ε)

2. The following table gives information on various range reporting algorithms.

dim  range type            query time                       space                                preprocessing time
---  --------------------  -------------------------------  -----------------------------------  -----------------------------
2    half-plane,           O(log_2 n + k)                   O(n)                                 O(n log_2 n)
     fixed-radius circle
2    orthogonal            O(log_2 n + k)                   O(n log_2 n)                         O(n log_2 n)
3    half-space            O(log_2 n + k)                   O(n log_2 n)                         O(n log_2^3 n log_2 log_2 n)
d    half-space            O(log_2 n + k)                   O(n^(⌊d/2⌋+ε))                       O(n^(⌊d/2⌋+ε))
                           O(n^(1−1/⌊d/2⌋+ε) + k)           O(n)                                 O(n)
     n ≤ m ≤ n^⌊d/2⌋       O((n/m^(1/⌊d/2⌋)) log_2 n + k)   O(m^(1+ε))                           O(m^(1+ε))
d    orthogonal            O(log_2^(d−1) n + k)             O(n log_2^(d−1) n / log_2 log_2 n)   O(n log_2^(d−1) n)
                           O(d n^(1−1/d) + k)               O(dn)                                O(dn log_2 n)

Examples:

1. Orthogonal range search: The query range q is a Cartesian product of intervals on
different coordinate axes.

2. Bounded distance search: The query range q is a sphere in $\mathcal{R}^d$.

3. Other typical search domains are half-spaces and simplices.

4. Machine learning: Points are labelled as positive or negative examples of a concept,
and a range query determines the relative number of positive and negative examples in
the range (thus enabling the range to be classified as either a positive or a negative example
of the concept).

5. Multikey searching in databases: Records identified by a d-tuple of keys can be
viewed as points in $\mathcal{R}^d$, and a range query on records corresponds to an orthogonal
range query.


Algorithm: 

1. Orthogonal range searching in R^2 using range trees: The range tree is defined
recursively by Algorithm 3. Each node stores a subset of the points organized into a threaded
binary search tree by the y-coordinates of the points. The left child contains half the
parent’s points, in particular those with lesser x-coordinates; the right child contains
the other half of the parent’s points, those with greater x-coordinates. See the following figure.



Each node also stores the range of x-coordinates of its points. For simplicity, all
coordinates of all points are assumed to be distinct. It is also assumed that all points of
S = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)} have been presorted by their x-coordinate so that
x_1 < x_2 < ... < x_n.

Orthogonal range reporting proceeds as follows down the range tree. If the x range
of the current node is a subset of the x range of the query, then all the points in
the node’s binary search tree with y-coordinate in the y range of the query are output.
Algorithm 3: Computing the range tree.

procedure RangeTree(S = {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}: set of points,
                    T: pointer to root of a range tree)
  if S = ∅ then return
  else
    store the interval [x_1, x_n] in T.int
    store BinarySearchTree(S) in T.y
    if n > 1 then {a single point forms a leaf}
      RangeTree({(x_1, y_1), ..., (x_⌊n/2⌋, y_⌊n/2⌋)}, T.left_child)
      RangeTree({(x_(⌊n/2⌋+1), y_(⌊n/2⌋+1)), ..., (x_n, y_n)}, T.right_child)

procedure BinarySearchTree(S = {(x_1, y_1), ..., (x_n, y_n)}: set of points)
  sort the points of S by y-coordinate so that y_1 < y_2 < ... < y_n
  create a threaded balanced binary search tree B for S:
    store point (x_i, y_i) in the ith leftmost leaf l_i
    l_i.next := l_(i+1) {connect the leaves into a linked list}
    l_i.key := y_i
  for each node v, v.key := min{ l_i.key | l_i ∈ subtree(v.right_child) }


Algorithm 4: Orthogonal range reporting using range trees.

procedure OrthoRangeSearching(q = [x_1, x_2] × [y_1, y_2]: rectangle in the plane,
                              T: pointer to root of range tree)
  if T = NIL then return
  else if T.int ⊆ [x_1, x_2] then SearchAll(T.y, [y_1, y_2])
  else
    if [x_1, x_2] ∩ T.left_child.int ≠ ∅ then
      OrthoRangeSearching(q, T.left_child)
    if [x_1, x_2] ∩ T.right_child.int ≠ ∅ then
      OrthoRangeSearching(q, T.right_child)

procedure SearchAll(v: pointer to root of binary tree, [y_1, y_2]: query interval)
  while v is not a leaf
    if y_1 < v.key then v := v.left_child
    else v := v.right_child
  if v.key < y_1 then v := v.next
  while v ≠ NIL and v.key ≤ y_2
    output point stored at v
    v := v.next


Otherwise, if the x range of the query overlaps the x range of the left child, then the algorithm
proceeds recursively to the left child. If the x range of the query overlaps the x range
of the right child, then the algorithm proceeds recursively to the right child.

The running time of Algorithm 3 is O(n log_2^2 n) and the space taken by the range
tree is O(n log_2 n). The running time can be improved by a log_2 n factor. Essentially
the same procedure can be used to build range trees in any dimension. Algorithm 4
takes O(log_2^2 n + k) time, where k is the number of reported points. This running time
can be improved to O(log_2 n + k).
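
The construction and query described above can be sketched in Python. This is a minimal illustration, not the handbook's implementation: the names RangeTreeNode, build_range_tree, and range_report are hypothetical, and a y-sorted array searched with bisect stands in for the threaded binary search tree of Algorithm 3. Sorting at every node gives the O(n log_2^2 n) construction time, not the improved bound.

```python
from bisect import bisect_left, bisect_right


class RangeTreeNode:
    """One node of a 2-D range tree (hypothetical name, for illustration)."""

    def __init__(self, pts_by_x):
        # pts_by_x: non-empty list of (x, y) tuples, sorted by x.
        self.x_lo = pts_by_x[0][0]
        self.x_hi = pts_by_x[-1][0]
        # A y-sorted array replaces the threaded binary search tree.
        self.by_y = sorted(pts_by_x, key=lambda p: p[1])
        self.ys = [p[1] for p in self.by_y]
        self.left = self.right = None
        if len(pts_by_x) > 1:
            mid = len(pts_by_x) // 2
            self.left = RangeTreeNode(pts_by_x[:mid])
            self.right = RangeTreeNode(pts_by_x[mid:])


def build_range_tree(points):
    """Presort by x and build the tree; returns None for an empty input."""
    pts = sorted(points)
    return RangeTreeNode(pts) if pts else None


def range_report(node, x1, x2, y1, y2, out):
    """Append to out every point with x in [x1, x2] and y in [y1, y2]."""
    if node is None or node.x_hi < x1 or node.x_lo > x2:
        return  # empty subtree, or no overlap in x
    if x1 <= node.x_lo and node.x_hi <= x2:
        # Node's x range lies inside the query: report by y only.
        i = bisect_left(node.ys, y1)
        j = bisect_right(node.ys, y2)
        out.extend(node.by_y[i:j])
        return
    range_report(node.left, x1, x2, y1, y2, out)
    range_report(node.right, x1, x2, y1, y2, out)
```

For example, after `tree = build_range_tree(points)` and `found = []`, the call `range_report(tree, 2, 7, 2, 7, found)` collects the points inside the rectangle [2, 7] × [2, 7].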
13.6.3 RAY SHOOTING AND LINES IN SPACE


Definitions: 

A ray r is a half-line that is directed away from its endpoint; that is, it satisfies the
equation r = p + λv, λ ≥ 0, where p is the starting point of r and v is the direction
of r.

Given a set S of geometric objects in R^d and a query ray r, the ray shooting problem
is the problem of determining the first object in S that is hit by r, that is, the object
s ∈ S whose intersection with r is closer to p than the intersection between r and any
other object in S.

A polyhedron is axis-parallel if each of its edges is parallel to a coordinate axis. 


Facts: 

1. The following table gives information on various ray shooting algorithms. 


dim       subdivision type         query time                    space           preprocessing time
2         simple polygon           O(log_2 n)                    O(n)            O(n)
2         line segments            O(log_2 n)                    O(n^2 α^2(n))   O(n^2 α^2(n))
                                   O(√n log_2 n)                 O(n log_2 n)    O(n log_2 n)
3, fix p  axis-parallel polyhedra  O(log_2 n)                    O(n log_2 n)    O(n log_2 n)
3, fix p  polyhedra                O(log_2 n)                    O(n^2 α(n))     O(n^2 α(n))
          for any ε > 0,           O(n^(1+ε) / √m)               O(m^(1+ε))      O(m^(1+ε))
          n ≤ m ≤ n^2
3, fix v  axis-parallel polyhedra  O(log_2 n (log_2 log_2 n)^2)  O(n log_2 n)    O(n log_2 n)
          for any ε > 0            O(log_2 n)                    O(n^(1+ε))      O(n^(1+ε))
3, fix v  polyhedra                O(log_2 n)                    O(n^(3+ε))      O(n^(3+ε))
          for any ε > 0,           O(n^(1+ε) / m^(1/3))          O(m^(1+ε))      O(m^(1+ε))
          n ≤ m ≤ n^3
3         axis-parallel polyhedra  O(log_2 n)                    O(n^(2+ε))      O(n^(2+ε))
3         polyhedra                O(log_2 n)                    O(n^(4+ε))      O(n^(4+ε))


2. Applications of ray shooting include hidden surface removal, visibility questions and
ray tracing in computer graphics, and computing shortest paths in the presence of obstacles
in robotics.


Examples: 

1. S can be a single object, such as a simple polygon in the plane. 

2. S can be a collection of objects, such as a set of polyhedra in three-dimensional 
space. 
Algorithm:


1. Ray shooting from a fixed point among planar nonintersecting segments: For
simplicity, assume that the fixed point p in the plane is at the origin. Define two relations
using the same notation ≺: for two points q_j and q_k, q_j ≺ q_k if q_j makes a smaller polar
angle with respect to the origin than does q_k; for two nonintersecting segments s_j and
s_k, s_j ≺ s_k if for every ray r that starts at the origin and crosses both s_j and s_k, r
crosses s_j before crossing s_k. Segment (q_j, q_k) starts at q_j and ends at q_k if q_j ≺ q_k. A
null segment is denoted s_∞; that is, a query ray hitting s_∞ does not intersect any of
the given segments.

Algorithm 5, VisibilityMap, creates an array I of nonoverlapping angle intervals,
sorted by their polar angle, with the property that consecutive entries in I have different
“smallest” segments according to the ≺ relation.

This algorithm uses a technique called sweep-plane: the algorithm sweeps the polar
coordinates originating at p with a ray, stopping the sweep-ray at all the angles where the
sweep-ray intersects a segment endpoint. The set S' stores all the segments intersected
by the current sweep-ray; S' is organized as a binary search tree ordered by the ≺
relation on the segments. When the sweep-ray encounters a segment endpoint that
starts a segment, the segment is added to S'; when the sweep-ray encounters a segment
endpoint that ends a segment, the segment is removed from S'. At every stop of the
sweep-ray, if the smallest (under the ≺ relation) segment of S' differs from the smallest
segment at the sweep-ray’s previous stop, a new interval is added to I. See the following figure.

The thick lines in this figure are the segments in S and the thin lines are the
boundaries between intervals in the visibility map for S. The intervals are labeled by
their names.



Algorithm 5, VisibilityMap, takes O(n log_2 n) time and can be used for ray shooting
among simple polygons. The visibility map consists of array I and takes O(n) space.
The problem is harder if the segments are allowed to intersect. Algorithm 6, RayShoot,
takes O(log_2 n) time.
Algorithm 5: Computing visibility map.

procedure VisibilityMap(p: fixed origin point, S: set of n segments)
  sort endpoints of all segments by their polar angles so that q_1 ≺ q_2 ≺ ... ≺ q_(2n)
  no_of_intervals := 0; S' := (s_∞)
  {S' is a binary search tree containing segments ordered by the ≺ relation}
  for i := 1 to 2n
    first := the “smallest” (under the ≺ relation) segment in S'
    if q_i starts a segment s_j then insert s_j into S'
    if q_i ends a segment s_j then remove s_j from S'
    if first ≠ smallest segment in S' then
      no_of_intervals := no_of_intervals + 1
      I[no_of_intervals].angle := polar angle of q_i
      I[no_of_intervals].name := the smallest segment in S'


Algorithm 6: Ray shooting using the visibility map.

procedure RayShoot(r = (p, v): query ray, I: visibility map)
  consider v as a polar angle
  do a binary search among I[*].angle to find the segment I[k].name such that
    I[k].angle ≤ v and I[k + 1].angle > v
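
The query step of Algorithm 6 can be sketched in Python as follows. This is only an illustration under stated assumptions: the visibility map is given as two parallel lists (hypothetical names angles and names) instead of an array of records, polar angles are measured in [0, 2π), and the first interval is assumed to start at angle 0 so that every direction falls into some interval.

```python
import math
from bisect import bisect_right


def ray_shoot(angles, names, direction):
    """Return the name of the first segment hit by a ray from the origin.

    angles    -- start angles of the intervals, ascending, angles[0] == 0.0
    names     -- names[k] is the segment visible for angles in
                 [angles[k], angles[k + 1]); the last interval ends at 2*pi
    direction -- (dx, dy) direction vector of the query ray
    """
    # Treat the direction as a polar angle in [0, 2*pi).
    theta = math.atan2(direction[1], direction[0]) % (2.0 * math.pi)
    # Binary search for the last interval whose start angle is <= theta.
    k = bisect_right(angles, theta) - 1
    return names[k]
```

With n segments the map has O(n) intervals, so the binary search accounts for the O(log_2 n) query time.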


13.7 COMPUTATIONAL TECHNIQUES


This section describes some techniques used in the design of geometric algorithms. 


13.7.1 PARALLEL ALGORITHMS 

The goal of parallel computing is to solve problems faster than would be possible on 
a sequential machine — through the use of parallel algorithms. The complexity of a 
parallel algorithm is given in terms of its time and the number of processors used [Ja92]. 
The work of a parallel algorithm is bounded above by the processor-time product, the 
product of the number of processors and the time. 

Definitions: 

A parallel algorithm is an algorithm that concurrently uses more than one processing 
element during its execution. 

A parallel machine is a computer that can execute multiple operations concurrently. 

A parallel random access machine (PRAM) is a synchronous machine in which
each processor is a sequential RAM and processors communicate using a shared
memory. Depending upon whether concurrent accesses are allowed to the shared memory
cells, a PRAM is either exclusive read (ER) or concurrent read (CR), and either
exclusive write (EW) or concurrent write (CW).
Algorithm 1: ConvexHull(S: a set presorted by x-coordinate).

if |S| ≤ 3 then construct CH(S) directly
else
  partition S into two subsets of equal size, S_L (the left subset) and S_R (the
    right subset), which are separated by a vertical line
  in parallel, recursively construct CH(S_L) and CH(S_R)
  construct CH(S) from CH(S_L), CH(S_R), and common tangents between them


Facts: 

1. Parallel divide-and-conquer: Parallel algorithms can be obtained using the divide- 
and-conquer paradigm. The subproblems resulting from the divide phase are solved 
in parallel, and the combining phase of the algorithm is parallelized. In traditional 
divide-and-conquer algorithms, the problem is partitioned into two subproblems. 

2. Many-way divide-and-conquer: Sometimes faster parallel algorithms can be ob- 
tained by partitioning the original problem into multiple, smaller subproblems, often 
referred to as many-way divide-and-conquer [Ja92]. The solution to the original problem 
is obtained from the solutions to the subproblems. 

3. Cascading divide-and-conquer: Some divide-and-conquer algorithms can be sped
up by pipelining (cascading) the work performed in the recursive applications as follows.
Consider a binary tree representing the solutions to the recursive computations of the
original divide-and-conquer algorithm, where leaves represent terminal subproblems. To
obtain a faster algorithm, information is passed from a child to its parent before the
solution to the child’s subproblem is completely known. The parent then does some
precomputation so that the solution to its problem can be computed as soon as the
solutions of both of its children’s problems are available. Typically, the information
passed from a child to its parent is a constant-sized sample of the child’s current state.
Often, such algorithms run in time proportional to the height of the recursion tree.

4. Cole first used pipelining to design a work-optimal O(log n) time parallel version of
merge sort. Atallah, Cole, and Goodrich used this strategy to solve several geometric
problems including the three-dimensional maxima problem and computing the visibility
of a polygon from a point [AtGo93].

Algorithms: 

1. Using parallel divide-and-conquer to compute the convex hull CH(S) of a set S of n
points in the plane: In Algorithm 1 the points in S are presorted by x-coordinate so
that the division in the second step can be accomplished in constant time. The sorting
takes O(log n) time using O(n log n) work [Ja92]. The tangents needed in the third step
can be computed in constant time using |S_L| + |S_R| processors on a CREW PRAM
[AtGo93]. Thus, as there are O(log n) levels of recursion, Algorithm 1 runs in O(log n) time
using O(n log n) work, which is both worst-case work-optimal and time-optimal for the
CREW PRAM.

2. Computing the convex hull of a three-dimensional point set: The convex hull of
a three-dimensional point set can be computed using a two-way divide-and-conquer
algorithm similar to Algorithm 1. The running time of this algorithm is O(log^2 n)
because the combining in the last step of Algorithm 1 takes O(log n) time since it is
more complex than in the planar case. However, a faster algorithm can be designed
using many-way divide-and-conquer. The point set is partitioned into O(n^(1/2)) groups,
each of size O(n^(1/2)), and then, even though the combining still takes O(log n) time, a
total time of O(log n) can be obtained (the recurrence T(n) = T(n^(1/2)) + O(log n)
solves to O(log n)).
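
The recursive structure of Algorithm 1 can be illustrated with a sequential Python sketch. It is not the CREW PRAM algorithm: the two recursive calls run one after the other, and the combine step rescans the vertices of the two sub-hulls with a monotone chain scan instead of computing the common tangents in constant time. The function names are hypothetical, and the input points are assumed to be distinct.

```python
def cross(o, a, b):
    """Twice the signed area of triangle (o, a, b); > 0 for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def monotone_chain(pts):
    """Convex hull (counterclockwise) of points already sorted by (x, y)."""
    if len(pts) <= 2:
        return list(pts)
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convex_hull_dc(pts):
    """Divide-and-conquer hull of distinct points presorted by (x, y)."""
    if len(pts) <= 3:
        return monotone_chain(pts)       # construct CH(S) directly
    mid = len(pts) // 2
    left = convex_hull_dc(pts[:mid])     # CH(S_L); independent of CH(S_R),
    right = convex_hull_dc(pts[mid:])    # so the two could run in parallel
    # Stand-in for the tangent computation: rescan the sub-hull vertices.
    return monotone_chain(sorted(left + right))
```

Only hull vertices of the two halves take part in the combine step, which mirrors the fact that CH(S) is determined by CH(S_L) and CH(S_R).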


13.7.2 RANDOMIZED ALGORITHMS 

Randomization is a powerful technique that has been used in algorithms for many 
geometric problems. 

Definition: 

A randomized algorithm is an algorithm that makes random choices during its exe- 
cution. 

Facts: 

1. Randomized algorithms are often faster, simpler, and easier to generalize to higher 
dimensions than deterministic algorithms. 

2. For many problems, efficient algorithms can be obtained by processing the input
objects in a particular order or by grouping them into equal-sized subsets. Although
significant computation may be required to exactly determine an appropriate order or a
good partition into subsets, in many cases simple random choices can be used instead.

3. Randomized incremental methods: One of the simplest ways to construct a
geometric structure is incrementally. In an incremental construction algorithm, the input
objects are inserted one at a time and the current structure is updated after each
addition. The desired structure is obtained after all input objects have been inserted.
Although the cost of some updates could be very large, in many cases it can be shown
that if the objects are inserted in random order, then the amount of work for each
update is expected to be small and the expected running time of the algorithm will
be small as well. Thus, the expectation of the running time depends not on the input
distribution but on the ratio of good to bad insertion sequences.

4. The power of randomization in incremental algorithms was first noted by Clarkson
and Shor, and by Mulmuley. Randomized incremental algorithms have been proposed
for constructing many geometric structures including convex hulls, Delaunay
triangulations, trapezoidal decompositions, and Voronoi diagrams [Mu94].

5. Randomized divide-and-conquer: In randomized divide-and-conquer a random
subset of the input is used to partition the original problem into subproblems. Then, as in
any divide-and-conquer algorithm, the subproblems are solved, perhaps recursively, and
their solutions are combined to obtain the solution to the original problem. Ideally, the
partition produces subproblems of nearly equal size and the sum of the subproblem sizes
roughly equals the input size. This ideal can almost be achieved for many geometric
problems. For example, with probability at least 1/2, O(r) subproblems of size O((n log r)/r)
can often be obtained, where n is the number of input objects and r is the size of the
random subset.

6. The fact that a random sample of the input items can often be used to produce sub- 
problems of almost equal size was first shown by Clarkson, and by Haussler and Welzl. 
Randomized divide-and-conquer algorithms are known for many geometric problems in- 
cluding answering closest-point queries, constructing arrangements, triangulating point 
sets, and constructing convex hulls [Mu94]. 
Algorithm 2: Intersect(S: a set of n halfplanes).

R := a random sample of size r chosen from S
compute the intersection I(R) and triangulate it by connecting each vertex of
  I(R) to the origin to obtain T(R)
for each triangle t ∈ T(R), determine its conflict list c(t) (the set of halfplanes
  in S whose bounding lines intersect t)
for each triangle t ∈ T(R), compute the intersection of the halfplanes in c(t)
  restricted to t {This may be done recursively}


Examples: 

1. Incrementally constructing the intersection of a set H of n halfplanes in the plane:
First, a random permutation (h_1, h_2, ..., h_n) of H is formed. Let I_i denote the
intersection of halfplanes h_1 through h_i. During the ith iteration of the algorithm, I_i is
computed from I_(i-1) by removing the portion of I_(i-1) that is not contained in h_i.

To make this update easier, a vertex of the current intersection I_(i-1) that is not
contained in h_j is maintained, for all j ≥ i; such a vertex is said to conflict with h_j. Given
a conflicting vertex v for h_i, the portion of I_(i-1) that must be removed is determined
by traversing its boundary in both directions from v until reaching vertices that are
contained in h_i. Since I_n has size O(n), the amortized cost of each update is O(1)
and the algorithm spends a total of O(n) time updating the intersection. After I_i is
computed, the conflicting vertex for h_j is updated, for all j > i. It can be shown that
the expected total cost of maintaining the conflicting vertices in the algorithm is O(n log n).

2. Using randomized divide-and-conquer to construct the intersection I(S) of a set S 
of n halfplanes, each of which contains the origin: See Algorithm 2. In Step 2, the 
intersection of the halfplanes in the sample is used to create O(r) triangles, each of
which corresponds to a subproblem. For technical reasons, the region corresponding to 
each subproblem should have constant descriptive complexity. 

Often, this condition is achieved by triangulating the resulting structure. Next, 
the input is distributed to the subproblems. Usually, this is done by finding the objects 
that intersect, or conflict with, a subproblem’s region. 

Then, the subproblems are solved, recursively if desired, and their solutions are 
combined to form the solution to the original problem. In the intersection algorithm 
the final solution is obtained by “gluing” the subproblem solutions together. 

If r is some suitably chosen constant, then Algorithm 2 runs in expected time
O(n log n).


13.7.3 PARAMETRIC SEARCH 
Definition: 

Parametric search is an algorithmic technique for solving optimization problems by 
using an algorithm for solving an associated decision problem as a subroutine. 
Algorithm:

1. Parametric search is a powerful algorithmic technique that can be applied to a
diverse range of optimization problems. It is best explained in terms of an abstract
optimization problem. Consider two versions of this problem:

• a search problem P, which can be parametrized by some real, nonnegative value t
  and whose solution is a value t* that satisfies the optimization criteria;

• a decision problem D(t), whose input includes a real, nonnegative value t and
  whose solution is either YES or NO.

In parametric search, the search problem P is solved using an algorithm A_s for the
decision problem D(t). In order to apply parametric search, the points corresponding
to YES answers for D(t) must form a connected, possibly unbounded, interval in [0, ∞).
Then, assuming that D(0) = YES, the search problem P is to find the largest value of t
for which D(t) = YES.

The basic idea of parametric search is to use A_s to find t*. This is done by simulating
A_s, but using a variable (parameter) instead of a value for t. Assume that A_s can
detect if t = t*, and that the computation in A_s consists of comparisons, each of which
tests the sign of a bounded degree polynomial in t and in the original input. During
each comparison in the simulation:

• the roots of the appropriate polynomial are computed,

• A_s is run on each root,

• t* is located between two consecutive roots.

Thus, each comparison in the simulation reduces the interval known to contain t*, which
is originally [0, ∞).


Facts: 

1. In most geometric applications it can be shown that at some point the simulation 
will test a root that is equal to t*. 

2. If T_s denotes the worst-case running time of A_s, the total cost of the parametric
search is O(T_s^2) since there are O(T_s) operations in the simulation of A_s, and each
comparison operation in this simulation takes O(T_s) time.

3. With a parallel algorithm A_p to solve the decision problem D(t), the (sequential)
parametric search can be done faster. Suppose that A_p runs in T_p parallel time steps
using p processors. Then, A_p performs at most p independent comparison operations
during each parallel time step. The parametric search simulates A_p sequentially as
follows. To simulate a parallel time step of A_p:

• compute the O(p) roots of the at most p bounded degree polynomials
  corresponding to independent comparisons in the parallel time step;

• sort these roots;

• use a binary search to locate t* between two consecutive roots. (The sequential
  algorithm A_s is used to evaluate the roots during the binary search.)

The cost of simulating each parallel time step is O(p log p + T_s log p), and the total cost
of the parametric search is O(T_p (p log p + T_s log p)).

4. Parametric search was originally proposed by Megiddo. It has been used for many
geometric problems including ray shooting, slope selection, and computing the diameter
of a three-dimensional point set [Mu94].
Example:

1. Answering ray shooting queries in a convex polytope Q: In the search problem P,
a ray r is given originating at a point p contained in Q, and the face of Q hit by r needs
to be found. The decision problem D(t) is the polytope membership problem: given a
point t, determine if t lies in Q. The value t* sought by the parametric search is the
point where r intersects the boundary of Q. The ray shooting problem can be solved in
O(log^2 n) time by parametric search [Mu94] (since, given an appropriate data structure, the polytope
membership problem can be solved sequentially in O(log n) time).
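
The role of the decision problem in this example can be illustrated with a small Python sketch. It is a numerical stand-in, not parametric search itself: instead of simulating the decision algorithm on a symbolic parameter, it bisects on the ray parameter and calls a membership oracle at each step, so it only approximates t*. The convex region is assumed to be given as a list of halfplanes (a, b, c) meaning a*x + b*y <= c, and the names inside, ray_shoot_by_bisection, t_max, and iters are hypothetical.

```python
def inside(halfplanes, pt):
    """Decision problem: does pt lie in the region a*x + b*y <= c for all rows?"""
    return all(a * pt[0] + b * pt[1] <= c for a, b, c in halfplanes)


def ray_shoot_by_bisection(halfplanes, p, v, t_max=1e6, iters=60):
    """Approximate the largest t >= 0 such that p + t*v is still inside.

    Relies on the YES answers forming an interval [0, t*], which holds
    because the region is convex and p lies inside it.
    """
    def point(t):
        return (p[0] + t * v[0], p[1] + t * v[1])

    if not inside(halfplanes, p):
        raise ValueError("the ray must start inside the region")
    if inside(halfplanes, point(t_max)):
        return None                      # no exit found within t_max
    lo, hi = 0.0, t_max                  # invariant: D(lo) = YES, D(hi) = NO
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if inside(halfplanes, point(mid)):
            lo = mid
        else:
            hi = mid
    return lo
```

Each iteration costs one oracle call, so the sketch performs a fixed number of membership tests; the exact method replaces this numerical loop by comparisons against the roots that arise inside the decision algorithm.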


13.7.4 FINITE PRECISION 

Geometric algorithms are usually designed assuming the real random access machine 
model of computation. In this model, values can be arbitrarily long real numbers, and 
all standard operations such as +, -, ×, and ÷ can be performed in unit time regardless
of operand length. In reality, however, computers have finite precision and can only 
approximate real numbers. Several techniques have been suggested for dealing with 
this problem [Fo93]. 

Facts: 

1. Algorithms directly designed for a discrete domain: This approach can dramatically 
increase the complexity of the algorithm and has not gained wide use. 

2. Floating point numbers: Floating point numbers provide a convenient and effi- 
cient way to approximate real numbers. Unfortunately, naively rounding numbers in 
geometric algorithms can create serious problems such as topological inversions. There 
are cases when floating point arithmetic can safely be used. For example, for certain 
inputs the result of the floating point operation will be unambiguous. Some algorithms 
have been shown to be sufficiently stable using floating point arithmetic. However, no 
general method is known for designing stable algorithms. 

3. Exact arithmetic: In exact arithmetic, numbers are represented by vectors of inte- 
gers and all primitive operations are guaranteed to give correct answers. Integer arith- 
metic is sufficient for many geometric algorithms since symbolic or algebraic numbers 
are rarely needed, and in many cases homogeneous coordinates can remove the need for 
rational numbers. However, since exact representations can have large bit complexity, 
exact arithmetic can be expensive — typically increasing the cost of arithmetic oper- 
ations by an order of magnitude. This cost can be decreased somewhat by optimizing 
the expressions and computations involving exact arithmetic. 

4. A combination of floating point and exact arithmetic: First, the operation is per- 
formed using floating point arithmetic. Then, if the result is ambiguous, an exact 
computation is performed. 

5. Adaptive-precision arithmetic: In adaptive-precision arithmetic each number is 
approximated by an interval whose endpoints require lower precision. If the exact 
computation using the approximation is ambiguous, the method iterates using smaller 
and smaller intervals with higher precision endpoints. 

6. Although no clear consensus has been reached, a combination of the above strategies 
may yield the best results. 
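
The combination described in Fact 4 can be sketched for the planar orientation test. This is an illustrative Python sketch under stated assumptions, not a production predicate: the function name orientation and the constant ERR_COEFF are hypothetical, the error coefficient is a deliberately loose bound on the rounding error of the floating point evaluation, and the analysis assumes that no overflow or underflow occurs. When the floating point result is too close to zero to be trusted, the determinant is recomputed exactly with rational arithmetic.

```python
from fractions import Fraction

# Assumed loose bound: 8 units of roundoff (2**-53) times the magnitude
# of the two products; chosen conservatively for this sketch.
ERR_COEFF = 8.0 * 2.0 ** -53


def orientation(a, b, c):
    """Sign of (b - a) x (c - a): +1 left turn, -1 right turn, 0 collinear."""
    left = (b[0] - a[0]) * (c[1] - a[1])
    right = (b[1] - a[1]) * (c[0] - a[0])
    det = left - right
    bound = ERR_COEFF * (abs(left) + abs(right))
    if abs(det) > bound:
        # Floating point result is unambiguous.
        return 1 if det > 0 else -1
    # Ambiguous result: repeat the computation exactly.
    ax, ay, bx, by, cx, cy = (Fraction(v) for v in (*a, *b, *c))
    exact = (bx - ax) * (cy - ay) - (by - ay) * (cx - ax)
    return (exact > 0) - (exact < 0)
```

For well-separated inputs only the floating point path runs; the exact path is taken for nearly collinear triples, which is where naive rounding tends to cause inconsistencies.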
13.7.5 DEGENERACY AVOIDANCE


To simplify exposition by reducing the number of cases that must be considered, many 
geometric algorithms assume that the input is in general position. The general position 
assumption depends on the problem. For example, in problems involving planar point 
sets, the assumption might be that no two points have the same x-coordinate, that no
three points lie on the same line, or that no four points lie on the same circle. 

Definitions: 

A set of objects is in general position if the objects satisfy certain specified conditions. 
A set of input objects that violates the general position assumption is degenerate. 

Facts: 

1. Perturbation: Several schemes have been proposed that apply small perturbations
to transform the input so that it does not contain degeneracies [Fo93]. The object of 
perturbation schemes is to allow the design of simpler algorithms which may validly 
assume the input is in general position. 

Algorithms using perturbation schemes may not always produce correct output. 
For example, in a convex hull algorithm, points on the boundary could potentially be 
perturbed into the interior of the polytope and vice versa. Also, perturbation schemes 
can affect, perhaps adversely, output-sensitive algorithms whose running times depend 
on the size of the output they produce. 

2. Deal with degeneracies directly: Algorithms that deal with degeneracies in a
problem-specific manner have been designed for problems such as triangulating point sets.
Although this approach is not as general and usually leads to more complex algorithms
than those employing perturbation schemes, it can provide superior performance.

3. Symbolically perturbing the coordinates of each input point (Edelsbrunner and
Mücke): This is done by adding to each coordinate a suitable power of a small,
positive real number represented by a symbolic parameter ε. Then, since values are now
polynomials in ε, the arithmetic operations in the algorithm are replaced by polynomial
arithmetic operations. Geometric primitives are implemented symbolically, typically by
evaluating a sequence of determinants. Assuming that the dimension of the problem is
fixed, this scheme increases the running time of the algorithm by at most a constant
factor. However, the overhead incurred can in fact be quite large.


13.8 APPLICATIONS OF GEOMETRY 


Geometry overlays the entire computing spectrum, having applications in almost every 
area of science and engineering, including astrophysics, molecular biology, mechanical 
design, fluid mechanics, computer graphics, computer vision, geographic information 
systems, robotics, multimedia, and mechanical engineering. 
Some excellent general references to geometry applications include a report on
application challenges to computational geometry at the website

http://www.cs.princeton.edu/~chazelle/taskforce/CGreport.ps

a status report on theoretical computer science, including applications of computational
geometry, at the website

http://hercule.csci.unt.edu/sigact/longrange/contributions.html

Eppstein’s list of general geometric references at the website

http://www.ics.uci.edu/~eppstein/geom.html

and Amenta’s directory of software at the website

http://www.geom.umn.edu/software/cglist/

including planar algorithms, arbitrary dimensional convex hull, Voronoi diagram,
Delaunay triangulation, polygon decomposition, point location, intersection, linear
programming, smallest enclosing ball and center point, visualization, mesh generation, shape
reconstruction and collision in robotics.


13.8.1 MATHEMATICAL PROGRAMMING 
Definition: 

Mathematical programming is the large-scale optimization of an objective function 
(such as cost) of many variables subject to constraints, such as supplies and capacities. 

Facts: 

1. Mathematical programming includes both linear programming (continuous, integer, 
and network) and nonlinear programming (quadratic, convex, general continuous, and 
general integer). 

2. Applications include transportation planning and transshipment, factory production 
scheduling, and even determining a least-cost but adequate diet. 

3. A modeling language, such as AMPL, is often used in applications [FoGaKe93].

Example: 

1. Linear programming in low dimensions: This is a special case of linear
programming since there are algorithms whose time is linear in the number of constraints but
exponential in the number of dimensions.

Application: 

1. Find the smallest enclosing ball of a set of points or a set of balls in arbitrary
dimension [We91]. This uses a randomized incremental algorithm employing the
move-to-front heuristic.


13.8.2 POLYHEDRAL COMBINATORICS 

Polyhedra have been important to geometry since the classification of regular polyhedra 
in classical times. 
Fact:


1. The following are some of the many possible operations that can be performed on 
polygons and polyhedra: 

• Boolean operations, such as intersection, union, and difference; 

• point location of new query points in a preprocessed set of polygons, which may 
partition a larger region; 

• range search of a preprocessed set of points to find those inside a new query 
polygon; 

• decomposition (or triangulation) of a polygon into triangles or a polyhedron into
tetrahedra (§13.4.2). A simple n-gon is always decomposable into n - 2 triangles,
in linear time, and all triangulations have exactly n - 2 triangles. However, in R^3,
some polyhedra cannot be partitioned into tetrahedra without additional Steiner
points. Also, different triangulations of the same polyhedron may have different
numbers of tetrahedra.


Applications: 

1. Aperiodic tilings and quasicrystals: Tilings (in the plane) and crystallography (in 
three-dimensional space) are classic applications of polygons and polyhedra. A recent 
development is the study of aperiodic (Penrose) tilings [Ga77] and quasicrystals [Ap94]. 
These are locally but not globally symmetric under 5-fold rotations, quasi-periodic with 
respect to translations, and self-similar. See the website 

http://www.geom.umn.edu/apps/quasitiler/

Large-scale symmetries, such as 5-fold, that are impossible in traditional
crystallography, can be visible with X-ray diffraction. Tilings can be constructed by projecting
simple objects from, say, R^5. One application is the surface reinforcement of soft metals.

2. Error-correcting codes: Some error-correcting codes can be visualized with
polytopes as follows. Assume that the goal is k-bit symbols, where any error of up to b bits
can be detected. The possible symbols are some of the 2^k vertices of the hypercube in
k-dimensional space. The set of symbols must contain no two symbols less than b + 1
distance apart, where the metric is the number of different bits. If errors of up to c bits
are to be correctable, then no two symbols can be closer than 2c + 1.

Similarly, in quantum computing, a quantum error-correcting code can be designed
using Clifford groups and binary orthogonal geometry [CaEtal95].
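
The distance conditions for classical codes stated above can be checked directly. The following Python sketch is only an illustration: symbols are written as equal-length bit strings, and the function names hamming, min_distance, and capabilities are hypothetical. A code with minimum distance d detects up to d - 1 bit errors and corrects up to ⌊(d - 1)/2⌋ of them, which restates the b + 1 and 2c + 1 conditions.

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions in which the two symbols differ."""
    return sum(a != b for a, b in zip(u, v))


def min_distance(code):
    """Smallest Hamming distance between two distinct symbols of the code."""
    return min(hamming(u, v) for u, v in combinations(code, 2))


def capabilities(code):
    """Return (b, c): detectable and correctable numbers of bit errors."""
    d = min_distance(code)
    return d - 1, (d - 1) // 2
```

For instance, the two opposite hypercube vertices "000" and "111" are at distance 3, so that code detects 2-bit errors and corrects 1-bit errors.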


13.8.3 COMPUTATIONAL CONVEXITY 
Definitions: 

An H-polytope is a polytope defined as the intersection of m half-spaces in R^n.

A V-polytope is a polytope defined as the convex hull of m points in R^n.

A zonotope is the vector (Minkowski) sum of a finite number of line segments.

Computational convexity is the study of high-dimensional convex bodies.

An oracle is an algorithm that gives information about a convex body.
Facts:

1. Computational convexity is related to linear programming, polyhedral combinatorics,
and the algorithmic theory of polytopes and convex bodies.

2. In contrast to computational geometry, computational convexity considers convex
structures in normed vector spaces of finite but not restricted dimension. If the body
under consideration is more complex than a polytope or zonotope, it may be represented 
as an oracle. Here the body is a black-box, and all information about it, such as 
membership, is supplied by calls to the oracle function. Typical algorithms involve 
volume computation, either deterministically, or by Monte-Carlo methods, perhaps after 
decomposition into simpler bodies, such as simplices. 

3. When the dimension n is fixed, the volume of V-polytopes and H-polytopes can be
computed in polynomial time.

4. There does not exist a polynomial-space algorithm for the exact computation of the
volume of H-polytopes (where n is part of the input).

5. Additional information on computational convexity can be found at the following 
website: 

http://dimacs.rutgers.edu/techps/1994/94-31.ps
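
The oracle model of Fact 2 can be sketched in a few lines. In the Python sketch below, the body is reachable only through a membership oracle, and its volume is estimated by the simplest Monte-Carlo method: uniform sampling in a bounding box. All names, the bounding box, and the sample body are assumptions made for illustration. This naive estimator degrades quickly as the dimension grows, so it should not be read as a practical high-dimensional method.

```python
import random

def monte_carlo_volume(oracle, lower, upper, samples=100_000, seed=0):
    """Estimate the volume of a body given only a membership oracle.

    oracle(point) -> bool is treated as a black box; lower/upper give an
    axis-aligned bounding box that is assumed to contain the body.
    """
    rng = random.Random(seed)
    box_volume = 1.0
    for lo, hi in zip(lower, upper):
        box_volume *= hi - lo
    hits = 0
    for _ in range(samples):
        point = [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        if oracle(point):
            hits += 1
    return box_volume * hits / samples

def h_polytope_oracle(halfspaces):
    """Membership oracle for an H-polytope given as (a, b) pairs meaning a.x <= b."""
    def contains(x):
        return all(sum(ai * xi for ai, xi in zip(a, x)) <= b for a, b in halfspaces)
    return contains

# Hypothetical body: the simplex x, y, z >= 0, x + y + z <= 1 (exact volume 1/6).
simplex = h_polytope_oracle([
    ((-1, 0, 0), 0), ((0, -1, 0), 0), ((0, 0, -1), 0), ((1, 1, 1), 1),
])
estimate = monte_carlo_volume(simplex, (0, 0, 0), (1, 1, 1))
print(estimate)
```

The estimator never inspects the half-spaces themselves; replacing `simplex` by any other membership function leaves `monte_carlo_volume` unchanged, which is the point of the black-box formulation.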


13.8.4 MOTION PLANNING IN ROBOTICS 

In Computer Assisted Manufacturing, both the tools and the parts being assembled 
must often be moved around each other in a cluttered environment. Their motion 
should be planned to avoid collisions, and then to minimize cost. 

Definitions: 

A Davenport-Schinzel sequence of order s over an alphabet of size n, or DS(n,s), 
is a sequence of characters such that: 

• no two consecutive characters are the same; 

• for any pair of characters, a and b, there is no alternating subsequence of length
s + 2 of the form ... a ... b ... a ... b ....
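
The two conditions of the definition can be checked mechanically. The following Python sketch (the function names are hypothetical) tests whether a given string is a Davenport-Schinzel sequence of order s by computing, for every ordered pair of characters, the longest alternating subsequence that starts with the first of them.

```python
from itertools import permutations

def longest_alternation(seq, a, b):
    """Length of the longest subsequence a, b, a, b, ... of seq that starts with a."""
    length, want, other = 0, a, b
    for ch in seq:
        if ch == want:
            length += 1
            want, other = other, want
    return length

def is_davenport_schinzel(seq, s):
    """True if seq is a Davenport-Schinzel sequence of order s."""
    if any(x == y for x, y in zip(seq, seq[1:])):
        return False  # two consecutive characters are the same
    # Ordered pairs cover alternations starting with either character.
    return all(longest_alternation(seq, a, b) < s + 2
               for a, b in permutations(set(seq), 2))

print(is_davenport_schinzel("abaca", 2))  # no alternation of length 4
print(is_davenport_schinzel("abab", 2))   # contains a b a b
```

Taking the first available occurrence of the wanted character at each step is enough here, because choosing an earlier occurrence never shortens the alternation that can follow it.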


Facts: 

1. Practical general motion, or path planning, is solvable with Davenport-Schinzel
sequences [Sh95]. Upper bounds on λ_s(n), the length of the longest DS(n,s), determine
upper bounds on the complexity of the lower envelopes of certain functions.

For example, given n points in the plane that are moving with positions that are
polynomials of degree s in time, the number of times that the closest pair of points can
change is at most λ_2s(C(n,2)).

2. Visibility graphs (§13.5.5) are useful in finding a shortest path between two points in
the plane, in the presence of obstacles.

3. The problem of moving a finite object in the presence of obstacles may also be
mapped into a configuration space (or C-space) problem of moving a corresponding
point in a higher dimension. If translational and rotational motion in 2-dimensional
(respectively 3-dimensional) space is allowed, then the C-space is 3-dimensional
(respectively 6-dimensional).

4. Articulated objects, multiple simultaneous motion, and robot hands also increase 
the number of degrees of freedom. 
5. Current problems:

• representation of objects, since, although planar, faceted models are simpler, the
objects should be algebraic surfaces, and, even if they are planar, in C-space
their corresponding versions will be curved;

• grasping, or placing a minimal number of fingers to constrain the object’s motion; 

• sequence planning of the assembly of a collection of parts; 

• autonomous navigation of robots in unstructured environments. 


13.8.5 CONVEX HULL APPLICATIONS 
Facts: 

1. The convex hull (§13.5.1) is related to the Voronoi diagram (§13.5.3) since a convex
hull problem in R^k is trivially reducible to a Voronoi diagram problem in R^k, and a
Voronoi diagram problem in R^k is reducible to a convex hull problem in R^(k+1).

2. The definition of convex hull is not constructive, in that it does not lead to a method 
for finding the convex hull. Nevertheless, there are many constructive algorithms and 
implementations. One common implementation is QuickHull (§13.5.1), a general di- 
mension code for computing convex hulls, Delaunay triangulations, Voronoi vertices, 
furthest-site Voronoi vertices, and halfspace intersections. [BaDoHu95] 
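
To illustrate what a constructive algorithm looks like, the following Python sketch computes a planar convex hull. It is not QuickHull and not the code cited above: it uses the monotone chain method, which is shorter to state, and it handles only 2-dimensional input. The function names and the sample points are assumptions.

```python
def cross(o, a, b):
    """Twice the signed area of triangle o, a, b (positive if counterclockwise)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Vertices of the convex hull of 2-D points, in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]

square_with_extras = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
print(convex_hull(square_with_extras))
```

The interior point (1, 1) and the boundary point (1, 0), which lies on an edge, are both discarded; only the four corners remain.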

Applications: 

See http://www.geom.umn.edu/~bradb/qhull-news.html

1. Mathematics:

• determining the principal components of spectral data; 

• studying circuits of matroids that form a Hilbert base; 

• studying the neighbors of the origin in the R^8 lattice.

2. Biology and medicine: 

• classifying molecules by their biological activity; 

• determining the shapes of left ventricles for electrical analysis of the heart. 

3. Engineering: 

• computing support structures for objects in layered manufacturing in rapid
prototyping [StBrEa95]. By supporting overhanging material, these prevent the
object from toppling while partially built.

• designing nonlinear controllers for controlling vibration; 

• finding invariant sets for delta-sigma modulators; 

• classifying handwritten digits; 

• analyzing the training sets for a multilayer perceptron model; 

• determining the operating characteristics of process equipment; 

• navigating robots; 

• creating 6-dimensional wrench spaces to measure the stability of robot grasps; 

• building micromagnetic models with irregular grain structures; 

• building geographical information systems; 

• simulating a spatial database system to evaluate spatial tessellations for indexing;

• producing virtual reality systems;

• performing discrete simulations of incompressible viscous fluids using vortex
methods;

• modeling subduction zones of tectonic plates and studying fluid flow and crystal
deformation;

• computing 3-dimensional unstructured meshes for computational fluid dynamics. 
13.8.6 NEAREST NEIGHBOR


Variants of the problem of finding the nearest pair of a set of points have applications 
in fields from handwriting recognition to astrophysics. 

Applications: 

1. Fixed search set, varying query point: A fixed set, V, of n points in R^d is preprocessed
so that the closest point p ∈ V can be found for each query point, q. The search
time per query can range from log n (if d = 2) to n (for large d). The Voronoi diagram
is commonly used in low dimension. However, because of the Voronoi diagram’s complexity
in higher dimension, hierarchical search structures, bucketing, and probabilistic
methods, perhaps returning approximate answers, are common.

2. Moving points: The points in V may be moving, and the close pairs of points over
time are of interest.
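
Bucketing, mentioned in Application 1, can be sketched as follows for points in the plane. The class name, the cell size, and the sample points are assumptions; the sketch preprocesses the fixed set into square cells and answers a query by searching rings of cells outward from the query point until no unsearched cell can hold a closer point.

```python
import math
from collections import defaultdict

class GridIndex:
    """Bucket 2-D points into square cells for nearest-neighbor queries."""

    def __init__(self, points, cell_size):
        if not points:
            raise ValueError("need at least one point")
        self.cell = cell_size
        self.buckets = defaultdict(list)
        for p in points:
            self.buckets[self._key(p)].append(p)

    def _key(self, p):
        return (math.floor(p[0] / self.cell), math.floor(p[1] / self.cell))

    def nearest(self, q):
        """Exact nearest neighbor of q, searching rings of cells outward."""
        cx, cy = self._key(q)
        best, best_d = None, math.inf
        ring = 0
        while True:
            for ix in range(cx - ring, cx + ring + 1):
                for iy in range(cy - ring, cy + ring + 1):
                    if max(abs(ix - cx), abs(iy - cy)) != ring:
                        continue  # only the cells on the boundary of this ring
                    for p in self.buckets.get((ix, iy), ()):
                        d = math.dist(p, q)
                        if d < best_d:
                            best, best_d = p, d
            # Any point outside the rings searched so far is more than
            # ring * cell away from q, so it cannot beat the current best.
            if best is not None and best_d <= ring * self.cell:
                return best
            ring += 1

points = [(0.1, 0.1), (0.9, 0.2), (0.5, 0.5), (0.2, 0.8)]
index = GridIndex(points, cell_size=0.25)
print(index.nearest((0.6, 0.4)))
```

The cell size trades preprocessing granularity against query cost: cells that are too large degenerate to a linear scan, and cells that are too small force many empty rings to be visited.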

Examples: 

1. Examples of fixed search sets with varying query point:

• Character recognition in document processing: Each representative character is
defined by a vector of features. Each new, unknown character must be mapped
to the closest representative character in feature space.

• Color map optimization in computer graphics: Many frame buffers allow only
the 256 colors in the current color map to be displayed simultaneously, from a
palette of 2^24 possible colors. Thus, each color in a new image must be mapped
to the closest color in the color map. A related problem is that of
determining what colors to use in the color map.

• Clustering algorithms for speech and image compression in multimedia systems:
As in the color map problem, a large number of points must be quantized down
to a smaller set.

2. Examples of moving points:

• Simulation of star motion in astrophysics: Calculating the gravitational attraction
between every pair of stars is too costly, so only close pairs are individually
calculated. Otherwise the stars are grouped, and the attraction between close
groups is calculated. The groups may themselves be grouped hierarchically.

• Molecular modeling: In molecular modeling, close pairs of atoms will be subject
to van der Waals forces.

• Air traffic control: Air traffic controllers wish to know about pairs of aircraft
closer than a minimum safe distance. Here the metric is nonuniform; small
vertical separations are more tolerable than small horizontal separations.

• During path planning in robotics and numerically controlled machining, unintended
close pairs of objects must also be avoided.


13.8.7 COMPUTER GRAPHICS 

Computer graphics may be divided into modeling of surfaces, and simulation of the 
models. The latter includes rendering a scene and its light sources to generate synthetic 
imagery with respect to some viewpoint. Rendering involves visibility, or determining 
which parts of the surfaces are visible, and shading them according to some lighting 
model. 
Definitions:

Anti-aliasing refers to filtering out high-frequency spatial components of a signal, to 
prevent artifacts, or aliases, from appearing in the output image. In graphics, a high 
frequency may be an object whose image is smaller than one pixel or a sharp edge of 
an object. 

A GUI (graphical user interface) is a mechanism that allows a user to interactively
control a computer program with a bitmapped display by using a mouse or pointer 
to select menu items, move sliders or valuators, and so on. The keyboard is only 
occasionally used. A GUI contrasts with typing the program name followed by options 
on a command line, or by preparing a text file of commands for the program. A GUI is 
easier and more intuitive to use, but can slow down an expert user. 

Examples: 

1. Visibility: Visibility algorithms may be object-space, where the visible parts of each
object are determined, or image-space, where the color of each pixel in the frame buffer
is determined. The latter is often simpler, but the output has less meaning, since it is
not referred back to the original objects. Techniques include ray tracing and radiosity.

• Ray tracing: Ray tracing extends a line from the viewpoint through each pixel of the
frame buffer until the first intersecting object. If that surface is a mirror, then
the line is reflected from the surface and continues in a different direction until
it hits another object (or leaves the scene). If the object is glass, then both a
reflecting and a refracting line are continued, with their colors to be combined
according to Fresnel’s law.

One geometry problem here is that of sampling for subpixel averaging. The
goal is to color a square pixel of a frame buffer according to the fraction of its
area occupied by each visible object. Given an edge of a face diagonally crossing a pixel,
the fraction of the pixel covered by that face must be obtained for anti-aliasing.
If the edges of two faces intersect in this pixel, each face cannot be handled
independently, for example with an anti-aliased Bresenham algorithm. If this
is done badly, then it is very obvious in the final image as a possible fringe of
a different color around the border of the object [Mi96].

The solution is to pick a small set of points in the pixel (typically 9, 16, 
or 64 points), determine which visible object projects to each point, and combine 
those colors. The problem is then to select a set of sampling points in the pixel, 
such that given a subset region, the number of points in it approximates its 
area. Four possible methods, from worst to best, are: 

o pick the points independently and uniformly at random;

o use a nonrandom uniform distribution;

o start with the above distribution, then jitter the points, or perturb
each one slightly;

o use simulated annealing to improve the point distribution.

• Radiosity: Radiosity partitions the scene into facets, computes a form factor of
how much light from each facet will impinge on each other, and solves a system
of linear equations to determine each facet’s brightness. This models diffuse
lighting particularly well.

• Windowing systems: Another visibility problem is designing the appropriate
data structure for representing the windows in a GUI, so that the window that
is in front at any particular pixel location can be determined, in order to receive
the input focus.

• Radio wave propagation: The transmission of radio waves, as from cellular
telephones, which are reflected and absorbed by building contents, is another
application of visibility. [Fo96]

2. Computer vision: Applications of geometry to vision include model-based recognition
(or pattern matching), and reconstruction or recovery of 3-D structure from 2-D
images, such as stereopsis, and structure from motion. See the website

http://www.cs.princeton.edu/~chazelle/taskforce/CGreport.ps

In recognition, a model of an object is transformed into a sensor-based coordinate 
system and the transformation must be recovered. In reconstruction, the object must 
be determined from multiple projections. 

3. Medical image shape reconstruction: Various medical imaging methods, such as
computed tomography, produce data in the form of successive parallel slices through
the body. The basic step in reconstructing the 3-dimensional object from these slices in 
order to view it involves joining the corresponding vertices and edges of two polygons in 
parallel planes by triangles to form a simple polyhedron. However, there exists a pair 
of polygons that cannot be so joined. [GiO’RSu96] 
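
The first three subpixel sampling schemes listed under ray tracing in Example 1 can be compared on a single pixel. In the Python sketch below, the pixel is the unit square, and the face, the 4 × 4 sample count, and all function names are assumptions chosen for illustration; simulated annealing is not shown. Each scheme estimates the fraction of the pixel covered by a face bounded by one edge.

```python
import random

def uniform_grid(n):
    """n x n nonrandom samples at the centers of the subpixel cells of the unit pixel."""
    return [((i + 0.5) / n, (j + 0.5) / n) for i in range(n) for j in range(n)]

def jittered_grid(n, rng):
    """One sample placed uniformly at random inside each of the n x n subpixel cells."""
    return [((i + rng.random()) / n, (j + rng.random()) / n)
            for i in range(n) for j in range(n)]

def random_points(count, rng):
    """Samples picked independently and uniformly at random in the unit pixel."""
    return [(rng.random(), rng.random()) for _ in range(count)]

def coverage(samples, inside):
    """Fraction of the samples that fall in the region described by inside(x, y)."""
    return sum(1 for x, y in samples if inside(x, y)) / len(samples)

# Hypothetical face: the part of the pixel below the edge y = 0.3 + 0.5 x.
# Its exact area is the integral of 0.3 + 0.5 x over [0, 1], that is 0.55.
def below_edge(x, y):
    return y < 0.3 + 0.5 * x

rng = random.Random(42)
for name, pts in [("random", random_points(16, rng)),
                  ("grid", uniform_grid(4)),
                  ("jittered", jittered_grid(4, rng))]:
    print(name, coverage(pts, below_edge))
```

A single pixel shows only one estimate per scheme. The practical differences appear across many pixels, where the regular grid can produce structured error along an edge while jittering tends to turn that error into less objectionable noise.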


13.8.8 MECHANICAL ENGINEERING DESIGN AND MANUFACTURING 

Geometry is very applicable in CAD/CAM, such as in the design and manufacture of 
automobile bodies and parts, aircraft fuselages and parts such as turbine blades, and 
ship hulls and propellers. 

Examples: 

1. Representations: How should mechanical parts be represented? One problem is 
that geometric descriptions are verbose compared to 2-dimensional descriptions, such 
as draftings, since those assume certain things that the users will fill in as needed, 
but which must be explicit in the geometric description. Possible methods include the 
following: 

• constructive solid geometry: Primitive objects, such as cylinders and blocks,
are combined with the regularized Boolean operators union, intersection, and
difference.

• faceted boundary representation: The object is a polyhedron with a boundary
of planar faces.

• exact boundary representation: The object is defined by boundary “faces”, but
now each face can be curved, such as a NURBS (Non-Uniform Rational B-Spline),
or an implicit piecewise quadric, Dupin cyclide (a quartic surface that
is good for blending two quadric surfaces), or supercyclide.

The possible methods can be evaluated with the following criteria: 

• robustness against numerical errors; 

• elegance; 

• accuracy in representing complex, curved shapes, especially blends between the
two surfaces at the intersection of two components;

• ease of explicitly obtaining geometry such as the boundary. 
2. Mesh generation: A mesh is the partition of a polyhedron into, typically, tetrahedra
or hexahedra to facilitate finite element modeling. A good mesher conforms to
constraints, can change scale over a short distance, has no unnecessary long thin elements,
and has fewer elements when possible. In some applications, periodic remeshing
is required.

If the elements are tetrahedra, then a Delaunay criterion that the circumsphere of 
each tetrahedron contains no other vertices may be used. However, this is inappropriate 
in certain cases, such as just exterior to an airfoil, where a (hexahedral) element may 
have an aspect ratio of 100,000:1. This raises numerical computation issues. 

Applications of meshing outside mechanical design include computational fluid dy- 
namics, contouring in GIS, terrain databases for real time simulations, and Delaunay 
applications in general. See 

http://www.cs.cmu.edu/~quake/triangle.html

3. Minimizing workpiece setup in NC machining: In 4- and 5-axis numerically con- 
trolled machining, in order to machine all the faces, the workpiece must be repeatedly 
dismounted, recalibrated, and remounted. This setup can take much more time than 
the actual machining. Minimizing the number of setups by maximizing the number 
of faces that can be machined in one setup is a visibility problem harder than finding 
an optimal set of observers to cover some geographic terrain. Exact solutions are NP- 
hard; approximate solutions use geometric duality, topological sweeping, and efficient 
construction and searching of polygon arrangements on a sphere. 

4. Dimensional tolerancing: Tolerancing refers to formally modeling the relationships 
between mechanical function and geometric form while assigning and analyzing dimen- 
sional tolerances to ensure that parts assemble interchangeably. [SrVo93] 

A tolerance may be specified parametrically, as a variation in a parameter, such 
as the width of a rectangle, or as a zone that the object’s boundary must remain in. 
The latter is more general but must be restricted to prohibit pathologies, such as the 
object’s boundary being not connected. 

Tolerance synthesis attempts to optimize the tolerances so as to minimize the man- 
ufacturing cost of an object, considering that, while large tolerances are cheaper to 
manufacture, the resulting product may function poorly. [Sk96] 
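
The constructive solid geometry representation of Example 1 can be sketched with point-membership predicates. In the Python sketch below, the primitives, the operator functions, and the sample part are all assumptions made for illustration. The operators are plain set operations on membership; they do not regularize the result, so dangling faces or edges that a regularized Boolean operator would remove are not handled.

```python
import math

def block(xmin, xmax, ymin, ymax, zmin, zmax):
    """Axis-aligned block as a point-membership predicate."""
    return lambda p: (xmin <= p[0] <= xmax and ymin <= p[1] <= ymax
                      and zmin <= p[2] <= zmax)

def cylinder_z(cx, cy, radius, zmin, zmax):
    """Cylinder with axis parallel to z, as a point-membership predicate."""
    return lambda p: (math.hypot(p[0] - cx, p[1] - cy) <= radius
                      and zmin <= p[2] <= zmax)

def union(a, b):
    return lambda p: a(p) or b(p)

def intersection(a, b):
    return lambda p: a(p) and b(p)

def difference(a, b):
    return lambda p: a(p) and not b(p)

# Hypothetical part: a 4 x 4 x 1 plate with a round hole drilled through its center.
plate = block(0, 4, 0, 4, 0, 1)
hole = cylinder_z(2, 2, 0.5, -1, 2)
part = difference(plate, hole)

print(part((0.5, 0.5, 0.5)), part((2, 2, 0.5)), part((5, 5, 0.5)))
```

This representation answers membership queries easily but gives no direct access to the boundary, which is the trade-off named in the last evaluation criterion of Example 1.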


Unsolved Problems: 

The following lists some of the many remaining unsolved problems in applying geometry: 

1. Blending between two surfaces in mechanical design, especially at the ends of the 
blend, where these surfaces meet others. (A blending surface smooths the intersection 
of two surfaces by being tangent to them, each along a curve.) 

2. Variational design of a class of objects subject to constraints. Well-designed con- 
straint systems may have multiple solutions; the space must be searched for the correct 
one. Labeling derivative entities, such as the edge resulting from the intersection of two
inputs, is an issue, partly because this edge may not exist for some parameter values.

3. Generally formalizing the semantics of solid modeling. [Ho96] 

4. Updating simplifying assumptions, such as the linearity of random access memory, 
and points being in general position, which were useful in the past, but which cause 
problems now. 

5. Accounting for dependencies between geometric primitives, and maintaining topo- 
logical consistency. 
6. Designing robust algorithms when not only is there numerical roundoff during the
computation, but also the input data are imprecise, for example, with faces not meeting 
properly. 

7. Better 3-dimensional anti-aliasing to remove crevices and similar database errors 
before rapid prototyping. 

8. There still remains a need for many features in geometry implementations, such as 
more geometric primitives at all levels, default visualization or animation easily callable 
for each data structure, more rapid prototyping with visualization, a visual debugger for 
geometric software, including changing objects online, and generally more interactivity, 
not just data-driven programs. 


13.8.9 LAYOUT PROBLEMS 

Efficiently laying out objects has wide-ranging applications in geometry. 

Examples: 

1. Textile part layout: The clothing industry cuts parts from stock material after 
performing a tight, nonoverlapping, layout of the parts, in order to minimize the costs 
of expensive material. Often, because the cloth is not rotationally symmetric, the parts 
may be translated, but not rotated. Therefore, geometric algorithms for minimizing 
the overlap of translating polygons are necessary. Since this problem is PSPACE-hard, 
heuristics must be used. [Da95] [LiMi95]. 

2. VLSI layout: Both laying out circuits and analyzing the layouts are important
problems. The masks and materials of a VLSI integrated circuit design are typically 
represented as rectangles, mostly isothetic, although 45 degrees or more general angles 
of inclination for the edges are becoming common. The rectangles of different layers 
may overlap. One integrated circuit may be 50MB of data before its hierarchical data 
structure is flattened, or 2GB after. See the website: 

http://ams.sunysb.edu/~held/proc_usb_comp_geo-95.html

Geometry problems include the following.

• design rule verification: It is necessary to check that objects are separated by
the proper distances and that average metal densities are appropriate for the
fabrication process.

• polygon simplification: A design described by a complex set of polygons may perhaps
be optimized into a smaller set of isothetic polygons (with only horizontal
and vertical sides), such that the symmetric difference from the original design
is as small as possible.

• logic verification: The electrical circuit is determined by the graph extracted from
the adjacency information of the rectangles, and whether it matches the original
logic design is determined. A subproblem is determining devices (transistors),
which occur when rectangles of two particular different layers overlap.

• capacitance: This depends on the closeness of the component rectangles, which
might be overlapping or separated, representing two conductors.

• PPC (process proximity correction): This corrects for the effect that, when
etching a circuit, a rectangle’s edges are displaced outward, possibly causing it
to come too close to another rectangle, and change the circuit.
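
The device-finding subproblem of logic verification reduces to reporting overlaps between isothetic rectangles on two layers. The Python sketch below is a brute-force illustration: the record type, the layer names, and the coordinates are assumptions, and a layout of realistic size would need a faster method than testing every pair.

```python
from collections import namedtuple

# Hypothetical record: an isothetic rectangle on a named mask layer.
Rect = namedtuple("Rect", "layer xmin ymin xmax ymax")

def overlap(a, b):
    """Overlap region of two isothetic rectangles, or None if interiors are disjoint."""
    xmin, xmax = max(a.xmin, b.xmin), min(a.xmax, b.xmax)
    ymin, ymax = max(a.ymin, b.ymin), min(a.ymax, b.ymax)
    if xmin < xmax and ymin < ymax:
        return (xmin, ymin, xmax, ymax)
    return None

def find_devices(rects, layer_a, layer_b):
    """Overlap regions between rectangles of two given layers (brute force)."""
    first = [r for r in rects if r.layer == layer_a]
    second = [r for r in rects if r.layer == layer_b]
    devices = []
    for a in first:
        for b in second:
            region = overlap(a, b)
            if region is not None:
                devices.append(region)
    return devices

# Hypothetical layer names and coordinates.
layout = [
    Rect("poly", 2, 0, 3, 10),
    Rect("diffusion", 0, 4, 8, 6),
    Rect("diffusion", 5, 0, 8, 2),
    Rect("metal", 0, 0, 10, 1),
]
print(find_devices(layout, "poly", "diffusion"))
```

Rectangles that merely touch along an edge are not reported, since the test requires an overlap of positive area.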
13.8.10 GRAPH DRAWING


The classic field of graph drawing [TaTo94] aims to display a graph automatically, emphasizing
fundamental properties such as symmetry while minimizing the ratio between
longest and shortest edges, number of edge crossings, etc. Applications include advanced
GUIs, visualization systems, databases, showing the interactions of individuals
and groups in sociology, and illustrating connections between components in software
engineering. Recent 3-dimensional visualization hardware now permits 3-dimensional
graph drawing. See

file://ftp.cs.brown.edu/pub/papers/compgeo/gdbiblio.ps.gz

Facts: 

1. Graph G can be drawn as the 1-skeleton of a convex polytope in R^3 if and only if G
is planar and 3-connected. (Steinitz) [Gr67].

2. Given a 3-connected planar graph, the graph can be drawn as a convex polyhedron
in R^3 using O(n) volume while requiring the vertices to be at least unit distance apart,
which allows them to be visually distinguished. This can be done in time O(n^1.5).
(Chrobak, Goodrich, Tamassia) See

http://ams.sunysb.edu/~held/proc_usb_comp_geo-95.html


13.8.11 GEOGRAPHIC INFORMATION SYSTEMS 

A map (§8.6.4) is a planar graph. Minimally, it contains vertices, edges, and polygons.
However, a sequence of consecutive edges and 2-vertices is often called a chain (or
polyline), and its interior vertices are called points. For example, if each polygon is one
nation, then the southern border of Canada with the USA is one chain.

Definition: 

A geographic information system (GIS) is an information system designed to capture,
store, manipulate, analyze, and display spatial or geographically-referenced data.

Facts: 

Typical simple geometric operations are given in Facts 1-6. More complex ones are 
given in Facts 7-9. 

1. Projecting data from one map projection to another, and determining the appropriate
projection: Since the earth is not a developable surface, no projection meets
all the following criteria simultaneously: equal-area, equidistant (preserving distances
from one central point to every other point), conformal (preserving all angles), and
azimuthal (correctly showing the compass angle from one central point to every other
point). Since a projection that meets any one criterion exactly is quite bad in the others,
the most useful projections tend to be compromises, such as the recent Robinson
projection. [Da95]

2. Rubber-sheeting, or nonlinear stretching, to align a map with calibration points, and 
for edge joining of adjacent map sheets or databases, which may have slightly different 
coordinate systems. 

3. Generalizing or reducing the number of points in a chain while preserving certain 
error properties. 
4. Topological cleanup so that edges that are supposed to meet at one vertex do so,
the boundary of each polygon is a closed sequence of vertices and polylines, adjacency
information is correct, and so on.

5. Choice of the correct data structure. Should elevation data be represented in a
gridded form (as an array of elevations) or should a triangulated irregular network
(TIN) be used (the surface is partitioned into triangles)?

6. Zone of influence calculation: For example, find all the national monuments within
ten miles of the highway.

7. Overlaying: Overlaying two maps to produce a third, where one polygon of the
overlay map will be those points that are all from the same two polygons of the two
input maps, is one of the most complex operations in a GIS. If only the area or other
mass property of the overlay polygons is desired, then it is not necessary to find
the overlay polygons completely first; it is sufficient to find the set of vertices and their
neighborhoods of each overlay polygon. [FrEtal94]

8. Name placement: Consider a cartographic map containing point features such
as cities, line features such as rivers, and area features such as states. The name
placement problem involves locating the features’ names so as to maximize readability
and aesthetics [FrAh84]. Efficient solutions become more important as various mapping
packages now produce maps on demand. The techniques also extend to labelling CAD
drawings, such as piping layouts and wiring diagrams.

9. Viewsheds and visibility indices: Consider a terrain database, and an observer and
target, both of which may be some distance above the terrain. The observer can see
the target if and only if a line between them does not intersect the terrain. Note that
if they are at different heights above the terrain, then this relation is not necessarily
commutative.

The (not necessarily connected) polygon of possible targets visible by a particular
observer is his viewshed. The viewshed’s area is the observer’s visibility index. In
order to site observers optimally, the visibility index for each possible observer in the
database may be required. Calculating this exactly for an n × n gridded database takes
time O(n^5), so sampling techniques are used. [FrRa94]
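
The visibility test of Fact 9, and the way it can fail to be commutative, can be seen on a 1-dimensional terrain profile. The Python sketch below is a simplification under stated assumptions: the terrain is a list of elevations at unit spacing, the sight line is checked only at the grid posts between observer and target, and the function name and profile are hypothetical.

```python
def visible(terrain, observer, obs_height, target, tgt_height):
    """Line-of-sight test along a 1-D terrain profile.

    terrain is a list of elevations at unit spacing; observer and target are
    indices into it. The sight line runs from obs_height above the observer's
    post to tgt_height above the target's post, and is checked at every grid
    post strictly between the two.
    """
    if observer == target:
        return True
    z0 = terrain[observer] + obs_height
    z1 = terrain[target] + tgt_height
    step = 1 if target > observer else -1
    for i in range(observer + step, target, step):
        t = (i - observer) / (target - observer)
        sight_z = z0 + t * (z1 - z0)
        if terrain[i] > sight_z:
            return False  # the terrain rises above the sight line here
    return True

# Hypothetical profile: a ridge of height 3 close to position 0.
profile = [0, 3, 0, 0, 0, 0]
print(visible(profile, 0, 10, 5, 0))  # tall observer at 0, target on the ground at 5
print(visible(profile, 5, 10, 0, 0))  # the same heights with the positions swapped
```

With the tall observer next to the ridge, the sight line clears it; with the tall observer at the far end, the sight line has dropped below the ridge by the time it arrives, so the first call reports visibility and the second does not.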


13.8.12 GEOMETRIC CONSTRAINT SOLVING 

Applications of geometric constraint solving include mechanical engineering, molecular 
modeling, geometric theorem proving, and surveying. 

Definition: 

Geometric constraint solving is the problem of locating a set of geometric elements 
given a set of constraints among them. 

Fact: 

1. The problem may be under-constrained, with an infinite number of solutions, or 
over-constrained, with no solutions without some relaxation. 

Examples: 

1. A receptor is a rigid cavity in a protein, which is the center of activity for some 
reaction. A ligand is a small molecule that may bind at a receptor. The activity level of 
a drug may depend on how the ligand fits the receptor, which is made more complicated 
by the protein molecule’s bending. 
2. In CAD/CAM, there may be constraints such as that opposite sides of a feature
shall be parallel. For example, commercial systems like Pro/Engineer allow the user 
to freehand-sketch a part, and then apply constraints such as right angles, to snap the 
drawing to fit. Then the user is required to add more constraints until the part is 
well-constrained. 

3. Molecular modeling: There is often a lock-and-key relationship between a flexible
protein molecule’s receptor and the ligand that it binds. In addition to geometrically
matching the fitted shapes, the surface potentials of the molecules are also important.
This fitting problem, called molecular docking, is important in computer-aided drug
design. Generally, a heuristic strategy is used to move the molecules to achieve no
overlap between the two molecules while maximizing their contact area. (Ierardi and
Park) See

http://ams.sunysb.edu/~held/proc_usb_comp_geo-95.html
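
A very small instance of geometric constraint solving is locating one point in the plane from its distances to two fixed points. The Python sketch below (function name and sample values are assumptions) returns every position that satisfies both distance constraints, which shows how a system can have two solutions, one, or none. With only one of the two constraints the point could lie anywhere on a circle, which is the under-constrained case.

```python
import math

def locate_point(a, ra, b, rb):
    """Positions of a point at distance ra from a and rb from b (0, 1, or 2 of them)."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    d = math.hypot(dx, dy)
    if d == 0 or d > ra + rb or d < abs(ra - rb):
        return []  # coincident anchors (degenerate) or inconsistent distances
    # Distance from a to the foot of the solution(s) on the line ab.
    along = (ra * ra - rb * rb + d * d) / (2 * d)
    h = math.sqrt(max(ra * ra - along * along, 0.0))
    fx, fy = a[0] + along * dx / d, a[1] + along * dy / d
    if h == 0:
        return [(fx, fy)]
    return [(fx - h * dy / d, fy + h * dx / d), (fx + h * dy / d, fy - h * dx / d)]

print(locate_point((0, 0), 5, (6, 0), 5))  # two mirror-image solutions
print(locate_point((0, 0), 1, (6, 0), 1))  # distances too short: no solution
```

When two solutions exist, they are mirror images in the line through the two fixed points, so further information is needed to select the intended one.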


13.8.13 IMPLEMENTATIONS 

One major application of geometry is in implementations of geometric software packages, 
either as subroutine packages callable from user programs, or as standalone systems, 
which the user prepares input data files for and directs with either input command files 
or a GUI. 

Definition: 

In an object-oriented computer language, a class library is a set of new data types 
and operations on them, activated by sending an object, or data item, a message. 
(For example, a plane object may respond to a message to rotate itself. The internal 
representation of an object is private, and it may be accessed only by sending it a 
message.) 

Examples: 

1. LEDA, started in 1989, is a major C++ class library, whose design goals are correctness,
ease of use and elegance, and efficiency [MeNa95]. Its geometry has been moved to
CGAL, which often uses exact computation and aims for efficiency in a general-purpose
professional-quality library of geometric algorithms written in C++ for Unix first, and
then PCs. See the website

http://www.cs.ruu.nl/people/geert/CGAL/

2. Stand-alone systems: 

• Geomview is an interactive program for viewing and manipulating geometric
objects, from the University of Minnesota Geometry Center. Examples include
Penrose quasi-tiling, Pascal’s theorem in projective conics, Teichmüller space,
and families of Riemann surfaces with a specified group of symmetries. See the
website

http://www.geom.umn.edu/apps/gallery.html

• Geomamos, a “Geometric Object Manipulation/Monitoring System”, is an X-based
visualization system with a 2-D display, GeomSheet, based on X-fig,
which allows mouse input. It includes a library based on LEDA. See the website

http://web.eecs.nwu.edu/~theory/geomamos.html

• XYZ Geobench assists the implementation of geometric algorithms. [Sc91]


13.8.14 GEOMETRIC VISUALIZATION 

There are many packages to display geometric objects. D. Banks lists many examples,
by himself and others, such as D. Cox, G. Francis, and R. Idaszak, including a torus
rotating in R^4, a Steiner surface showing the triple point, crosscaps, Klein bottles, a
Sudanese surface, a complex reciprocal, a knotted sphere, the Etruscan Venus, and a stable
mapping of a Klein bottle into 3-dimensional Euclidean space. See the website

http://www.icase.edu/~banks/math.html

There are also libraries of minimal and other surfaces, and knots, at the Center for
Geometry, Analysis, Numerics, and Graphics, U Mass Amherst.
INTRODUCTION

This chapter deals with techniques for the efficient, reliable, and secure transmission 
of data over communications channels that may be subject to non-malicious errors and 
adversarial intrusion. The general topic areas related to these techniques are information 
theory, coding theory, and cryptology. 

Information theory is concerned with the mathematical theory of communication, 
and includes the study of redundancy and the underlying limits of communications 
channels. 

Coding theory, in its broadest sense, deals with the translation between source data
representations and the corresponding representative symbols used to transmit source
data over a communications channel, or store this data. Error-correcting coding is the
part of coding theory that adds systematic redundancy to messages to allow transmission
errors not only to be detected, but also to be corrected.

Cryptology is the field which includes both cryptography, which deals with the 
protection of data from malicious or unauthorized actions, and cryptanalysis, which 
attempts to defeat cryptographic mechanisms. 


GLOSSARY 

affine cipher: a cipher that replaces the plaintext letter x (represented as the appropriate
integer in the set {0, 1, . . . , 25}) by ax + b mod 26, where a and b are integers
with a relatively prime to 26.

analog channel: a channel that is continuous in amplitude and time.

authentication: corroboration that a party, or the origin of a message, is as claimed.

BCH code: a code from a special family of cyclic codes.

binary symmetric channel (BSC): a memoryless channel with binary input and 
output alphabets, and fixed probability p that a symbol is transmitted incorrectly. 

block cipher: a cipher that processes the plaintext after grouping it into pieces called 
blocks. 

burst error: a vector whose only nonzero entries are among a string of successive
components, the first and last of which are nonzero.

Caesar cipher: the cipher that shifts each letter forward three positions in the al- 
phabet, wrapping the letters at the end of the alphabet around to the beginning 
letters. 

capacity of a channel: a measure of the ability of a channel to transmit information 
reliably. 

certification authority: a trusted authority who verifies the identity and public key 
of a party, and signs this data. 

chosen-plaintext attack: an attack when the adversary has some chosen plaintext 
and its corresponding ciphertext. 

chosen-ciphertext attack: an attack when the adversary has some chosen ciphertext, 
and its corresponding plaintext. 

cipher: an encryption scheme. 

cipher-block chaining (CBC) mode: a mode of operation of an n-bit block cipher
in which plaintext is processed n bits at a time, an initialization block is used, and to
encrypt each successive n-bit block the bitwise XOR of the block with the encrypted
version of the previous block is formed and the resulting n-bit block is encrypted by
the block cipher.

cipher feedback (CFB) mode: a mode of operation of an n-bit block cipher in which
plaintext may be processed r bits at a time, where $1 \le r \le n$, and in which ciphertext
depends on the current block and previous blocks.

ciphertext: transformed plaintext that is supposed to be unintelligible to all but an 
authorized recipient. 

ciphertext-only attack: an attack when the adversary has possession of some ci- 
phertext and nothing else. 

code: a map from the set of words to the set of all finite strings of elements of a 
designated alphabet. 

codeword: a string produced when a code is applied to a word. 

coding theory: the subject concerned with the translation between source data
representations and the corresponding representative symbols used to transmit source
data over a communications channel.

complete maximum likelihood decoding (CMLD): the decoding scheme that
decodes a received n-tuple to the unique codeword of minimum distance from this
n-tuple, if such a codeword exists. Otherwise, the scheme arbitrarily decodes the
n-tuple to one of the codewords closest to this n-tuple.

computational security: the amount of computational effort required by the best
currently-known attacks to defeat a system.

convolutional code: a code in which the encoder has memory, so that an n-tuple
produced by the encoder not only depends on the message k-tuple u, but also on
some message k-tuples produced prior to u.

coset: the set $C + x = \{c + x \mid c \in C\}$ determined by a word x, given a code C.

coset leader: a coset member of smallest Hamming weight. 

cryptanalysis: the science devoted to the defeat of cryptographic protection
mechanisms.

cryptography: the science of protecting data from malicious or unauthorized actions.

cryptology: the field that includes both cryptography and cryptanalysis.

cryptosystem (or cryptographic system): a system comprised of a space of plaintext
messages, a space of ciphertext messages, a space of keys, and families of
enciphering and deciphering functions.

cyclic code: a linear code in which every cyclic shift of a codeword is also a codeword. 

data compression: the transformation of data into a representation which is more 
compact yet maintains the information content of the original data. 

data encryption standard (DES): a block cipher adopted as a standard in the
United States and which is widely used for commercial applications. 

data integrity: the ability to detect data manipulation by unauthorized parties. 

data origin authentication: corroboration that the origin of data is as claimed. 

decryption: the process of recovering plaintext from ciphertext. 

digital signature: a number dependent on some secret known only to the signer, and 
on the message being signed. 

dual code (of a code): the orthogonal complement of the code. 

electronic codebook (ECB) mode: a mode of operation of an n-bit block cipher in
which long messages are partitioned into n-bit blocks and encrypted separately.

El Gamal cryptosystem: a public-key cryptosystem based on the discrete logarithm 
problem. 

encryption: the process of mapping plaintext to ciphertext designed to render data 
unintelligible to all but the intended recipient. 

entity authentication: corroboration that a party’s identity is as claimed. 

entropy: a measure of the amount of information provided by an observation of a 
random variable. 

equivalent codes: codes for which there is a fixed permutation of the coordinate
positions that transforms one code to the other.

error-correction coding: coding that adds systematic redundancy to messages to 
allow transmission errors to be detected and corrected. 

error-detection coding: coding that adds systematic redundancy to messages to 
allow transmission errors to be detected (but not necessarily corrected). 

extended code: the code obtained by adding a parity check symbol to each codeword 
of a code. 

generator matrix for a code: a matrix whose rows form a basis for that code. 

generator polynomial: a monic polynomial of least degree in a cyclic code.

Golay code: a particular perfect code.

Hamming code: a perfect single-error correcting code. 

Hamming distance between two n-tuples: the number of coordinate positions in 
which they differ. 

Hamming distance of a code: the smallest Hamming distance over all pairs of 
distinct codewords in that code. 

Hamming weight of an n-tuple: the number of nonzero coordinates. 

hash function: a function that maps arbitrary length bit strings to small fixed-length 
outputs that is easy to compute and in addition may have preimage-resistance, weak 
collision-resistance, and/or strong collision-resistance. 

Hill cipher: a cipher that has an m × m matrix K as its key and which encrypts
plaintext by splitting it into blocks of size m and sending the plaintext block
$x = (x_1, x_2, \ldots, x_m)$ to the m-tuple xK.

homophonic substitution: a cipher where plaintext characters in the source lan- 
guage are associated with disjoint sets of ciphertext characters, and each time a 
character is to be encrypted, one element of the associated set of ciphertext charac- 
ters is randomly chosen. 

incomplete maximum likelihood decoding (IMLD): the decoding scheme that
decodes a received n-tuple to a unique codeword such that the distance between 
the n-tuple and the codeword is a minimum if such a codeword exists. If no such 
codeword exists, then the scheme reports that errors have been detected, but no 
correction is possible. 

information theory: the mathematical theory of communication concerned with both
the study of redundancy and the underlying limits of communication channels. 

Kerberos protocol: an authenticated key distribution protocol developed as part of 
Project Athena at M.I.T. based on symmetric cryptographic techniques and the use 
of a key distribution center. 

key agreement: a key establishment mechanism in which two parties jointly establish
a shared secret key which is a function of information contributed by each. 

key distribution center (KDC): a trusted third party who distributes short-term
secret keys for secure communications from a particular party to another. 

key distribution problem: the problem of how to securely distribute secret keys 
between two or more parties. 

key establishment: a mechanism with the specific objective of making a symmetric 
key secretly available to two authorized parties for subsequent cryptographic use. 

key transfer: a key establishment mechanism in which a key created by one party is 
securely transmitted to another. 

knapsack cryptosystem: a cryptosystem in which encryption is carried out using a
super-increasing sequence of integers.

known-plaintext attack: an attack when the adversary has some plaintext and its 
corresponding ciphertext. 

linear code: a subspace of the set of n-tuples with entries from a finite field. 

McEliece cryptosystem: a public-key cryptosystem based on linear codes from the
theory of error-correcting codes.

message: a finite string of source words.

minimum error probability decoding (MED): the decoding scheme that decodes
a received n-tuple r to a codeword c for which the conditional probability P(c is sent
| r is received), $c \in C$, is largest.

memoryless source: a source for which the probability of a particular word being 
emitted at any point in time is fixed. 

modem: a device that transforms between analog channel data and discrete encoder- 
decoder data; modem is short for modulator/demodulator. 

non-repudiation: a provision for the resolution of disputes arising related to digital 
signatures where the purported sender of a message denies having sent it. 

Nordstrom-Robinson code: a special nonlinear code. 

one-time pad: a stream cipher where each bit of plaintext is encrypted by XOR-ing 
it to the next bit of a truly random key, which is never reused for encryption and is 
of bit length equal to that of the plaintext. 

output feedback (OFB) mode: a mode of operation of an n-bit block cipher in
which a message may be split into blocks of r bits, where $1 \le r \le n$, for processing
and in which error propagation is avoided.

parity check bit: a bit added to a bit string so that the total number of 1s in the
extended string is even.

parity check matrix (for a code): a generator matrix for the dual code of the code.

perfect code: a code of distance d for which every word is within distance
$t = \lfloor (d-1)/2 \rfloor$ of some codeword.

plaintext: a message in some source language. 

polyalphabetic cipher: a cipher that uses multiple substitutions for mapping plain- 
text letters to ciphertext letters. 

Preparata code: a code from an infinite family of nonlinear codes that have efficient 
encoding and decoding algorithms. 

privacy: preventing confidential data from being available in an intelligible form to 
unauthorized parties. 

prefix code: a code in which no codeword is a prefix of another codeword. 

provable security (of a cryptographic method): security where the difficulty of de- 
feating the method is essentially as difficult as solving a well-known and supposedly 
difficult problem. 

public-key certificate: data that binds together a party’s identification and public 
key. 

public-key cryptosystem: a cryptosystem in which each user has his/her own pair 
of encryption (public) and decryption (private) keys. 

punctured code: the code obtained by removing any column of a generator matrix 
of a linear code. 

Rabin cryptosystem: a public-key cryptosystem whose security depends on the dif- 
ficulty of finding square roots modulo the product of two large primes. 

Reed-Muller code: a code from a particular family of linear codes. 

Reed-Solomon code: a code from a special family of BCH codes.

RSA cryptosystem: a public-key cryptosystem in which encryption is based on
modular exponentiation with a modulus that is the product of two primes.

secret sharing scheme: a scheme where the contents of a secret can be recovered 
if and only if particular groups of people sharing information relating to the secret 
collaborate. 

self-dual code: a linear code that is equal to its dual code. 

self-orthogonal code: a linear code that is contained in its dual code. 

self-synchronizing stream cipher: a stream cipher capable of reestablishing proper
decryption automatically after loss of synchronization, with only a fixed number of 
plaintext characters unrecoverable. 

shift cipher: a cipher that replaces each plaintext letter by the letter shifted a fixed 
number of positions in the alphabet, with letters at the end of the alphabet shifted 
to the beginning of the alphabet. 

shortened code: the set of all codewords in a linear code which are 0 in a fixed 
coordinate position with that position deleted. 

syndrome (of a word x): the vector $xH^T$, where H is a parity check matrix for a linear
code C.

stream cipher: a cipher which encrypts individual characters of a plaintext message. 

substitution cipher: a cipher that replaces each plaintext character by a fixed sub- 
stitute according to a permutation on the source alphabet. 

super-increasing sequence: a set $\{a_1, a_2, \ldots, a_n\}$ of positive integers with the
property that $a_i > \sum_{j=1}^{i-1} a_j$ for each $i = 2, \ldots, n$.

symmetric-key system: a cryptosystem where each pair of users share a secret key. 

synchronous stream cipher: a stream cipher in which the keystream is generated
independently of the message.

systematic code: a linear code that has a generator matrix of the form $[I_k \mid A]$.

(n, k)-threshold scheme: a scheme whereby a secret datum S can be divided up
into n pieces, in such a way that knowledge of any k or more pieces allows S to
be easily recovered, but knowledge of k - 1 or fewer pieces provides no information
about S.

transposition cipher: a cipher that divides plaintext into blocks of a fixed size and 
rearranges the characters in each block according to a fixed permutation. 

turbo code: a special type of code built using convolutional codes and an interleaver 
which permutes the original bits before sending them to the second encoder. 

unconditional security (for encryption schemes): the security condition where ob- 
servation of the ciphertext provides no information to an adversary. 

uniquely decodable code: a code for which every string of symbols is the image of 
at most one message. 

Vernam cipher: a one-time pad. 

Vigenère cipher: a cipher with a d-tuple $(k_1, \ldots, k_d)$ as its key that encrypts plaintext
messages in blocks of size d, so that the ith letter in a block is shifted $k_i$ positions in
the alphabet, modulo 26.


14.1 COMMUNICATION SYSTEMS AND INFORMATION THEORY


14.1.1 BASIC CONCEPTS 
Definitions: 

A communication system, as illustrated in the following figure, is modeled as a data 
source providing either continuous or discrete output, a source encoder transforming 
source data into binary digits (bits), a channel encoder, and a channel.

In many communication systems the channel is analog, that is, continuous in amplitude
and time, in which case a modulator/demodulator (modem) is required to transform 
between analog channel data and discrete encoder/decoder data. 

The source encoder, the aim of which is to minimize the number of bits required to 
represent source data while still allowing subsequent reconstruction, typically includes 
data compression to remove unnecessary redundancy. 

The objective of the channel encoder is to maximize the rate at which information 
can be reliably conveyed by the channel, in the presence of disruptive channel noise. 

Coding theory is the study of the translation between source data representations 
and the corresponding representative symbols (coded data) used to transmit source 
data over a communication channel. 

Error-correction coding, located in the channel encoder, adds systematic redundancy 
to messages to allow transmission errors not only to be detected but also to be corrected. 

Encryption, located after the source encoder but not always before the channel en- 
coder, is designed to render data unintelligible to all but the intended recipient, and 
thereby preserve the secrecy of messages in the presence of unfriendly monitoring of 
the channel. 

Cryptography is the science of maintaining secrecy of data, that is, protecting data 
from malicious or unauthorized actions, including passive intrusion (eavesdropping) 
and active intrusion (injection or modification of messages).

Authentication is corroboration that a party, or the origin of a message, is as claimed.

Data integrity is the property that data have not been modified in an unauthorized 
manner. 

Non-repudiation is the preclusion of parties from making undetectable false denials. 


14.1.2 ENTROPY 
Definitions: 

Let X be a random variable that takes on a finite set of values $x_1, x_2, \ldots, x_n$ with
probability $P(X = x_i) = p_i$, where $0 \le p_i \le 1$ for each $i$, $1 \le i \le n$, and $\sum_{i=1}^{n} p_i = 1$.
Also, let Y be a random variable that takes on a finite set of values.

Information theory is concerned with a mathematical theory of communication and 
includes the study of redundancy and the underlying limits of communication channels. 

The entropy (or uncertainty) of X is defined to be $H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i$, where
$p_i \log_2 p_i = 0$ if $p_i = 0$.

The joint entropy of X and Y is defined to be

$H(X, Y) = -\sum_{x,y} P(X=x, Y=y) \log_2 P(X=x, Y=y)$.

If X and Y are random variables, the conditional entropy of X given Y = y is

$H(X \mid Y=y) = -\sum_{x} P(X=x \mid Y=y) \log_2 P(X=x \mid Y=y)$.

The conditional entropy of X given Y (or equivocation of Y about X) is

$H(X \mid Y) = \sum_{y} P(Y=y) H(X \mid Y=y)$.

(The summation indices x and y range over all values of X and Y, respectively.)

Facts: 

1. Useful books that cover information theory include [Ha80], [HaHaJo97], [Mc77], 
[Re94], and [We88]. 

2. Information theory provides a theoretical basis for many results in error-correcting 
codes and cryptography, and provides theoretical bounds useful as metrics for evaluating 
conjectures in both areas. 

3. The entropy of X is a measure of the amount of information provided by an
observation of X.

4. The entropy of X is also useful for approximating the number of bits required to
encode the elements of X.

5. If X and Y are random variables, then:

• $0 \le H(X) \le \log_2 n$;

• $H(X) = 0$ if and only if $p_i = 1$ for some $i$, and $p_j = 0$ for all $j \ne i$ (that is, there
is no uncertainty of the result);

• $H(X) = \log_2 n$ if and only if $p_i = \frac{1}{n}$ for each $i$, $1 \le i \le n$ (that is, all outcomes
are equally likely);

• $H(X, Y) \le H(X) + H(Y)$;

• $H(X, Y) = H(X) + H(Y)$ if and only if X and Y are independent.

6. The quantity $H(X \mid Y)$ measures the amount of uncertainty remaining about X
after Y has been observed.

7. If X and Y are random variables, then:

• $H(X \mid Y) \ge 0$;

• $H(X \mid X) = 0$;

• $H(X, Y) = H(Y) + H(X \mid Y)$;

• $H(X \mid Y) \le H(X)$;

• $H(X \mid Y) = H(X)$ if and only if X and Y are independent.
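The identities in Facts 5 and 7 can be checked numerically. The following is a minimal sketch in Python; the joint distribution `joint` and the helper names are illustrative assumptions, not taken from the text.

```python
from math import log2

# Hypothetical joint distribution P(X = x, Y = y); chosen only for illustration.
joint = {
    ("a", 0): 0.25, ("a", 1): 0.25,
    ("b", 0): 0.40, ("b", 1): 0.10,
}

def entropy(probs):
    """Entropy in bits, using the convention 0 * log2(0) = 0."""
    return -sum(p * log2(p) for p in probs if p > 0)

def marginal(joint, axis):
    """Marginal distribution of X (axis=0) or of Y (axis=1)."""
    out = {}
    for key, p in joint.items():
        out[key[axis]] = out.get(key[axis], 0.0) + p
    return out

def conditional_entropy(joint):
    """H(X | Y) = sum over y of P(Y = y) * H(X | Y = y)."""
    total = 0.0
    for y, py in marginal(joint, 1).items():
        if py > 0:
            cond = [p / py for (_, yy), p in joint.items() if yy == y]
            total += py * entropy(cond)
    return total

h_xy = entropy(joint.values())
h_x = entropy(marginal(joint, 0).values())
h_y = entropy(marginal(joint, 1).values())
h_x_given_y = conditional_entropy(joint)

print(h_xy, h_y + h_x_given_y)   # the two values agree up to rounding
```

For this distribution X and Y are not independent, so the sketch should show H(X | Y) strictly smaller than H(X).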

Example: 

1. If X is the random variable on the set $\{x_1, x_2, x_3, x_4\}$ with $P(X = x_1) = 0.4$, $P(X = x_2) = 0.3$,
$P(X = x_3) = 0.2$, and $P(X = x_4) = 0.1$, then the entropy of X is

$H(X) = -(0.4 \log_2 0.4 + 0.3 \log_2 0.3 + 0.2 \log_2 0.2 + 0.1 \log_2 0.1) \approx 1.84644$.
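The entropy in this example can be reproduced with a few lines of Python. This is a sketch only; the function name `entropy` and its input check are assumptions made for illustration.

```python
from math import log2

def entropy(probs):
    """Entropy in bits of a finite distribution, with 0 * log2(0) taken as 0."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * log2(p) for p in probs if p > 0)

h = entropy([0.4, 0.3, 0.2, 0.1])
print(round(h, 5))  # 1.84644
```

The value lies between 0 and log2(4) = 2, as Fact 5 requires; the uniform distribution on four values attains the upper bound.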


14.1.3 THE NOISELESS CODING THEOREM 
Definitions: 

A source is a stream of words from a set $W = \{w_1, w_2, \ldots, w_M\}$.

Let $X_i$ denote the $i$th word produced by a source. The source is said to be memoryless
if for each word $w_j \in W$, the probability $P(X_i = w_j) = p_j$ is independent of $i$, that is,
the $X_i$ are independent and identically distributed random variables.

The entropy of a memoryless source is $H = -\sum_{j=1}^{M} p_j \log_2 p_j$.

A code is a map $f$ from $W$ to $A^*$, the set of all finite strings of elements of $A$, where $A$
is a finite set called the alphabet.

For each source word $w_j \in W$, the string $f(w_j)$ is a codeword.

The length of the codeword $f(w_j)$, denoted $|f(w_j)|$, is the number of symbols in the
string.

A message is any finite string of source words. If $m = v_1 v_2 \cdots v_r$ is a message, then
its encoding is obtained by concatenation: $f(m) = f(v_1) f(v_2) \cdots f(v_r)$.

The average length of a code $f$ is $\sum_{j=1}^{M} p_j |f(w_j)|$.

A code is uniquely decodable if every string from $A^*$ is the image of at most one
message.

A prefix code is a code such that there do not exist distinct words $w_i$ and $w_j$ such
that $f(w_i)$ is an initial segment, or prefix, of $f(w_j)$.

Facts: 

1. Prefix codes are uniquely decodable. 

2. Prefix codes have the advantage of being instantaneous. That is, they can be de- 
coded online without looking at future codewords. 

3. Kraft’s inequality: A prefix code $f: W \to A^*$ with codeword lengths $|f(w_j)| = l_j$ for
$j = 1, 2, \ldots, M$ exists if and only if $\sum_{j=1}^{M} n^{-l_j} \le 1$, where $n$ is the size of the alphabet
$A$.

4. McMillan’s inequality: If a uniquely decodable code $f: W \to A^*$ with codeword
lengths $l_1, l_2, \ldots, l_M$ exists, then $\sum_{j=1}^{M} n^{-l_j} \le 1$, where $n$ is the size of the alphabet $A$.

5. A uniquely decodable code with prescribed word lengths exists if and only if a prefix 
code with the same word lengths exists. As a result, attention can be restricted to prefix 
codes. 


© 2000 by CRC Press LLC 



6. Shannon’s noiseless coding theorem: For a memoryless source of entropy $H$, any
uniquely decodable code for the source into an alphabet of size $n$ must have average
length at least $H / \log_2 n$. Moreover, there exists such a code having average length less than
$1 + H / \log_2 n$.

7. For a memoryless source, a prefix code with smallest possible average length can be 
constructed by the Huffman coding algorithm. (See §9.1.2.) 
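As an illustration of Facts 6 and 7 for a binary alphabet (n = 2), the sketch below computes Huffman codeword lengths and compares the resulting average length with the entropy bounds. It is a minimal sketch, not the algorithm as presented in §9.1.2; the function name `huffman_lengths` and the sample distribution are assumptions.

```python
import heapq
from math import log2

def huffman_lengths(probs):
    """Binary Huffman codeword lengths for the given word probabilities."""
    if len(probs) == 1:
        return [1]
    # Heap entries: (probability, unique tie-breaker, source-word indices in subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:            # every word in the merged subtree gains one bit
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

probs = [0.4, 0.3, 0.2, 0.1]         # sample memoryless source (assumed)
lengths = huffman_lengths(probs)
average = sum(p * l for p, l in zip(probs, lengths))
H = -sum(p * log2(p) for p in probs)
print(lengths, average, H)
```

For this source the average length should fall in the interval from H up to, but not including, H + 1, in agreement with the noiseless coding theorem.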

Examples: 

1. The code that maps the letters A, B, C, D to 1, 01, 001, 0001, respectively, is a 
prefix code on this set of four letters. 

2. The code that maps the letters A, B, C, D to 11, 111, 11111, 111111, respectively,
is not a prefix code since the codeword for A is a prefix of the codeword for B (and
of the codewords for C and D as well). It is also not uniquely decodable since a bit string
can correspond to more than one string of the letters A, B, C, D. For example, 11111
corresponds to AB, BA, and C.
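The two examples can be checked mechanically. The sketch below tests the prefix property, evaluates the sum in Kraft's inequality, and counts the ways a string can be split into codewords; all function names are illustrative assumptions.

```python
def is_prefix_code(codewords):
    """True if no codeword is a prefix of a different codeword."""
    words = list(codewords)
    return not any(
        i != j and v.startswith(u)
        for i, u in enumerate(words)
        for j, v in enumerate(words)
    )

def kraft_sum(codewords, alphabet_size=2):
    """Sum of n^(-l_j) over the codeword lengths l_j."""
    return sum(alphabet_size ** -len(w) for w in codewords)

def count_parsings(s, codewords):
    """Number of ways to write s as a concatenation of codewords."""
    ways = [1] + [0] * len(s)
    for end in range(1, len(s) + 1):
        for w in codewords:
            if end >= len(w) and s[end - len(w):end] == w:
                ways[end] += ways[end - len(w)]
    return ways[len(s)]

code1 = ["1", "01", "001", "0001"]        # Example 1
code2 = ["11", "111", "11111", "111111"]  # Example 2

print(is_prefix_code(code1), kraft_sum(code1))   # True 0.9375
print(is_prefix_code(code2), kraft_sum(code2))   # False 0.421875
print(count_parsings("11111", code2))            # 3  (AB, BA, C)
```

Note that the Kraft sum for Example 2 is below 1 even though that code is not uniquely decodable: the inequality only guarantees that some prefix code with the same lengths exists.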


14.1.4 CHANNELS AND CHANNEL CAPACITY 

Definitions: 

A channel is a medium that accepts strings of symbols from a finite alphabet
$A = \{a_1, \ldots, a_n\}$ and produces strings of symbols from a finite alphabet $B = \{b_1, \ldots, b_m\}$.

Let $X_i$ denote the $i$th input symbol and let $Y_i$ denote the $i$th output symbol. The
channel is said to be memoryless if the probability $P(Y_i = b_j \mid X_i = a_k) = p_{jk}$ (for
$1 \le j \le m$ and $1 \le k \le n$) is independent of $i$.

A binary symmetric channel (BSC) is a memoryless channel with input and out- 
put alphabets {0,1}, and probability p that a symbol is transmitted incorrectly. The 
probability p is called the symbol error probability of the channel. See the following 
figure. 



A q-ary symmetric channel is a memoryless channel with input and output alphabets
each of size q and such that the probability that an error occurs on symbol transmission 
is a constant p. Furthermore, if an error does occur then each of the q — 1 symbols 
different from the correct symbol is equally likely to be received. 

The capacity of a binary symmetric channel with symbol error probability p is
$C(p) = 1 + p \log_2 p + (1 - p) \log_2 (1 - p)$.

Facts:

1. The capacity of a communications channel is a (unitless) measure of its ability to 
transmit information reliably. 

2. The capacity of a BSC with symbol error probability p is a monotone decreasing
function of p for $0 \le p \le \frac{1}{2}$, with $1 \ge C(p) \ge 0$. Moreover, $C(0) = 1$ and $C(\frac{1}{2}) = 0$.

Example: 

1. The capacity of a BSC with symbol error probability 0.01 is given by
$C(0.01) = 1 + 0.01 \log_2 (0.01) + 0.99 \log_2 (0.99) \approx 0.92$.
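The capacity formula is easy to evaluate directly. The following sketch assumes the convention 0 · log2(0) = 0 at the endpoints; the function name `bsc_capacity` is an illustrative choice.

```python
from math import log2

def bsc_capacity(p):
    """Capacity C(p) = 1 + p*log2(p) + (1-p)*log2(1-p) of a binary symmetric channel."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability")

    def plogp(x):
        return x * log2(x) if x > 0 else 0.0   # convention: 0 * log2(0) = 0

    return 1.0 + plogp(p) + plogp(1.0 - p)

print(round(bsc_capacity(0.01), 4))         # 0.9192
print(bsc_capacity(0.0), bsc_capacity(0.5))  # 1.0 0.0
```

Evaluating the function on an increasing grid of p in [0, 1/2] gives a quick check of the monotonicity stated in Fact 2.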


14.2 BASICS OF CODING THEORY 

Coding theory is the subject devoted to the theory of error-correcting codes. Error- 
correcting codes were invented to correct errors over unreliable transmission links. With 
digital communications and digital storage media ubiquitous in the modern world, error- 
correcting codes have grown in importance. Advances in error-correcting codes have 
made it possible to transmit information across the solar system using weak transmitters 
and to store data robustly on storage media so that it is resistant to damage, such as 
scratches on a compact disk. 

Error-correcting codes work by encoding data as strings of symbols, such as bit 
strings, that contain redundant information that helps identify which codeword may 
have been sent when a string of symbols, potentially different than the string sent, is 
received. Coding theory is an active area, with new and better codes being devised at 
a steady pace. 


14.2.1 FUNDAMENTAL CONCEPTS 
Definitions: 

Let A be any finite set (called an alphabet), and let $A^n$ denote the set of all n-tuples with
entries in A. A block code of length n containing M codewords over the alphabet A
is a subset of $A^n$ of size M. Such a block code is called an [n, M]-code over A.

The Hamming distance d(x, y) between two n-tuples $x, y \in A^n$ is the number of
entries in which they differ.

Let C be an [n, M]-code over A. The Hamming distance of C is the smallest Ham- 
ming distance over all pairs of distinct codewords in C. If C has Hamming distance d, 
then C is sometimes referred to as an [n, M, d]-code. 

The information rate (or rate) of an [n, M]-code over an alphabet of size q is
$R = \frac{\log_q M}{n}$.

Suppose that a codeword c from a block code is transmitted and r is received. The
error-vector is e = r - c (formed by subtracting componentwise). The number of
errors is the number of nonzero components in e.

A code is said to detect t errors if the decoder is capable of detecting any pattern of t
or fewer errors per codeword that may be introduced by the channel. 

A code is said to correct t errors if the decoder is capable of correcting any pattern 
of t or fewer errors per codeword that may be introduced by the channel. 

If $C_1$ and $C_2$ are two [n, M]-codes over an alphabet A, then $C_1$ and $C_2$ are said to
be equivalent codes if there is a fixed permutation of the coordinate positions that
transforms one code to the other.

The parity check bit of a bit string is 0 if there are an even number of 1s in the
string and is 1 if there are an odd number of 1s in the string.


Facts: 

1. Some of the many introductory-level books in coding theory are [Ba97], [Hi86],
[HoEtal92], [Pl89], [Pr92], [Ro96], [Vava89], and [We98]. For more extensive
treatments, see [Be84], [Bl83], [LiCo83], [PlHuBr98], [va90], [va99], [PeWe72], and especially
[MaSl77], which contains a bibliography of 1478 entries.

2. The Error Correcting Codes (ECC) home page provides free software implementing 
several important error-correcting codes: 

http://imailab.iis.u-tokyo.ac.jp/~robert/codes.html

3. The main objective of coding theory is the design of codes such that: 

• an efficient algorithm is known for encoding messages; 

• an efficient algorithm is known for decoding; 

• the error-correcting capability of the code is high; 

• the information rate of the code is high. 

4. For applications in which a two-way communications channel is available (for exam- 
ple, a telephone circuit), it is sometimes economical to use error detection and retrans- 
mission upon error, in a so-called automatic repeat request (ARQ) strategy, rather than 
so-called forward error correction (FEC) techniques capable of actually correcting errors 
at the cost of more complex decoding equipment. This is not an option when the com- 
munications channel is effectively one-way or unperturbed source data is not available 
for retransmission (for example in CD-ROM storage and deep-space communications 
systems). 

5. For any n-tuples $x, y, z \in A^n$, the Hamming distance satisfies the following:

• $d(x, y) \ge 0$, with equality if and only if $x = y$;

• $d(x, y) = d(y, x)$;

• $d(x, y) + d(y, z) \ge d(x, z)$.

6. The information rate R of a block code measures the fraction of information of the
code which is non-redundant; the information rate R satisfies the inequality $0 \le R \le 1$.

7. When a word r is received, the decoder must make some decision. This decision 
may be one of the following: 

• no errors have occurred; accept r as a codeword; 

• errors have occurred; correct r to a codeword c; 

• errors have occurred; no correction is possible. 

8. Let C be an [n, M, d]-code.

• If used only for error detection, C can detect $d - 1$ errors.

• If used for error correction, C can correct $\lfloor (d-1)/2 \rfloor$ errors.

9. Equivalent codes have the same distance, and hence the same error-correcting
capabilities.

10. Adding a parity check bit to a bit string of length n produces a bit string of length
n + 1 with an even number of 1s.

11. Different families of error-correcting codes have been, and continue to be, designed
to meet the particular requirements of applications. One type of requirement is the
ability to correct specific types of errors. For example, when signals are sent over radio
channels, including those from deep space, interference can produce errors in a run of
bits. Similarly, damage to storage media, such as a compact disk, can produce errors
that come in clusters. Some of the codes designed to correct errors of these types, known
as burst errors, are Reed-Solomon codes (§14.3.7), interleaved Reed-Solomon codes (see
[Vava89] for more information), and Fire codes (see [Bl83]).

Examples: 

1. The code produced by adding a parity check bit to each bit string of length n can
detect a single error. (It detects any odd number of errors, but not an even number of
errors; no error correction is possible using this code.) For example, suppose the bit
string 0111 is received, where the codeword sent is a bit string of length three with a
parity check bit added. Since 0111 contains three 1s, it cannot be a codeword. Hence,
an error was made in transmission. This error cannot be corrected. To see this, note
that if exactly one bit error was made in the transmission, any of the codewords 0110,
0101, 0011, and 1111 could have been sent.

2. C = {0100011, 1010101, 1101111} is a [7, 3, 3]-code over the binary alphabet. The
information rate of C is $R = \frac{\log_2 3}{7} \approx 0.226$.

3. The binary repetition code of length n is the code C = {00...0, 11...1}. The code
has distance n, and so can correct $\lfloor (n-1)/2 \rfloor$ errors. If used only for error detection, then C
can detect $n - 1$ errors. Although the error-correcting capabilities of C are very good,
its information rate $R = \frac{1}{n}$ is very poor.
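The parameters quoted in Examples 2 and 3 can be recomputed from the definitions. The sketch below represents codewords as strings; the helper names are assumptions made for illustration.

```python
from itertools import combinations
from math import log

def hamming_distance(x, y):
    """Number of coordinate positions in which x and y differ."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))

def code_distance(code):
    """Smallest Hamming distance over all pairs of distinct codewords."""
    return min(hamming_distance(x, y) for x, y in combinations(code, 2))

def information_rate(code, q=2):
    """R = log_q(M) / n for an [n, M]-code over an alphabet of size q."""
    n = len(code[0])
    return log(len(code), q) / n

C = ["0100011", "1010101", "1101111"]      # Example 2
d = code_distance(C)
print(d, d - 1, (d - 1) // 2)              # 3 2 1: distance, errors detected, errors corrected
print(round(information_rate(C), 3))       # 0.226

rep5 = ["00000", "11111"]                  # repetition code of length 5 (Example 3)
print(code_distance(rep5), information_rate(rep5))
```

The last line illustrates the trade-off in Example 3: the repetition code of length 5 has distance 5 but rate only 1/5.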


14.2.2 MAXIMUM LIKELIHOOD DECODING 
Definitions: 

Suppose C is an [n, M, d]-code. Different decoding schemes can be used to recover a 
codeword from a transmitted bit string received with possible errors. These schemes 
include the following: 

• Minimum Error Probability Decoding (MED): If an n-tuple r is received, 
then correct r to a codeword c ∈ C for which the conditional probability P(c is sent 
| r is received) is largest. 

• Incomplete Maximum Likelihood Decoding (IMLD): If an n-tuple r is 
received, and there is a unique codeword c ∈ C such that d(r, c) is a minimum, 
then correct r to c. If no such c exists, then report that errors have been 
detected, but no correction is possible. 

• Complete Maximum Likelihood Decoding (CMLD): If an n-tuple r is 
received, and there is a unique codeword c ∈ C such that d(r, c) is a minimum, 
then correct r to c. Otherwise, arbitrarily select one of the codewords c ∈ C 
that is closest to r, and correct r to c. 
Facts: 

1. For any fixed probability distribution of the source messages, the probability of a 
decoding error, given that an n-tuple r is received, is minimized by MED among all 
decoding schemes. 

2. MED has the disadvantage that the decoding algorithm depends on the probability 
distribution of the source messages. The decoding strategy that is used in practice is 
CMLD. 

3. Suppose that the probability that a symbol is transmitted incorrectly in a q-ary 
symmetric channel is p, where $0 < p < \frac{q-1}{q}$. Let r be a received word and $c_1, c_2 \in C$ 
with $d(c_1, r) = d_1$ and $d(c_2, r) = d_2$. Let P(r | c) denote the probability that r is 
received, given that c was sent. Then $P(r \mid c_1) < P(r \mid c_2)$ if and only if $d_1 > d_2$. 

4. CMLD chooses a codeword c ∈ C for which the conditional probability P(r is received | 
c is sent) is largest. 

5. If all source messages are equally likely, then CMLD performs in exactly the same 
way as MED. 
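
The two maximum likelihood schemes differ only in how they treat ties. A minimal Python sketch, assuming words are represented as strings; the function names are illustrative, and the tie-breaking rule in `cmld` (take the first nearest codeword) is just one arbitrary choice:

```python
def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def nearest_codewords(code, r):
    """All codewords at minimum Hamming distance from the received word r."""
    best = min(hamming_distance(c, r) for c in code)
    return [c for c in code if hamming_distance(c, r) == best]

def imld(code, r):
    """Incomplete ML decoding: the unique nearest codeword, or None on a tie."""
    nearest = nearest_codewords(code, r)
    return nearest[0] if len(nearest) == 1 else None

def cmld(code, r):
    """Complete ML decoding: on a tie, pick one nearest codeword (here the first)."""
    return nearest_codewords(code, r)[0]

# Length-4 even-parity code (three information bits plus a parity check bit).
parity_code = [format(w, "04b") for w in range(16) if format(w, "04b").count("1") % 2 == 0]
repetition_code = ["000", "111"]
```

For the received word 0111, `imld(parity_code, "0111")` returns `None` because four codewords are at distance 1, whereas `cmld` returns one of them.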


14.2.3 THE NOISY CHANNEL CODING THEOREM 
Definitions: 

Let C be an [n, M]-code, each word occurring with equal probability. Let $r_i$ be the 
probability of making an incorrect decision using complete maximum likelihood decoding 
given that the ith codeword was transmitted. The error probability of the code C is 
$P_C = \frac{1}{M} \sum_{i=1}^{M} r_i$. 

Let parameters n and M be fixed. Define $P^*(n, M, p)$ to be the smallest error 
probability $P_C$ of any [n, M]-code using a BSC with symbol error probability p. 


Facts: 

1. Shannon’s noisy channel coding theorem: Let C(p) denote the capacity of a BSC 
with symbol error probability p, and define the quantity $M_n = 2^{\lfloor Rn \rfloor}$. If 0 < R < C(p), 
then $P^*(n, M_n, p) \to 0$ as $n \to \infty$. 

2. By Shannon’s noisy channel coding theorem, arbitrarily reliable communication with 
a fixed information rate is possible on a channel provided that the information rate is less 
than the channel capacity. Unfortunately, all proofs of the theorem are non-constructive 
and hence do not specify how to construct such codes. Moreover, the good codes promised 
by the theorem may have very large word lengths. 


14.3 LINEAR CODES 

Linear codes are an important type of codes with a particular type of structure. In 
particular, a linear code is a code that is a subspace of a finite-dimensional vector space 
over a finite field. The main advantages of using linear codes arise from the efficient 
procedures for correcting errors. These procedures are based on matrix computations 
that can be carried out easily and rapidly. 
14.3.1 INTRODUCTION 


Definitions: 

Let $F_q^n$ denote the vector space of all n-tuples having components from the finite field $F_q$ 
(§5.6.3). The elements of $F_q^n$ are called vectors or words. 

An (n, k)-linear code C over $F_q$ is a k-dimensional subspace of $F_q^n$ over $F_q$. More 
precisely, C is a linear block code, but the qualification “block” is generally omitted. 
The code C is referred to as an (n, k, d)-code, where n is the length of the code, k is 
the dimension of the subspace, and d is the distance. 

The Hamming weight of a word $v \in F_q^n$ is the number of nonzero coordinates in v. 

Let C be an (n, k)-code over $F_q$. A generator matrix G for C is a k × n matrix with 
entries from $F_q$ whose rows form a basis for C. 

If an (n, k)-code C has a generator matrix of the form $G = [I_k \mid A]$, then C is called a 
systematic code, and the generator matrix G is said to be in standard form. 

Let $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ be two vectors in $F_q^n$. The inner 
product of x and y is the field element $x \cdot y = \sum_{i=1}^{n} x_i y_i$. If $x \cdot y = 0$, x and y are 
orthogonal. 

Let C be an (n, k)-code over $F_q$. The orthogonal complement of C, denoted $C^\perp$ 
(read “C perp”), is the set of vectors orthogonal to every vector in C: 

$C^\perp = \{ x \in F_q^n \mid x \cdot y = 0 \text{ for all } y \in C \}$. 

$C^\perp$ is usually called the dual code of C. 

A parity check matrix for an (n, k)-code C is a generator matrix for $C^\perp$. 

A linear code C is self-orthogonal if $C \subseteq C^\perp$. It is self-dual if $C = C^\perp$. 

Facts: 

1. Round parentheses (used to denote an (n, k)-code or an (n, k, d)-code) denote that a 
code is linear, while square brackets (used to denote an [n, M]-code or an [n, M, d]-code 
as defined in §14.2.1) are used for all codes, linear or not. 

2. An (n, k)-code over $F_q$, the finite field of q elements, is an $[n, q^k]$-block code. 

3. The information rate of an (n, k)-code is $R = \frac{k}{n}$. 

4. The distance of a linear code C is the minimum Hamming weight of a nonzero vector 
in C. 

5. A linear code is often described by its generator matrix. 

6. A linear code can have many different generator matrices. 

7. If G is a generator matrix for a code, then any matrix obtained from G by applying 
a sequence of elementary row operations is also a generator matrix for that code. 

8. Let C be an (n, k)-code over $F_q$. Then there exists an equivalent code C′ with 
generator matrix $[I_k \mid A]$, where $I_k$ is the k × k identity matrix, and A is a k × (n − k) 
matrix with entries from $F_q$. 

9. If G is a generator matrix for an (n, k)-code C, then $C = \{ mG \mid m \in F_q^k \}$. The 
source messages can be taken to be the elements of $F_q^k$, and hence encoding is simply 
multiplication by G. Systematic codes are advantageous because if G is in standard 
form and c = mG is the codeword corresponding to a message m, then the first k 
components of c are identical to m. 

Algorithm 1: Constructing a parity check matrix H from a generator 
matrix G. 

G′ := the reduced row echelon form of G {use elementary row operations} 

A := the k × (n − k) matrix obtained from G′ by deleting the leading columns 
of G′ 

H := the (n − k) × n matrix obtained by placing, in order, the rows of −A in 
the columns of H which correspond to the leading columns of G′, and 
placing in the remaining n − k columns of H, in order, the columns of the 
(n − k) × (n − k) identity matrix $I_{n-k}$ 


10. If C is an (n, k)-code over $F_q$, then $C^\perp$ is an (n, n − k)-code over $F_q$. 

11. If C is an (n, k)-code over $F_q$, then the dual code of $C^\perp$ is C itself. 

12. An interesting and useful way to describe an (n, k)-code C is in terms of $C^\perp$. 

13 . There are many important special types and families of linear codes, including 
Hamming codes (§14.3.4), Golay codes (§14.4.2), Reed-Muller codes (see Chapter 4 in 
[Vava89] for details) and cyclic codes (§14.3.5). Among cyclic codes, BCH codes form 
an important class (§14.3.6) and among BCH codes there is an important class of codes 
known as Reed-Solomon codes (§14.3.7). 

14 . Reed-Muller codes were used by the Mariner 9 spacecraft on its mission to Mars. 
A Golay code was used by the Voyager 2 on its mission to Jupiter and Saturn. A Reed- 
Solomon code was used by the Voyager 2 on its mission to Uranus. (See [Vava89] for 
more details on these applications.) 

15 . Algorithm 1 uses linear algebra to construct a parity check matrix for a linear code 
from a generator matrix. 

16. Parity check matrices: Let C be an (n, k)-code over $F_q$ with a generator matrix G, 
and let H be a parity check matrix for C. 

• A vector $x \in F_q^n$ belongs to C if and only if $xH^T = 0$; it follows that $GH^T = 0$. 

• If $G = [I_k \mid A]$ is a generator matrix for C, then $H = [-A^T \mid I_{n-k}]$ is a parity 
check matrix for C. 

• C has distance at least s if and only if every set of s − 1 columns of H is linearly 
independent over $F_q$; in other words, the distance of C is equal to the smallest 
number of columns of H that are linearly dependent over $F_q$. 

17. Let C be an (n, k)-code with generator matrix G. C is self-orthogonal if and only 
if $GG^T = 0$. 

18. Let C be an (n, k)-code with generator matrix G. C is self-dual if and only if it is 
self-orthogonal and $k = \frac{n}{2}$ (and hence n is even). 

Examples: 

1 . Let C be a binary (7,4)-code with generator matrix 

0 1 0 1 0 1\ 

10 0 10 11 
0 1 0 0 1 1 I ' 

110111/ 
Elementary row operations yield the reduced row echelon form of G: 

$$G' = \begin{pmatrix} 1&1&0&0&0&1&0 \\ 0&0&1&0&0&1&0 \\ 0&0&0&0&1&1&0 \\ 0&0&0&0&0&0&1 \end{pmatrix}.$$ 

The leading columns of G′ are columns 1, 3, 5, and 7, and 

$$A = \begin{pmatrix} 1&0&1 \\ 0&0&1 \\ 0&0&1 \\ 0&0&0 \end{pmatrix}.$$ 

Hence, the following parity check matrix is obtained: 

$$H = \begin{pmatrix} 1&1&0&0&0&0&0 \\ 0&0&0&1&0&0&0 \\ 1&0&1&0&1&1&0 \end{pmatrix}.$$ 

2. The extended Hamming code of order 3 is a binary (8, 4, 4)-code with generator 
matrix 

$$G = \begin{pmatrix} 1&0&0&0&1&1&0&1 \\ 0&1&0&0&1&0&1&1 \\ 0&0&1&0&0&1&1&1 \\ 0&0&0&1&1&1&1&0 \end{pmatrix}.$$ 

The code is self-dual since $GG^T = 0$. 
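
Algorithm 1 can be sketched in Python for the binary case, where −A = A. This is an illustrative implementation under that assumption, not the Handbook's own code; the function names are hypothetical, and matrices are lists of rows with entries 0 or 1.

```python
def rref_gf2(rows):
    """Reduced row echelon form over GF(2); returns (nonzero rows, leading column indices)."""
    m = [row[:] for row in rows]
    lead_cols = []
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c]), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][c]:
                m[i] = [a ^ b for a, b in zip(m[i], m[r])]   # clear column c elsewhere
        lead_cols.append(c)
        r += 1
        if r == len(m):
            break
    return m[:r], lead_cols

def parity_check_matrix(G):
    """Algorithm 1 over GF(2): build H from the reduced row echelon form of G."""
    Gp, lead = rref_gf2(G)
    n, k = len(G[0]), len(Gp)
    free = [c for c in range(n) if c not in lead]
    H = [[0] * n for _ in range(n - k)]
    for j, col in enumerate(free):        # identity columns in the non-leading positions
        H[j][col] = 1
    for i, col in enumerate(lead):        # row i of A becomes the column at the ith leading position
        for j, fc in enumerate(free):
            H[j][col] = Gp[i][fc]
    return H

def mul_transpose_gf2(A, B):
    """The product A * B^T over GF(2)."""
    return [[sum(a * b for a, b in zip(ra, rb)) % 2 for rb in B] for ra in A]

# Reduced row echelon form G' from Example 1, and the generator matrix of Example 2.
G_prime = [[1,1,0,0,0,1,0], [0,0,1,0,0,1,0], [0,0,0,0,1,1,0], [0,0,0,0,0,0,1]]
G_ext = [[1,0,0,0,1,1,0,1], [0,1,0,0,1,0,1,1], [0,0,1,0,0,1,1,1], [0,0,0,1,1,1,1,0]]
H1 = parity_check_matrix(G_prime)
```

Applied to G′ this reproduces the matrix H of Example 1, and `mul_transpose_gf2(G_ext, G_ext)` is the zero matrix, which is the self-duality check of Example 2.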









14.3.2 SYNDROME DECODING 

Syndrome decoding is a general decoding technique for linear codes that is useful if the 
information rate of the code is high. Let C be an (n, k, d)-code over $F_q$ with parity 
check matrix H. 

Definitions: 

For any $x \in F_q^n$, the coset of C determined by x is the set $C + x = \{ c + x \mid c \in C \}$. 

For any $x \in F_q^n$, the syndrome of x is the vector $xH^T$. 

A coset leader of a coset of C is one of the coset members of smallest weight. 

Facts: 

1. The coset determined by 0 is C. 

2. For all $x \in F_q^n$, $x \in C + x$. 

3. For all $x, y \in F_q^n$, if $y \in C + x$, then C + y = C + x; that is, each word in a coset 
determines that coset. 

4. The cosets of C partition $F_q^n$ into $q^{n-k}$ cosets, each of size $q^k$. 

5. A syndrome is a vector of length n − k. 

6. Two vectors $x_1, x_2 \in F_q^n$ are in the same coset of C if and only if they have the 
same syndrome, that is, $x_1 H^T = x_2 H^T$. 

7. A vector $x \in F_q^n$ is a codeword if and only if its syndrome is 0. 

Algorithm 2: Syndrome decoding for linear codes. 

precomputation: set up a one-to-one correspondence between coset leaders and 
syndromes; let r be a received word and H the parity check matrix. 

compute the syndrome $s = rH^T$ of r 
find the coset leader e associated with s 
correct r to r − e 


8. Suppose that a codeword c is transmitted and r is received. If e = r — c, then 
rH T = eH T , which means that the error-vector is in the same coset as the received 
word. By maximum likelihood decoding, the decoder should choose a vector of smallest 
weight in this coset as the error vector. 

9. The fact that there is a one-to-one correspondence between syndromes and coset 
leaders leads to syndrome decoding , a decoding algorithm for linear codes, which is 
described as Algorithm 2. 


Example: 


1. Consider the binary (5, 2)-code C with generator matrix 

$$G = \begin{pmatrix} 1&0&0&0&1 \\ 0&1&1&1&1 \end{pmatrix}$$ 

and parity check matrix 

$$H = \begin{pmatrix} 0&1&1&0&0 \\ 0&1&0&1&0 \\ 1&1&0&0&1 \end{pmatrix}.$$ 

The 8 cosets of C are 

{ 00000, 10001, 01111, 11110 }    { 10000, 00001, 11111, 01110 } 
{ 01000, 11001, 00111, 10110 }    { 00100, 10101, 01011, 11010 } 
{ 00010, 10011, 01101, 11100 }    { 11000, 01001, 10111, 00110 } 
{ 10100, 00101, 11011, 01010 }    { 01100, 11101, 00011, 10010 } 

The following is a list of coset leaders and their syndromes: 

coset leader   00000   10000   01000   00100   00010   11000   10100   01100 
syndrome       000     001     111     100     010     110     101     011 

If the word r = 01101 is received, compute the syndrome $01101 \cdot H^T = 010$, which 
corresponds to the coset leader e = 00010. Hence, r is corrected to r − e = 01111. 
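
Algorithm 2 can be sketched in Python for this binary (5, 2)-code. The sketch assumes the parity check matrix whose columns give the syndromes in the table above; the function names are illustrative. When a coset contains several words of minimum weight, the table below keeps whichever is enumerated first, which may differ from the leader listed above (for syndrome 001, both 10000 and 00001 are valid leaders).

```python
from itertools import product

# Parity check matrix of the binary (5, 2)-code in the example.
H = [[0, 1, 1, 0, 0],
     [0, 1, 0, 1, 0],
     [1, 1, 0, 0, 1]]

def syndrome(x, H):
    """The syndrome x * H^T over GF(2), as a tuple."""
    return tuple(sum(a * b for a, b in zip(x, row)) % 2 for row in H)

def coset_leader_table(H):
    """Map each syndrome to a minimum-weight word having that syndrome."""
    n = len(H[0])
    table = {}
    # Visit words by increasing weight, so the first word seen per syndrome is a leader.
    for x in sorted(product((0, 1), repeat=n), key=sum):
        table.setdefault(syndrome(x, H), x)
    return table

def syndrome_decode(r, H, table):
    """Correct r to r - e, where e is the coset leader for the syndrome of r."""
    e = table[syndrome(r, H)]
    return tuple((a - b) % 2 for a, b in zip(r, e))

table = coset_leader_table(H)
decoded = syndrome_decode((0, 1, 1, 0, 1), H, table)
```

The precomputation enumerates all $2^n$ words, so this approach is only practical for short codes.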


14.3.3 CONSTRUCTING NEW CODES FROM OLD 

There are several methods for modifying a linear code to produce a new linear code. 
Some of these methods are extending a code, puncturing a code, and shortening a code. 

Definitions: 

If C is a linear code of length n over the field $F_q$, then the extended code $\hat{C}$ of C 
is $\hat{C} = \{ (c_1, c_2, \ldots, c_n, c_{n+1}) \mid (c_1, c_2, \ldots, c_n) \in C, \ \sum_{i=1}^{n+1} c_i = 0 \}$. The symbol $c_{n+1}$ is 
called the overall parity check symbol. 

If C is a linear code over $F_q$, the code obtained by removing any column of a generator 
matrix of C is called a punctured code of C, denoted C*. 

If C is a linear code of length n, a shortened code C′ of C is a linear code of length 
n − 1 which equals the set of all codewords in C having 0 in a fixed coordinate position, 
with that position deleted. 


Facts: 


1. If C is an (n, k, d)-code over $F_q$ with generator matrix G and parity check matrix H, 
then: 

• $\hat{C}$ is an $(n + 1, k, \hat{d})$-code over $F_q$; 

• if C is a binary code, then $\hat{d} = \begin{cases} d, & \text{if } d \text{ is even} \\ d + 1, & \text{if } d \text{ is odd;} \end{cases}$ 

• a generator matrix for $\hat{C}$ is $\hat{G}$, which is obtained by adding a column to G in such 
a way that the sum of the elements of each row of $\hat{G}$ is 0; 

• a parity check matrix for $\hat{C}$ is $\hat{H}$, where 

$$\hat{H} = \begin{pmatrix} 1 & 1 & \cdots & 1 & 1 \\ & & & & 0 \\ & H & & & \vdots \\ & & & & 0 \end{pmatrix}.$$ 

2. Puncturing a code is the reverse process to extending a code. 

3. If C is an (n, k, d)-code over $F_q$, then C* is a linear code over $F_q$ of length n − 1, 
dimension k or k − 1, and distance d or d − 1. 

4. If C is an (n, k, d)-code over $F_q$, k ≥ 2, and C has at least one codeword for which 
the deleted position has a nonzero entry, then C′ is an (n − 1, k − 1, d′)-code over $F_q$, 
with d′ ≥ d. 


14.3.4 HAMMING CODES 
Definition: 

A Hamming code of order r over $F_q$, denoted $H_r(q)$, is an (n, k)-code where $n = \frac{q^r - 1}{q - 1}$ 
and k = n − r, with a parity check matrix whose columns are nonzero and such that no 
two columns are scalar multiples of each other. 

Facts: 

1. A decoding algorithm for Hamming codes is shown in Algorithm 3. 

2. In the binary case (q = 2), the Hamming code $H_r(2)$ has a parity check matrix 
whose columns consist of all nonzero binary vectors of length r, each used exactly once. 

3. $H_r(q)$ has distance 3, and so is a 1-error correcting code. 

4. Any two binary Hamming codes of order r are equivalent. 

5. $H_r(q)$ is a perfect code (§14.4.2). 

Example: 

1. Consider $H_3(2)$, the binary Hamming code of order 3. The code has length n = 7 
and dimension k = 4, and a parity check matrix is 

$$H = \begin{pmatrix} 1&0&0&1&1&0&1 \\ 0&1&0&1&0&1&1 \\ 0&0&1&0&1&1&1 \end{pmatrix}.$$ 

If the received word is r = 1011101, compute the syndrome $s = 1011101 \cdot H^T = 001$, 
whose transpose is the third column of H. Hence e = 0010000, and r is corrected to 1001101. 

Algorithm 3: Decoding algorithm for Hamming codes. 

H := a parity check matrix for a Hamming code $H_r(q)$ 
r := a received word 
compute the syndrome $s = rH^T$ of r 
if s = 0 then accept r as the transmitted word 
else 
    compare $s^T$ with the columns of H 
    if $s^T = a h_i$ (where $h_i$ is the ith column of H and $a \in F_q$) then 
        the error vector e is the vector with a in position i and 0s elsewhere 
        correct r to c = r − e 
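
For binary Hamming codes the only nonzero scalar is 1, so Algorithm 3 reduces to locating the syndrome among the columns of H and flipping that bit. A minimal Python sketch for $H_3(2)$ with the parity check matrix of the example (the function name is an illustrative assumption):

```python
# Parity check matrix of the binary Hamming code H_3(2) from the example.
H = [[1, 0, 0, 1, 1, 0, 1],
     [0, 1, 0, 1, 0, 1, 1],
     [0, 0, 1, 0, 1, 1, 1]]

def hamming_decode(r, H):
    """Binary case of Algorithm 3: correct at most one bit error in r."""
    s = [sum(a * b for a, b in zip(r, row)) % 2 for row in H]
    if not any(s):
        return list(r)                    # zero syndrome: accept r as sent
    columns = [list(col) for col in zip(*H)]
    i = columns.index(s)                  # the column equal to the syndrome locates the error
    c = list(r)
    c[i] ^= 1                             # subtract the error vector (flip bit i)
    return c

corrected = hamming_decode([1, 0, 1, 1, 1, 0, 1], H)
```

Over a larger field $F_q$ the search would also have to try each nonzero scalar multiple of every column, as in the general statement of the algorithm.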


14.3.5 CYCLIC CODES 
Definitions: 

A linear code C of length n is cyclic if whenever $(a_0, a_1, a_2, \ldots, a_{n-1})$ is a codeword 
in C, then the cyclic shift $(a_{n-1}, a_0, a_1, \ldots, a_{n-2})$ is also a codeword in C. 

Let g(x) be a polynomial in $F_q[x]/(x^n - 1)$. The ideal generated by g(x), namely 
$\{ a(x)g(x) \mid a(x) \in F_q[x]/(x^n - 1) \}$, is called the code generated by g(x), and 
denoted $\langle g(x) \rangle$. 

Let C be a nonzero cyclic code in $F_q[x]/(x^n - 1)$. A monic polynomial g(x) of least 
degree in C is called a generator polynomial of C. 

The polynomial $h(x) = \frac{x^n - 1}{g(x)}$ is called the check polynomial of C. 

Let H be a parity check matrix for a cyclic code. If r is a received word, the syndrome 
polynomial of r is the polynomial s(x) corresponding to the syndrome $s = rH^T$. 


Facts: 

1. The study of cyclic codes is facilitated by the attachment of some additional algebraic 
structure to the vector space $F_q^n$. 

2. If the vector $(a_0, a_1, a_2, \ldots, a_{n-1})$ in $F_q^n$ is identified with the polynomial 
$a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1}$, then: 

• the ring $F_q[x]/(x^n - 1)$ can be viewed as a vector space over $F_q$; 

• the vector spaces $F_q^n$ and $F_q[x]/(x^n - 1)$ are isomorphic; 

• multiplication of a polynomial in $F_q[x]/(x^n - 1)$ by x corresponds to a cyclic shift 
of the corresponding vector; 

• a linear code C in the vector space $F_q^n$ is cyclic if and only if C is an ideal in the 
ring $F_q[x]/(x^n - 1)$. 

3. An ideal may contain many elements which will generate the ideal. One of these 
generators is singled out as the generator. 


© 2000 by CRC Press LLC 




4. If g(x) is a generator polynomial of a cyclic code C, then g(x) generates C; that is, 
$\langle g(x) \rangle = C$. 

5. The following are consequences of the fact that the ring $F_q[x]/(x^n - 1)$ is a principal 
ideal ring, that is, every ideal is generated by a single element (§5.4.5). Here C is a 
nonzero cyclic code in $F_q[x]/(x^n - 1)$ with generator polynomial g(x). 

• the generator polynomial of C is unique; 

• g(x) divides $x^n - 1$ in $F_q[x]$; 

• if the degree of g(x) is n − k, that is, $g(x) = g_0 + g_1 x + g_2 x^2 + \cdots + g_{n-k} x^{n-k}$ (and 
$g_{n-k} = 1$), then a basis for C is $\{ g(x), xg(x), x^2 g(x), \ldots, x^{k-1} g(x) \}$; hence C 
has dimension k and a generator matrix for C is 

$$\begin{pmatrix} g_0 & g_1 & g_2 & \cdots & g_{n-k} & 0 & 0 & \cdots & 0 \\ 0 & g_0 & g_1 & \cdots & g_{n-k-1} & g_{n-k} & 0 & \cdots & 0 \\ 0 & 0 & g_0 & \cdots & g_{n-k-2} & g_{n-k-1} & g_{n-k} & \cdots & 0 \\ \vdots & & & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & g_0 & \cdots & \cdots & \cdots & g_{n-k} \end{pmatrix}.$$ 

6. Any c(x) ∈ C can be written uniquely as c(x) = f(x)g(x) in the ring $F_q[x]$, where 
$f(x) \in F_q[x]$ has degree less than k. Hence, encoding a message polynomial f(x) consists 
simply of polynomial multiplication by g(x). 

7. The dual code of a cyclic code is also cyclic. 

8. Let $h(x) = h_0 + h_1 x + h_2 x^2 + \cdots + h_k x^k = \frac{x^n - 1}{g(x)}$ in $F_q[x]$. Then the reciprocal 
polynomial $h^*(x) = x^k h(1/x)$ of h(x) is a generator of $C^\perp$. (In fact, $\frac{1}{h_0} h^*(x)$ is the 
generator polynomial of $C^\perp$.) Hence, a parity check matrix for C is 

$$\begin{pmatrix} h_k & h_{k-1} & h_{k-2} & \cdots & h_0 & 0 & 0 & \cdots & 0 \\ 0 & h_k & h_{k-1} & \cdots & h_1 & h_0 & 0 & \cdots & 0 \\ 0 & 0 & h_k & \cdots & h_2 & h_1 & h_0 & \cdots & 0 \\ \vdots & & & & & & & & \vdots \\ 0 & 0 & \cdots & 0 & h_k & \cdots & \cdots & \cdots & h_0 \end{pmatrix}.$$ 

9. A cyclic code of length n over $F_q$ is characterized by its generator polynomial. 

10. There is a one-to-one correspondence between cyclic codes in $F_q^n$ and monic 
polynomials in $F_q[x]$ which divide $x^n - 1$. 

11. Table 1 gives the complete factorization of $x^n - 1$ over $F_2$ for some small values of 
odd n. 

12. If C is an (n, k)-cyclic code generated by g(x), then another parity check matrix 
for C is the matrix H whose ith column is $x^i \bmod g(x)$, for i = 0, 1, ..., n − 1. 

13. If r(x) is the polynomial corresponding to the received word r, then the syndrome 
polynomial of r is simply $s(x) = r(x) \bmod g(x)$. 


Example: 

1. Over $F_2$, the factorization of $x^7 - 1$ is $x^7 - 1 = (1 + x)(1 + x + x^3)(1 + x^2 + x^3)$. 
The monic divisors of $x^7 - 1$ are: 
Table 1 Factorization of $x^n - 1$ over $F_2$, n odd, 1 ≤ n ≤ 31. 

n     factorization of $x^n - 1$ over $F_2$ 
1     $1 + x$ 
3     $(1 + x)(1 + x + x^2)$ 
5     $(1 + x)(1 + x + x^2 + x^3 + x^4)$ 
7     $(1 + x)(1 + x + x^3)(1 + x^2 + x^3)$ 
9     $(1 + x)(1 + x + x^2)(1 + x^3 + x^6)$ 
11    $(1 + x)(1 + x + x^2 + \cdots + x^{10})$ 
13    $(1 + x)(1 + x + x^2 + \cdots + x^{12})$ 
15    $(1 + x)(1 + x + x^2)(1 + x + x^2 + x^3 + x^4)(1 + x + x^4)(1 + x^3 + x^4)$ 
17    $(1 + x)(1 + x + x^2 + x^4 + x^6 + x^7 + x^8)(1 + x^3 + x^4 + x^5 + x^8)$ 
19    $(1 + x)(1 + x + x^2 + \cdots + x^{18})$ 
21    $(1 + x)(1 + x + x^2)(1 + x^2 + x^3)(1 + x + x^3)(1 + x^2 + x^4 + x^5 + x^6)(1 + x + x^2 + x^4 + x^6)$ 
23    $(1 + x)(1 + x + x^5 + x^6 + x^7 + x^9 + x^{11})(1 + x^2 + x^4 + x^5 + x^6 + x^{10} + x^{11})$ 
25    $(1 + x)(1 + x + x^2 + x^3 + x^4)(1 + x^5 + x^{10} + x^{15} + x^{20})$ 
27    $(1 + x)(1 + x + x^2)(1 + x^3 + x^6)(1 + x^9 + x^{18})$ 
29    $(1 + x)(1 + x + x^2 + \cdots + x^{28})$ 
31    $(1 + x)(1 + x^2 + x^5)(1 + x^3 + x^5)(1 + x + x^2 + x^3 + x^5)(1 + x + x^2 + x^4 + x^5)(1 + x + x^3 + x^4 + x^5)(1 + x^2 + x^3 + x^4 + x^5)$ 


$g_1(x) = 1$ 
$g_2(x) = 1 + x$ 
$g_3(x) = 1 + x + x^3$ 
$g_4(x) = 1 + x^2 + x^3$ 
$g_5(x) = (1 + x)(1 + x + x^3) = 1 + x^2 + x^3 + x^4$ 
$g_6(x) = (1 + x)(1 + x^2 + x^3) = 1 + x + x^2 + x^4$ 
$g_7(x) = (1 + x + x^3)(1 + x^2 + x^3) = 1 + x + x^2 + x^3 + x^4 + x^5 + x^6$ 
$g_8(x) = 1 + x^7$ 

The polynomial $g_5(x)$ generates the binary (7, 3)-cyclic code 

C = {0000000, 1011100, 0101110, 0010111, 1001011, 1100101, 1110010, 0111001}. 

A generator matrix for C is 

$$G = \begin{pmatrix} 1&0&1&1&1&0&0 \\ 0&1&0&1&1&1&0 \\ 0&0&1&0&1&1&1 \end{pmatrix}.$$ 

A parity check matrix for C is 

$$H = \begin{pmatrix} 1&1&0&1&0&0&0 \\ 0&1&1&0&1&0&0 \\ 0&0&1&1&0&1&0 \\ 0&0&0&1&1&0&1 \end{pmatrix}.$$ 
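
The codewords of this example can be generated by multiplying each message polynomial of degree less than k by $g_5(x)$. The following Python sketch does this for binary cyclic codes; polynomials are coefficient lists with the lowest degree first, and the function names are illustrative assumptions.

```python
def poly_mul_mod(a, b, n):
    """Product of two binary polynomials modulo x^n - 1 (coefficients, lowest degree first)."""
    out = [0] * n
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                if bj:
                    out[(i + j) % n] ^= 1   # x^n wraps around to 1
    return out

def cyclic_code(g, n):
    """All codewords f(x)g(x) with deg f < k = n - deg g, as bit strings a_0 a_1 ... a_{n-1}."""
    k = n - (len(g) - 1)
    words = set()
    for m in range(2 ** k):
        f = [(m >> i) & 1 for i in range(k)]
        words.add("".join(map(str, poly_mul_mod(f, g, n))))
    return words

g5 = [1, 0, 1, 1, 1]        # 1 + x^2 + x^3 + x^4
h = [1, 0, 1, 1]            # check polynomial (x^7 - 1) / g5(x) = 1 + x^2 + x^3
C = cyclic_code(g5, 7)
```

Since $g_5(x)h(x) = x^7 - 1$, the product `poly_mul_mod(g5, h, 7)` is the zero polynomial in $F_2[x]/(x^7 - 1)$.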







Algorithm 4: Decoding algorithm for BCH codes. 

Suppose a codeword c is transmitted and r is received. 

Compute $S_j = r(\beta^{a+j})$ for j = 0, 1, ..., δ − 2, and form the polynomial 
$S(z) = \sum_{j=0}^{\delta-2} S_j z^j$ 

Use the extended Euclidean algorithm to calculate the greatest common divisor 
of S(z) and $z^{\delta-1}$ in the ring $F_{q^m}[z]$; stop as soon as the remainder $r_i(z)$ 
has degree $< \frac{\delta-1}{2}$; this yields polynomials $s_i(z)$ and $t_i(z)$ such that 
$s_i(z) z^{\delta-1} + t_i(z) S(z) = r_i(z)$; $\sigma(z) := t_i(z)$; $\omega(z) := r_i(z)$ 

Find B, the set of roots of σ(z) in $F_{q^m}$ {the roots will actually lie in the 
subgroup of $F_{q^m}^*$ generated by β} 

For each γ ∈ B, set $E_\gamma = -\gamma^{a-1} \frac{\omega(\gamma)}{\sigma'(\gamma)}$, where σ′(z) denotes the formal derivative 
of σ(z) 

The error vector is $e = (e_0, e_1, \ldots, e_{n-1})$, where 
$e_t = \begin{cases} 0, & \text{if } \beta^{-t} \notin B \\ E_\gamma, & \text{if } \beta^{-t} = \gamma \in B \end{cases}$ 

decode r to r − e 

{it is assumed that the number of errors is at most $\lfloor (\delta-1)/2 \rfloor$; if the number of errors is 
such, then the decoding is correct} 

{there are more efficient ways of obtaining σ(z) and ω(z) than by using the 
Euclidean algorithm, for example by using the Berlekamp-Massey algorithm 
(see [MevaVa96])} 


14.3.6 BCH CODES 
Definitions: 

Let β be a primitive nth root of unity in an extension field of $F_q$. Let g(x) be the least 
common multiple of the minimal polynomials over $F_q$ of $\beta^a, \beta^{a+1}, \ldots, \beta^{a+\delta-2}$, where a is 
an integer. The cyclic code of length n over $F_q$ with generator polynomial g(x) is called a 
BCH code (after its discoverers: R. C. Bose, D. Ray-Chaudhuri, and A. Hocquenghem) 
with designed distance δ. 

If a = 1 in the definition of a BCH code, the code is called narrow-sense. If $n = q^m - 1$ 
for some positive integer m (that is, β is primitive in $F_{q^m}$), the code is primitive. 


Facts: 

1. BCH codes are special types of cyclic codes, discovered by A. Hocquenghem in 1959 
and independently by R. C. Bose and D. K. Ray-Chaudhuri in 1960. 

2. BCH bound: Let C be a BCH code over $F_q$ with designed distance δ. Then C has 
distance at least δ. 

3. Algorithm 4 is one method for decoding BCH codes. In the algorithm, g(x) is 
a generator polynomial for a BCH code over $F_q$ of designed distance δ and length n. 
Hence $g(x) = \mathrm{lcm}\{ m_i(x) \mid a \le i \le a + \delta - 2 \}$, where $m_i(x)$ is the minimal polynomial 
of $\beta^i$ over $F_q$, and β is a primitive nth root of unity in an extension field $F_{q^m}$. 
Table 2 Elements of $F_{3^3}$ as powers of α, where α is a root of $f(x) = 1 + 2x^2 + x^3$. 

i    α^i                  i    α^i                  i    α^i 
0    1                    9    2 + 2α + 2α^2        18   1 + α 
1    α                    10   1 + 2α + α^2         19   α + α^2 
2    α^2                  11   2 + α                20   2 + 2α^2 
3    2 + α^2              12   2α + α^2             21   1 + 2α + 2α^2 
4    2 + 2α + α^2         13   2                    22   1 + α + α^2 
5    2 + 2α               14   2α                   23   2 + α + 2α^2 
6    2α + 2α^2            15   2α^2                 24   1 + 2α 
7    1 + α^2              16   1 + 2α^2             25   α + 2α^2 
8    2 + α + α^2          17   1 + α + 2α^2 




Examples: 

1. Consider the finite field $F_{3^3}$ generated by a root α of the primitive polynomial 
$f(x) = 1 + 2x^2 + x^3 \in F_3[x]$. A table of powers of α is given in Table 2. 

The element $\beta = \alpha^2$ is a primitive 13th root of unity in $F_{3^3}$. If $m_i(x)$ denotes the 
minimal polynomial of $\beta^i$ over $F_3$, then 

$m_0(x) = 2 + x$ 
$m_1(x) = 2 + 2x + 2x^2 + x^3$ 
$m_2(x) = 2 + 2x + x^3$ 
$m_4(x) = 2 + x + 2x^2 + x^3$ 
$m_7(x) = 2 + x^2 + x^3$. 

Since $m_1(x) = m_3(x)$, the polynomial 

$g(x) = \mathrm{lcm}(m_0(x), m_1(x), m_2(x), m_3(x)) = m_0(x) m_1(x) m_2(x)$ 

has among its roots the elements $\beta^0, \beta^1, \beta^2$, and $\beta^3$. Hence g(x) is a generator 
polynomial for a BCH code over $F_3$ of designed distance δ = 5 and length n = 13. 

2. Using the BCH code in Example 1, suppose that the decoder received the word 
r = (220 021 110 2110). The following steps follow Algorithm 4 to decode r: 

• Compute $S_0 = r(\beta^0) = 1$, $S_1 = r(\beta^1) = \alpha^{14}$, $S_2 = r(\beta^2) = \alpha^{23}$, and $S_3 = r(\beta^3) = \alpha^{16}$. 
This gives $S(z) = 1 + \alpha^{14} z + \alpha^{23} z^2 + \alpha^{16} z^3$. 

• Applying the extended Euclidean algorithm in $F_{3^3}[z]$ to S(z) and $z^4$ yields: 

i     s_i(z)              t_i(z)                          r_i(z)                                   deg r_i(z) 
−1    1                   0                               z^4                                      4 
0     0                   1                               1 + α^14 z + α^23 z^2 + α^16 z^3         3 
1     1                   α^17 + α^23 z                   α^17 + α^16 z + α^13 z^2                 2 
2     α^3 + α^16 z        α^15 + α^3 z + α^13 z^2         α^15 + α^16 z                            1 

Stop, since $\deg(r_2(z)) < \frac{\delta-1}{2} = 2$. Hence $\sigma(z) = \alpha^{15} + \alpha^3 z + \alpha^{13} z^2$ and 
$\omega(z) = \alpha^{15} + \alpha^{16} z$. 

• By trying all possibilities, find that the set of roots of σ(z) is $B = \{ \beta^5, \beta^9 \}$. 

• Compute $E_{\beta^5} = -\beta^{-5} \frac{\omega(\beta^5)}{\sigma'(\beta^5)} = 2$ and $E_{\beta^9} = -\beta^{-9} \frac{\omega(\beta^9)}{\sigma'(\beta^9)} = 2$. 

• Hence, the error vector is e = (000 020 002 0000) and the word r is decoded to 
(220 001 111 2110). 
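
The syndrome step of this example can be reproduced with a small model of $F_{3^3}$. The following Python sketch is illustrative only: it represents $c_0 + c_1\alpha + c_2\alpha^2$ as the tuple $(c_0, c_1, c_2)$, uses $\alpha^3 = 2 + \alpha^2$ (which follows from f(α) = 0), and covers only the syndrome computation, not the Euclidean or root-finding steps. All names are assumptions.

```python
def times_alpha(v):
    """Multiply c0 + c1*a + c2*a^2 by a in F_27, using a^3 = 2 + a^2."""
    c0, c1, c2 = v
    return ((2 * c2) % 3, c0, (c1 + c2) % 3)

# powers[i] is alpha^i for i = 0..25 (this rebuilds Table 2); log inverts the table.
powers = [(1, 0, 0)]
for _ in range(25):
    powers.append(times_alpha(powers[-1]))
log = {v: i for i, v in enumerate(powers)}

def evaluate(word, exponent):
    """Evaluate word(x) = sum_t word[t] x^t at x = alpha^exponent."""
    total = (0, 0, 0)
    for t, coeff in enumerate(word):
        term = powers[(exponent * t) % 26]
        total = tuple((s + coeff * c) % 3 for s, c in zip(total, term))
    return total

r = [2, 2, 0, 0, 2, 1, 1, 1, 0, 2, 1, 1, 0]     # received word
e = [0, 0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 0, 0]     # error vector found in the example
c = [(ri - ei) % 3 for ri, ei in zip(r, e)]     # decoded word r - e

# beta = alpha^2, so beta^j = alpha^(2j); with a = 0 the syndromes are r(beta^j), j = 0..3.
syndrome_logs = [log[evaluate(r, 2 * j)] for j in range(4)]
```

Here `syndrome_logs` lists the exponents i with $S_j = \alpha^i$, and the decoded word c evaluates to zero at $\beta^0, \ldots, \beta^3$, confirming that it is a codeword.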
14.3.7 REED-SOLOMON CODES 


Definition: 

A Reed-Solomon ( RS ) code is a primitive BCH code of length n = q — 1 over F q . 

Facts: 

1. Reed-Solomon codes are special types of BCH codes, and hence they have the same 
encoding and decoding algorithms. 

2. Reed-Solomon codes are important because, for a fixed n and k, no linear code can 
have greater distance. 

3. Reed-Solomon codes are useful for correcting burst errors. (A binary burst of 
length b is a bit string whose only nonzero entries are among b successive components, 
the first and last of which are nonzero.) 

4. A Reed-Solomon code was used to encode the data transmissions from the Voyager 2 
spacecraft during its encounter with Uranus in January, 1986. 

5. If C is an (n, k)-RS code over $F_q$ with designed distance δ, then the generator 
polynomial for C has the form $g(x) = (x - \beta^a)(x - \beta^{a+1}) \cdots (x - \beta^{a+\delta-2})$, where β is 
a primitive element of $F_q$. 

6. If C is an (n, k)-RS code over $F_q$ with designed distance δ, then the distance of C 
is exactly δ. 

7. Error correction in compact disks (developed by Philips and Sony) uses a code known 
as the Cross-Interleaved Reed-Solomon Code (CIRC). The CIRC code is obtained by 
cross-interleaving two Reed-Solomon codes, one a (28,24)-RS code and the other a 
(32,28)-RS code. See [Vava89] for more information and further references. 

Example: 

1. Consider the finite field $F_5$ generated by β = 2. Then $g(x) = (x - \beta)(x - \beta^2) = 
(x - 2)(x - 4) = x^2 + 4x + 3$ generates a (4, 2)-RS code over $F_5$ with distance δ = 3. 
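
The generator polynomial in this example, and the distance of the resulting code, can be checked with a short Python sketch. It assumes a prime field $F_p$ (so that arithmetic is ordinary arithmetic modulo p); the function names are illustrative, and polynomials are coefficient lists with the lowest degree first.

```python
from itertools import product

def poly_mul(a, b, p):
    """Product of two polynomials over the prime field F_p (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def rs_generator(beta, a, delta, p):
    """g(x) = (x - beta^a)(x - beta^(a+1)) ... (x - beta^(a+delta-2)) over F_p."""
    g = [1]
    for j in range(delta - 1):
        root = pow(beta, a + j, p)
        g = poly_mul(g, [(-root) % p, 1], p)
    return g

g = rs_generator(beta=2, a=1, delta=3, p=5)      # coefficients of 3 + 4x + x^2

# Encode every message f(x) of degree < k = 2 and record the nonzero codeword weights.
weights = [sum(1 for coeff in poly_mul(list(f), g, 5) if coeff)
           for f in product(range(5), repeat=2) if any(f)]
min_weight = min(weights)
```

The minimum nonzero weight equals n − k + 1 = 3, in agreement with Fact 6.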


14.3.8 WEIGHT ENUMERATORS 
Definitions: 

Let C be an [n, M]-code and let $A_i$ be the number of codewords of weight i in C, for 
i = 0, 1, ..., n. The vector $(A_0, A_1, \ldots, A_n)$ is called the weight distribution of C. 

Let C be an (n, k)-code over $F_q$ with weight distribution $(A_0, A_1, \ldots, A_n)$. The weight 
enumerator of C is defined to be the polynomial $W_C(z) = \sum_{i=0}^{n} A_i z^i$. 

Facts: 

1. Let C be an (n, k)-code over $F_q$, and let the symbol error probability on the q-ary 
symmetric channel be p. If C is used only for error detection, then the probability of 
an error going undetected is $\sum_{i=1}^{n} A_i \left( \frac{p}{q-1} \right)^i (1 - p)^{n-i}$. 

2. MacWilliams identity: Let C be an (n, k)-code over $F_q$ with dual code $C^\perp$. Then 

$$W_{C^\perp}(z) = \frac{1}{q^k} [1 + (q-1)z]^n \, W_C\!\left( \frac{1 - z}{1 + (q-1)z} \right).$$ 

Examples: 

1. The weight distribution of a binary Hamming code of length n satisfies the recurrence 
$A_0 = 1$, $A_1 = 0$, 

$$(i + 1)A_{i+1} + A_i + (n - i + 1)A_{i-1} = \binom{n}{i}, \quad i \ge 1.$$ 

2. The weight enumerator of the binary Golay code (§14.4.2) is $1 + 253z^7 + 506z^8 + 1288z^{11} + 
1288z^{12} + 506z^{15} + 253z^{16} + z^{23}$. 
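
The recurrence in Example 1 determines the whole weight distribution once $A_0$ and $A_1$ are known. A minimal Python sketch (the function name is an illustrative assumption):

```python
from math import comb

def hamming_weight_distribution(n):
    """Weight distribution (A_0, ..., A_n) of a binary Hamming code of length n."""
    A = [0] * (n + 1)
    A[0] = 1                                   # A_0 = 1, A_1 = 0
    for i in range(1, n):
        # (i + 1) A_{i+1} + A_i + (n - i + 1) A_{i-1} = C(n, i)
        A[i + 1] = (comb(n, i) - A[i] - (n - i + 1) * A[i - 1]) // (i + 1)
    return A

A7 = hamming_weight_distribution(7)
A15 = hamming_weight_distribution(15)
```

For n = 7 the entries sum to $2^4 = 16$, the number of codewords of the (7, 4) Hamming code.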


14.4 BOUNDS FOR CODES 

How many codewords can a code have if its codewords are n-tuples of elements of $F_q$ 
and it has distance d? Although this is a difficult question for all but special sets of 
values of n, q, and d, there are several different useful bounds on M, the number of 
codewords in the code. There are also special types of codes, called perfect codes, that 
achieve the maximum number of codewords possible, given values of n, q, and d. 


14.4.1 CONSTRAINTS ON CODE PARAMETERS 
Definitions: 

Let $A_q(n, d)$ be the maximum M for which there exists an [n, M, d]-code over $F_q$. A 
code that achieves this bound is called optimal. 

Let $V_q(n, d)$ be the number of words in $F_q^n$ that have distance at most d from a fixed 
word. 

An (n, k, d)-code for which k = n − d + 1 is called a maximum distance separable 
(MDS) code. 

Facts: 

1. Little is known about $A_q(n, d)$ except for some specific values of q, n, and d. 

2. For all n ≥ 1, $A_q(n, 1) = q^n$ and $A_q(n, n) = q$. 

3. For all n ≥ 2, $A_q(n, d) \le q A_q(n - 1, d)$. 

4. If d is even, then $A_2(n, d) = A_2(n - 1, d - 1)$. 

5. $V_q(n, d) = \sum_{i=0}^{d} \binom{n}{i} (q - 1)^i$. 

6. Hamming bound (or sphere-packing bound): If $t = \lfloor \frac{d-1}{2} \rfloor$, then $A_q(n, d) \le \frac{q^n}{V_q(n, t)}$. 

7. Singleton bound: $A_q(n, d) \le q^{n-d+1}$. Hence, for any (n, k, d)-code over $F_q$, 
k ≤ n − d + 1. 

8. Gilbert-Varshamov bound: 

• $A_q(n, d) \ge \frac{q^n}{V_q(n, d-1)}$; 

• If $V_q(n - 1, d - 2) < q^{n-k}$, then there exists an (n, k, d)-linear code over $F_q$; hence, 
if k is the largest integer for which this inequality holds, then $A_q(n, d) \ge q^k$. 

9. For thirty years, the asymptotic version of the Gilbert-Varshamov bound (not 
discussed here) was believed to be the best possible lower bound for good codes. In 1982, 
using some sophisticated ideas from algebraic geometry, it was proved that the 
Gilbert-Varshamov bound can be bettered. A good survey of these results appears in [Va92]. 

10. Let C be an (n, k)-MDS code. If G is a generator matrix for C, then any k columns 
of G are linearly independent. 

11. If C is an (n, k)-MDS code, then $C^\perp$ is also an MDS code. 

12. Johnson bound: If d = 2t + 1, then 

$$A_2(n, d) \le \frac{2^n}{\displaystyle \sum_{i=0}^{t} \binom{n}{i} + \frac{1}{\lfloor \frac{n}{t+1} \rfloor} \binom{n}{t} \left( \frac{n-t}{t+1} - \left\lfloor \frac{n-t}{t+1} \right\rfloor \right)}.$$ 

This is an improvement of the Hamming bound (Fact 6) for binary codes. 

Example: 

1. The Reed-Solomon codes (§14.3.7) are MDS codes. 
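
The bounds in Facts 5 to 8 and Fact 12 are easy to evaluate numerically. The following Python sketch uses illustrative function names and returns integer bounds (rounding down for upper bounds and up for the Gilbert-Varshamov lower bound, since $A_q(n, d)$ is an integer):

```python
from fractions import Fraction
from math import comb, floor

def sphere_volume(q, n, d):
    """V_q(n, d): number of words within distance d of a fixed word."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(d + 1))

def hamming_bound(q, n, d):
    """Upper bound q^n / V_q(n, t) with t = floor((d - 1) / 2)."""
    return q ** n // sphere_volume(q, n, (d - 1) // 2)

def singleton_bound(q, n, d):
    """Upper bound q^(n - d + 1)."""
    return q ** (n - d + 1)

def gilbert_varshamov_bound(q, n, d):
    """Lower bound q^n / V_q(n, d - 1), rounded up."""
    return -(-q ** n // sphere_volume(q, n, d - 1))

def johnson_bound(n, d):
    """Johnson upper bound on A_2(n, d) for odd d = 2t + 1."""
    t = (d - 1) // 2
    ratio = Fraction(n - t, t + 1)
    extra = comb(n, t) * (ratio - floor(ratio)) / (n // (t + 1))
    return int(Fraction(2 ** n) / (sum(comb(n, i) for i in range(t + 1)) + extra))
```

For example, `hamming_bound(2, 15, 5)` gives 270 while `johnson_bound(15, 5)` gives 256, which illustrates the improvement noted in Fact 12.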


14.4.2 PERFECT CODES 

Definitions: 

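An [n, M, d]-code over $F_q$ is said to be perfect if it meets the Hamming bound, that is, 
$M = \frac{q^n}{V_q(n, t)}$, where $t = \lfloor \frac{d-1}{2} \rfloor$. 

The binary Golay code is a (23, 12, 7)-code over $F_2$ with generator matrix 
$G = [I_{12} \mid A]$, where 

$$A = \begin{pmatrix} 1&1&0&1&1&1&0&0&0&1&0 \\ 1&0&1&1&1&0&0&0&1&0&1 \\ 0&1&1&1&0&0&0&1&0&1&1 \\ 1&1&1&0&0&0&1&0&1&1&0 \\ 1&1&0&0&0&1&0&1&1&0&1 \\ 1&0&0&0&1&0&1&1&0&1&1 \\ 0&0&0&1&0&1&1&0&1&1&1 \\ 0&0&1&0&1&1&0&1&1&1&0 \\ 0&1&0&1&1&0&1&1&1&0&0 \\ 1&0&1&1&0&1&1&1&0&0&0 \\ 0&1&1&0&1&1&1&0&0&0&1 \\ 1&1&1&1&1&1&1&1&1&1&1 \end{pmatrix}.$$ 

The ternary Golay code is an (11, 6, 5)-code over $F_3 = \{0, 1, 2\}$ with generator matrix 
$G = [I_6 \mid B]$, where 

$$B = \begin{pmatrix} 1&1&1&1&1 \\ 0&1&2&2&1 \\ 1&0&1&2&2 \\ 2&1&0&1&2 \\ 2&2&1&0&1 \\ 1&2&2&1&0 \end{pmatrix}.$$ 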
Facts: 

1. A necessary condition for a code to be perfect is that d be odd. 

2. The binary Golay code is a perfect code. 

3. The extended binary Golay code is a (24, 12, 8)-code that is self-dual. 

4. The ternary Golay code is a perfect code. 

5. The set of all perfect codes over $F_q$, determined in 1973 by Aimo Tietäväinen, 
consists of the following: 

• the linear code consisting of all words in $F_q^n$; 

• the binary repetition codes of odd lengths; 

• the Hamming codes and all codes with the same parameters as them; 

• the binary Golay code and all codes equivalent to it; 

• the ternary Golay code and all codes equivalent to it. 

6. There do exist perfect codes with the same parameters as the Hamming codes, but 
which are not equivalent to them. 


14.5 NONLINEAR CODES 

Although linear codes are studied and used extensively, there are several important 
types of nonlinear codes. In particular, there are nonlinear codes with efficient encoding 
and decoding algorithms, as well as nonlinear codes that are important for theoretical 
reasons. 


14.5.1 NORDSTROM-ROBINSON CODE 
Definitions: 

Permute the coordinates of the extended binary Golay code so that one of the weight 8
codewords is 1111111100...0, and call this new code C'. For each of the 8-bit words
00000000, 10000001, 01000001, 00100001, 00010001, 00001001, 00000101, 00000011,
there are exactly 32 codewords in C' that begin with that word. The extended
Nordstrom-Robinson code is the code whose codewords are obtained from these 256
words by deleting the first 8 coordinate positions. The Nordstrom-Robinson code
is obtained by puncturing the last digit of the extended Nordstrom-Robinson code.

Facts: 

1. The extended Nordstrom-Robinson code is a binary [16, 256, 6]-nonlinear code. 

2. The Nordstrom-Robinson code is a binary [15, 256, 5]-nonlinear code.

3. The Johnson bound (§14.4.1, Fact 12) yields $A_2(15, 5) \le 256$, and hence it follows
that $A_2(15, 5) = 256$. On the other hand, it has been proved that no linear code of
length 15 and distance 5 has more codewords than the binary 2-error correcting BCH
code, which has 128.


14.5.2 PREPARATA CODES
Definitions: 

The Preparata codes are an infinite family of nonlinear codes that have efficient
encoding and decoding algorithms. Let $\beta$ be a primitive element of $F_{2^m}$, and label the
elements of $F_{2^m}$ as $\alpha_i = \beta^i$, $0 \le i \le 2^m - 2$, and $\alpha_{2^m-1} = 0$. For a subset $X \subseteq F_{2^m}$,
let $\chi(X)$ denote the characteristic vector of X; that is, $\chi(X)$ is a binary vector of
length $2^m$ whose ith coordinate is 1 if $\alpha_i \in X$ and 0 otherwise, for each $0 \le i \le 2^m - 1$.

If $m \ge 3$ is odd, the extended Preparata code $\overline{P}(m)$ is the set of words of the form
$(\chi(X), \chi(Y))$, where X and Y are subsets of $F_{2^m}$ such that:

• |X| and |Y| are even;

• $\sum_{x \in X} x = \sum_{y \in Y} y$;

• $\sum_{x \in X} x^3 + \left(\sum_{x \in X} x\right)^3 = \sum_{y \in Y} y^3$.

The Preparata code P(m) is obtained from $\overline{P}(m)$ by puncturing the coordinate
corresponding to the field element 0 in the first half of each codeword.

Facts: 

1. If $m \ge 3$ is odd, then $\overline{P}(m)$ is a binary nonlinear code with parameters $n = 2^{m+1}$,
$M = 2^{2^{m+1} - 2m - 2}$, d = 6.

2. If $m \ge 3$ is odd, then P(m) is a binary nonlinear code with parameters $n = 2^{m+1} - 1$,
$M = 2^{2^{m+1} - 2m - 2}$, d = 5.

3. P(3) is the same as the Nordstrom-Robinson code. 

4. The Preparata codes can be viewed as linear codes over Z 4 . 


14.6 CONVOLUTIONAL CODES

Convolutional codes are a powerful class of error-correcting codes. They work differently 
than block codes do. Instead of grouping message symbols into blocks for encoding, 
check digits are interwoven within streams of information symbols. Convolutional codes 
can be considered to have memory, since n symbols of information are encoded using 
these n symbols and previous information symbols. 


14.6.1 BACKGROUND 
Definitions: 

The figure in §14.1.1 can be used to distinguish two approaches to decoding. For a hard 
decision decoder, the demodulator maps received coded data symbols into the set of 
transmitted data symbols (for example, 0 and 1). In contrast, the demodulator of soft 
decision decoders may pass extra information to the decoder (for example, 3 bits of 
information for each received channel data symbol, indicating the degree of confidence 
in its being a 0 or a 1). 





Facts: 


1. Convolutional codes were introduced by P. Elias in 1955, and are widely used in 
practice today. 

2. Convolutional codes are used extensively in radio and satellite links and have been 
used by NASA for deep-space missions since the late 1970s. 

3. Convolutional codes are linear codes that differ from block codes in that the
codewords do not have constant length.

4. Convolutional codes also differ from block codes in that the n-tuple produced by an
encoder depends not only on the message k-tuple u, but also on some message k-tuples
produced prior to it; that is, the encoder has memory.

5. Soft decision decoding typically allows performance improvements. 

6. Hard and soft decision techniques can be used in both block and convolutional 
codes, although soft decision techniques can typically be used to greater advantage in 
convolutional codes. 

7. Theoretical results, particularly with respect to BCH codes, position block codes as 
superior to convolutional codes. 

8. The minimum distances of BCH codes are typically much larger than the corre- 
sponding free distances (§14.6.3) of comparable convolutional codes. 

9. Decoding techniques for block codes are generally applicable only to q-ary (or binary)
symmetric channels, which are an appropriate model for only a relatively small fraction 
of channels that arise in practice. 

10. Efficient decoding of BCH codes requires hard-decision decoding, which suffers 
information loss relative to soft-decision strategies, precipitating a performance penalty. 
The resulting performance of the BCH decoder is significantly inferior to that for a 
comparable convolutional code, despite the BCH codes being inherently more powerful. 
Consequently, convolutional codes are used in a majority of practical applications, due 
to their relative simplicity and performance, and the large number of communication 
channels which benefit from soft decoding techniques. 

11. A recently developed class of codes, known as turbo codes, is built using
convolutional codes. The basic idea behind a turbo encoder is to combine two simple
convolutional encoders. Input to the encoder is a block of bits. The two constituent 
encoders generate parity bits and the information bits are sent unchanged. The key 
innovation is an interleaver, which permutes the original information bits before they 
are provided as input to the second encoder. The permutation causes input sequences 
which produce low-weight codewords for one encoder to generally produce high-weight 
codewords for the other encoder. See [HeWi99] for information on turbo codes. 

12. A good starting point for information on turbo codes is the JPL Turbo Codes Web 
page: 

http://www331.jpl.nasa.gov/public/JPLtcodes.html


Example: 

1. In the simplest version of soft decision decoding, known as the binary erasure channel
(and usually classified as a hard-decision technique), the demodulator output is one
of three values: 0, 1, or “erasure” (indicating that neither a 0 nor a 1 was clearly
recognized).


14.6.2 SHIFT REGISTERS


Definitions: 

An m-stage shift register is a hardware device that consists of m delay elements
(or flip-flops), each having one input and one output, and a clock which controls the
movement of data. During each unit of time, the following operations are performed:

• a new input bit and the contents of some of the delay elements are added modulo 2
to form the output bit;

• the content of each delay element (with the exception of the last delay element)
is shifted one position to the right;

• the new input bit is fed into the first delay element.

The generator of an m-stage shift register is a polynomial
$g(x) = 1 + g_1 x + g_2 x^2 + \cdots + g_m x^m \in F_2[x]$, where $g_i = 1$ if the contents of the ith delay element is involved in
the modulo 2 sum that produces the output, and 0 otherwise.

Fact: 

1. Assume that the initial contents of a shift register are all 0s. Suppose that a shift
register has generator g(x). Let the input stream $u_0, u_1, u_2, \ldots$ be described by the
formal power series $u(x) = u_0 + u_1 x + u_2 x^2 + \cdots$ over $F_2$. (If the input stream is finite of
length t, let $u_i = 0$ for $i \ge t$.) Similarly, let the output stream $c_0, c_1, c_2, \ldots$ be described
by the formal power series $c(x) = c_0 + c_1 x + c_2 x^2 + \cdots$ over $F_2$. Then c(x) = u(x)g(x).

Examples: 

1. Shift-example: Suppose that the delay elements of the 4-stage shift register in the
following figure initially contain all 0s:

[Figure: 4-stage shift register with its input line; not reproduced]

If the input stream to the register is 11011010 (from left to right), the updated contents
of the delay elements and the output bits are summarized in the following table: 


time  input  D1  D2  D3  D4  output
0     -      0   0   0   0   -
1     1      1   0   0   0   1
2     1      1   1   0   0   0
3     0      0   1   1   0   1
4     1      1   0   1   1   1
5     1      1   1   0   1   1
6     0      0   1   1   0   0
7     1      1   0   1   1   1
8     0      0   1   0   1   0


2. The generator of the shift register in Example 1 is $g(x) = 1 + x + x^4$.
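
The table in Example 1 and the identity c(x) = u(x)g(x) can be reproduced with a short simulation. This is a minimal sketch under the conventions stated in the Definitions (output formed first, then the shift); the function names `shift_register` and `poly_mul_gf2` are illustrative, and polynomials are stored as coefficient lists from low order to high order.

```python
def shift_register(taps, bits):
    """Simulate an m-stage shift register with all-zero initial contents.

    taps = (g_1, ..., g_m); the coefficient g_0 = 1 of the input bit is implied.
    Returns the output bits and the register contents after each time step.
    """
    state = [0] * len(taps)
    outputs, history = [], []
    for u in bits:
        out = u
        for g, d in zip(taps, state):
            out ^= g & d              # add tapped delay elements modulo 2
        state = [u] + state[:-1]      # shift right, feed the input into D1
        outputs.append(out)
        history.append(tuple(state))
    return outputs, history


def poly_mul_gf2(a, b):
    """Multiply two polynomials over F_2 given as low-order-first bit lists."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                c[i + j] ^= bj
    return c


u = [1, 1, 0, 1, 1, 0, 1, 0]                 # input stream 11011010
out, hist = shift_register((1, 0, 0, 1), u)  # g(x) = 1 + x + x^4
product = poly_mul_gf2(u, [1, 1, 0, 0, 1])   # coefficients of u(x) g(x)
print(out, hist[-1], product[:len(u)] == out)
```

Only the first eight coefficients of u(x)g(x) are compared, because the table stops after the last input bit and does not flush the register.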


14.6.3 ENCODING


Note: Throughout this subsection assume that the initial contents of a shift register
are all 0s.

Definitions: 

An (n, 1, m)-convolutional code with generators $g_1(x), g_2(x), \ldots, g_n(x) \in F_2[x]$
($m = \max_i \deg g_i(x)$) consists of all codewords of the form $c(x) = (c_1(x), c_2(x), \ldots, c_n(x))$,
where $c_i(x) = u(x) g_i(x)$, and $u(x) = u_0 + u_1 x + u_2 x^2 + \cdots$ represents the input stream.
The system memory of the code is m.

A convolutional code is catastrophic if a finite number of channel errors can cause an
infinite number of decoding errors.

The rate of an (n, k, m)-convolutional code is k/n.

The free distance $d_{free}$ of a convolutional code is the minimum weight of all nonzero
output streams.

Facts: 

1. A convolutional code is linear. 

2. Convolutional codes are not block codes since the codewords have infinite length. 
They are, however, similar to block codes, and in fact can be viewed as block codes over 
certain infinite fields. 

3. An (n, 1, m)-convolutional code can be described by a single shift register with n
outputs, where $c_i(x)$ is the output of the single-output shift register with generator $g_i(x)$
when u(x) is the input. In practice, $c_1(x), c_2(x), \ldots, c_n(x)$ are interleaved to produce
one output stream.

4. Let C be an (n, 1, m)-convolutional code with generators $g_1(x), g_2(x), \ldots, g_n(x)$.
Let $G(x) = \sum_{i=1}^{n} x^{i-1} g_i(x^n)$. If the message is u(x), then the corresponding interleaved
codeword is $c(x) = G(x) u(x^n)$.

5. The Viterbi algorithm is a maximum likelihood decoding algorithm for convolutional 
codes. See [LiCo83]. For an algebraic treatment of convolutional codes, see [Pi88]. 

6. If $\gcd(g_1(x), g_2(x), \ldots, g_n(x)) = 1$ in $F_2[x]$, then C is not catastrophic.

7. An (n, k, m)-convolutional code can be described by k multi-output shift registers,
each of maximum length m. The message is divided into k streams, each stream being
the input to one of the k shift registers. There are n output streams, each formed using 
some or all of the shift registers. 

8. The free distance of a convolutional code is a measure of the error-correcting capa- 
bility of the code, and is a concept analogous to the distance of a block code. 

9. In contrast to block codes, there are few algebraic constructions known for convo- 
lutional codes. 

10. The convolutional codes used in practice are usually those found by a computer
search designed to maximize the free distance among all encoders with fixed parameters
n, k, and m. The following table lists the best codes with a rate of 1/2 (n = 2,
k = 1). The polynomials $g_1(x)$ and $g_2(x)$ are represented by their coefficients, from low
order to high order.





m    g1(x)            g2(x)            d_free
2    101              111              5
3    1101             1111             6
4    10011            11101            7
5    110101           101111           8
6    1011011          1111001          10
7    11100101         10011111         10
8    101110001        111101011        12
9    1001110111       1101100101       12
10   10011011101      11110110001      14
11   100011011101     101111010011     15
12   1000101011011    1111110110001    16
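
For small m, entries of this table can be re-derived by a shortest-path search over the encoder states. The sketch below is one possible approach, not the search procedure used to build the table: it treats the register contents as a state, forces the first input bit to be 1, and uses Dijkstra's algorithm to find the lightest path that returns to the all-zero state. The names `free_distance` and `bits` are illustrative.

```python
import heapq


def bits(s):
    """Convert a coefficient string such as '1101' (low order first) to a tuple."""
    return tuple(int(ch) for ch in s)


def free_distance(gens):
    """Free distance of an (n, 1, m) code; gens are low-order-first tuples of length m + 1."""
    m = len(gens[0]) - 1
    zero = (0,) * m

    def step(state, u):
        window = (u,) + state  # window[j] is the input bit from j steps ago
        weight = sum(sum(g[j] & window[j] for j in range(m + 1)) % 2 for g in gens)
        return window[:m], weight

    start, w0 = step(zero, 1)  # leave the zero state with input bit 1
    best = {start: w0}
    heap = [(w0, start)]
    while heap:
        w, state = heapq.heappop(heap)
        if state == zero:
            return w           # lightest path that remerges with the zero state
        if w > best.get(state, float("inf")):
            continue
        for u in (0, 1):
            nxt, dw = step(state, u)
            if w + dw < best.get(nxt, float("inf")):
                best[nxt] = w + dw
                heapq.heappush(heap, (w + dw, nxt))
    return None


for g1, g2 in [("101", "111"), ("1101", "1111"), ("10011", "11101")]:
    print(g1, g2, free_distance([bits(g1), bits(g2)]))
```

The state space has 2^m elements, so this direct search is practical only for the small memories in the upper rows of the table.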


Examples: 

1. Consider the (2, 1, 3)-convolutional code with generators $g_1(x) = 1 + x^3$ and
$g_2(x) = 1 + x + x^3$. The code can be described by the shift register of the following figure. The
message $u(x) = 1 + x^3 + x^4$, corresponding to the bit string 10011, gets encoded to
$c(x) = (u(x)g_1(x), u(x)g_2(x)) = (1 + x^4 + x^6 + x^7, 1 + x + x^5 + x^6 + x^7)$, or in interleaved
form to c = 11 01 00 00 10 01 11 11 00 00 00 ....



2. Suppose that the input stream contains an infinite number of 1s, and the output
stream has only finitely many 1s. If the channel introduces errors precisely in the
positions of these 1s, then the resulting all-zero output stream will be decoded by the
receiver to m(x) = 0. Finitely many channel errors then cause infinitely many decoding
errors, so such a code is catastrophic.

3. The following figure is a shift register encoder for a (3, 2, 3)-convolutional code C. 



If the input stream is u = 1011101101, it is first divided using alternating bits into 2
streams $I_1$ = 11110 and $I_2$ = 01011. The 3 output streams are $c_1$ = 10010, $c_2$ = 00011
and $c_3$ = 01010, and the interleaved output is c = 100 001 000 111 010.
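
Example 1 and Fact 4 can be checked together. The following sketch encodes the message of Example 1 stream by stream, interleaves the result, and compares it with the product of G(x) and u(x^n); the helper names are illustrative, and polynomials are coefficient lists from low order to high order.

```python
def poly_mul_gf2(a, b):
    """Multiply two polynomials over F_2 given as low-order-first bit lists."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                c[i + j] ^= bj
    return c


def upsample(p, n):
    """Return the coefficients of p(x^n)."""
    out = [0] * ((len(p) - 1) * n + 1)
    out[::n] = p
    return out


def encode(u, gens):
    """Return the n output streams c_i = u * g_i and their interleaving."""
    streams = [poly_mul_gf2(u, g) for g in gens]
    length = max(len(s) for s in streams)
    streams = [s + [0] * (length - len(s)) for s in streams]
    interleaved = [s[t] for t in range(length) for s in streams]
    return streams, interleaved


def interleaved_generator(gens):
    """Coefficients of G(x), the sum over i of x^(i-1) * g_i(x^n)."""
    n = len(gens)
    size = max(i + (len(g) - 1) * n + 1 for i, g in enumerate(gens))
    G = [0] * size
    for i, g in enumerate(gens):
        for j, coeff in enumerate(g):
            G[i + j * n] ^= coeff
    return G


gens = [[1, 0, 0, 1], [1, 1, 0, 1]]   # g1 = 1 + x^3, g2 = 1 + x + x^3
u = [1, 0, 0, 1, 1]                   # u = 1 + x^3 + x^4
streams, interleaved = encode(u, gens)
via_G = poly_mul_gf2(interleaved_generator(gens), upsample(u, len(gens)))
print("".join(map(str, interleaved)), via_G == interleaved)
```

Both routes give the sixteen bits 11 01 00 00 10 01 11 11 of Example 1; the trailing zeros shown there correspond to the register continuing to run after the message ends.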


14.7 BASICS OF CRYPTOGRAPHY


Protecting the secrecy of information goes back to ancient times. For example, the 
ancient Romans used a secret code to send messages so the messages could not be read 
by their enemies. In modern times, there is a constant need to protect information 
from unauthorized access and from malicious actions. The science of cryptography is 
devoted to methods that offer such protection. Sending secret messages, authenticating 
messages, distributing secret keys, and sharing secrets are only some of the applications 
addressed by modern cryptography. 


14.7.1 BASIC CONCEPTS 


Definitions: 

Cryptography is the science and study of protecting data from malicious or unautho- 
rized actions, including access to, manipulation, impersonation, and forgery. 

Cryptanalysis is the use of mathematical, statistical, and other techniques to defeat 
cryptographic protection mechanisms. 

Cryptology is the study of both cryptography and cryptanalysis, although “cryptog- 
raphy” is often used in place of “cryptology”. 

A cipher is a method whereby a message in some source language (the plaintext) is
transformed by a mapping, called an encryption algorithm, to yield an output, called
the ciphertext, which is unintelligible to all but an authorized recipient.

A recipient of an encrypted message is able to recover the plaintext from the ciphertext 
by use of a corresponding decryption algorithm. 

A key is a secret number or other significant information which parametrizes an en- 
cryption or decryption algorithm. 

The message space $\mathcal{M}$ is the set of all possible plaintexts, the ciphertext space $\mathcal{C}$
consists of all possible ciphertexts, and the key space $\mathcal{K}$ consists of all possible keys.

An encryption algorithm E is a family of mappings parametrized by a key $k \in \mathcal{K}$,
such that each value k defines a mapping $E_k \in \mathcal{E}$, where $\mathcal{E}$ is the set of all invertible
mappings from $\mathcal{M}$ to $\mathcal{C}$. A specific plaintext message m is mapped by $E_k$ to a ciphertext
$c = E_k(m)$.

The set $\mathcal{D}$ of decryption algorithms consists of all invertible mappings from $\mathcal{C}$ back
to $\mathcal{M}$, such that for each encryption key $k \in \mathcal{K}$, there is some mapping $D_{f(k)} \in \mathcal{D}$ such that
$D_{f(k)}(E_k(m)) = m$ for all $m \in \mathcal{M}$, where f(k) is some key dependent on k and $D_{f(k)}$
is the decryption algorithm corresponding to the decryption key f(k). For so-called
symmetric-key systems, this decryption key f(k) is equal to k itself.

Facts: 

1. Useful books that cover cryptography include [BePi82], [Br88], [DaPr89], [De83], 
[Ko94], [Sc96], [SePi89], [Si93], [St95], and [We88]. A comprehensive treatment can be 
found in [MevaVa96]. 

2. Useful Internet sites on cryptography include:

• A-Z Cryptology:
http://www.achiever.com/freehmpg/cryptology/crypto.html

• Cryptographic Software Archive:
ftp://ftp.funet.fi/pub/crypt

• Introduction to Cryptography:
http://www.cs.hut.fi/ssh/crypto/intro.html

• RSA Laboratories:
http://www.rsa.com/rsalabs

• Some Classic Ciphers:
http://rschp2.anu.edu.au:8080/cipher.html

3. Separating cryptography from cryptanalysis is, in fact, difficult, as the design of 
secure cryptographic systems requires that all possible cryptanalytic attacks be taken 
into account. 

4. Cryptography differs from steganography in that while the former involves use of 
techniques to secure data (for example, codes and ciphers), the latter involves the use of 
techniques which obscure the existence of data itself (for example, invisible ink, secret 
compartments, use of subliminal channels).

5. While falling under the broader category of communications security, cryptography 
is generally concerned with the more mathematical details, rather than system-level 
aspects such as traffic-flow analysis and electronic security aspects such as monitoring 
electromagnetic emanations. 

6. Cryptographic mechanisms can be used to support a number of fundamental security 
services, including: 

• privacy: preventing confidential data from being available in an intelligible form
to unauthorized parties;

• data integrity: detection of data manipulation by unauthorized parties (including
alteration, insertion, deletion, substitution, delay and replay); it should be
noted that encryption alone does not guarantee data integrity;

• authentication: corroboration that a party’s identity is as claimed (entity
authentication), or that the origin of data is as claimed (data origin authentication);
related to this is the assurance that data has not been subjected to unauthorized
manipulation (cf. data integrity), possibly including assurances regarding
uniqueness and timeliness;

• non-repudiation: provision for the resolution of disputes related to digital
signatures; digital signatures can be used as the basis of authorization of certain
actions; disputes may occasionally arise subsequently due to either false denials
(repudiated signatures) or fraudulent claims (forged signatures).

In addition, entity authentication and/or data authentication may be the basis for 
granting access to certain controlled resources. Access control mechanisms often rely 
upon cryptographic support to restrict access (of information or other resources) to 
authorized parties; access is generally granted upon proof of authorization, which may 
be based on an entity’s identity or possession of anonymous tokens , either physical or 
digital. 

7. The traditional objectives of privacy and authentication (although not both required
in all cases) lead to the following requirements for a cryptosystem:

• fundamental requirement: (to maintain secrecy of key) it should be infeasible
for an adversary to deduce the key k given one or more plaintext-ciphertext
pairs (m, c);

• privacy requirement: (to maintain confidentiality) it should be infeasible for an
adversary to deduce the plaintext m corresponding to any given ciphertext c;

• authentication requirement: (to prevent forgery or substitution) it should be
infeasible for an adversary to deduce a ciphertext c' corresponding to any message
m' of his choosing, or corresponding to any other (meaningful) message.

In the encryption model above, the decryption mapping was such that $D_{f(k)}(E_k(m)) = m$,
and it was stated that in symmetric-key systems the decryption key f(k) is the
same as (or easily computed from) the encryption key k. In their landmark 1976 paper,
W. Diffie and M. Hellman introduced the concept of public-key cryptosystems. Here
each user has his own pair of encryption and decryption keys (k, f(k)) where $k \ne f(k)$,
with the property that it is infeasible for anyone to deduce f(k) from k. If the so-called
public key, k, of a user A is published, then anyone looking up that key can encrypt
a message for A, such that A alone, having knowledge of f(k), the private key, can
decrypt it. User A is able to compute both k and f(k) from another key k'.

8. A digital signature is intended to be the digital analogue of a handwritten signa- 
ture; it should be a number dependent on some secret known only to the signer, and, 
additionally, on the content of the message being signed. 

9. Signatures must be verifiable in the sense that, should a dispute arise as to whether a
party signed a document (caused by either a lying signer trying to repudiate a signature 
it did create, or a fraudulent claimant), an unbiased third party can resolve the matter 
equitably, without requiring access to the signer’s secret information (private key). 

10. Signatures must be easy for the signer to compute.

11. Signatures must be easy to verify by anyone.

12. The following describes the general method of constructing digital signatures: A
has a message m which it wishes to send to B. A sends to B the quantity $D_{f(k)}(m)$
obtained by applying the decryption function. Then, upon reception, B can use A's
public-key algorithm $E_k$ to recover the message $m = E_k(D_{f(k)}(m))$. (Here it is required
that for each $k \in \mathcal{K}$, $D_{f(k)}$ is a mapping from $\mathcal{M}$ to $\mathcal{C}$, $E_k$ is a mapping from $\mathcal{C}$ to $\mathcal{M}$,
and $E_k$ is the inverse of $D_{f(k)}$.) Provided the message m recovered by B is meaningful
(e.g., contains a sufficient degree of redundancy, to ensure it is not simply the result
of applying $E_k$ to a random quantity anyone might have generated), B has assurance
that the message is authentic and originated from A, since by assumption no one aside
from A knows or can feasibly compute A's secret key f(k). Moreover, B can keep the
signature $D_{f(k)}(m)$ to prove to any third party, at a later point in time, that A actually
did send the message m; such a party would similarly use A's public key to recover m
as verification. This provides a digital analogue to handwritten signatures.

13. Digital signatures are also possible using symmetric-key techniques, but this generally
requires use of an on-line trusted third party or new keying material for each
signature (one-time signature schemes). For these reasons, digital signatures based on
public-key cryptography are used in practice.
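
The sign-with-D, verify-with-E pattern of Fact 12 can be illustrated with toy numbers. The sketch below assumes an RSA-style pair of mutually inverse mappings on the integers modulo n; this choice of mappings, the tiny primes, and the names `sign` and `recover` are assumptions of the illustration, since this subsection does not fix a particular public-key algorithm. Real systems use far larger parameters and add redundancy to the message.

```python
# Toy parameters, far too small for any real use.
p, q = 61, 53
n = p * q                            # modulus, part of the public key
e = 17                               # public exponent: plays the role of k
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent: plays the role of f(k); Python 3.8+


def sign(m):
    """Signer applies the private mapping D_f(k) to the message."""
    return pow(m, d, n)


def recover(s):
    """Anyone applies the public mapping E_k to recover the message."""
    return pow(s, e, n)


message = 65
signature = sign(message)
print(recover(signature) == message)       # the genuine signature recovers m
print(recover(signature + 1) == message)   # an altered signature does not
```

Because E_k is a bijection, an altered signature always recovers some other value, which is why Fact 12 asks that genuine messages be recognizable through redundancy.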


14.7.2 SECURITY OF CRYPTOSYSTEMS


Definitions: 

Adversaries are either passive or active. Passive adversaries are a threat to confiden- 
tiality; they do not interrupt, alter or insert any transmitted or stored data. Active 
adversaries additionally threaten integrity and authentication. 

There are many models under which one can assume a cryptanalyst is able to attack a 
cryptographic system. The following types of attack can be hypothesized for increasingly 
powerful adversaries: 

• ciphertext-only: the adversary has possession only of some ciphertext;

• known-plaintext: the adversary has some plaintext and its corresponding
ciphertext;

• chosen-plaintext: the adversary has some plaintext of his choosing, and its
corresponding ciphertext;

• chosen-ciphertext: the adversary has some ciphertext of his choosing, and its
corresponding plaintext.

The most stringent measure of the security of a cryptographic algorithm is uncon- 
ditional security where an adversary is assumed to have unlimited computational 
resources, and the question is whether there is enough information available to defeat 
the system. Unconditional security for encryption systems is called perfect secrecy. 

To measure an adversary’s uncertainty in the key after observing n ciphertext characters,
C. Shannon defined the key equivocation function $Q(n) = H(K \mid C_1 \ldots C_n)$ and
defined the unicity distance of the cipher to be the first value $n = n_0$ such that
$Q(n) \approx 0$.

A cryptographic method is said to be provably secure if the difficulty of defeating it 
can be shown to be essentially as difficult as (that is, polynomially equivalent to) solving 
a well-known and supposedly difficult (typically number-theoretic) problem, such as 
integer factorization or the computation of discrete logarithms. (Thus, “provable” here 
means provable subject to as yet unproved assumptions.) 

A proposed technique is said to be computationally secure if the (perceived) level of 
computation required to defeat it exceeds, by a comfortable margin, the computational 
resources of the hypothesized adversary. 

Facts: 

1. It is a standard cryptographic assumption that an adversary will have access to 
ciphertext. 

2. Kerckhoffs' assumption: The security of a system should rest entirely in the
secret key; the adversary is assumed to have complete knowledge of the rest of the
cryptographic mechanism(s).

3. In determining whether the security of a particular cryptosystem is adequate for a 
particular application, the powers and resources of the anticipated adversary must be 
taken into account. Potential adversaries may have powers ranging from minimal to 
unlimited. 

4. The security of a cryptographic algorithm can be measured according to several 
different metrics, including unconditional security, provable security, and computational 
security. 

5. Let M, C, and K be random variables ranging over the message space $\mathcal{M}$, ciphertext
space $\mathcal{C}$, and key space $\mathcal{K}$. Unconditional security for encryption systems can be specified
by the condition $H(M \mid C) = H(M)$; that is, the uncertainty in the plaintext, after
observing the ciphertext, is equal to the a priori uncertainty about the plaintext:
observation of the ciphertext provides no information (whatsoever) to an adversary.

6. A necessary condition for an encryption scheme to be unconditionally secure is that 
the key should be at least as long as the message. The one-time pad (§14.8.4) is an 
example of an unconditionally secure encryption algorithm. 

7. In general, encryption schemes do not offer perfect secrecy, and each ciphertext 
character observed decreases the uncertainty in the encryption key k used. 

8. Let $n_0$ be the unicity distance of a cipher. After observing $n_0$ characters, the key
uncertainty is zero, meaning an information-theoretic adversary can narrow the set of 
possible keys down to a single candidate, thus defeating the cipher. 

9. The computational security measures the amount of computational effort required, 
by the best currently-known attacks, to defeat a system; it must be assumed here that 
the system has been well-studied to determine which attacks are relevant. 
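
The condition H(M | C) = H(M) of Fact 5 can be checked numerically on a very small cipher. The sketch below assumes a one-character cipher c = (m + k) mod L over a 3-letter alphabet with an arbitrary message distribution; these choices and the function names are illustrative only. With a uniformly chosen key the equality holds; with a key restricted to two values it fails.

```python
from itertools import product
from math import log2


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)


def conditional_entropy(p_msg, p_key, L):
    """H(M | C) for the cipher c = (m + k) mod L with independent M and K."""
    joint, p_cipher = {}, {}
    for (m, pm), (k, pk) in product(p_msg.items(), p_key.items()):
        c = (m + k) % L
        joint[(m, c)] = joint.get((m, c), 0.0) + pm * pk
        p_cipher[c] = p_cipher.get(c, 0.0) + pm * pk
    return entropy(joint) - entropy(p_cipher)   # H(M, C) - H(C)


L = 3
p_msg = {0: 0.7, 1: 0.2, 2: 0.1}
uniform_key = {k: 1 / L for k in range(L)}
restricted_key = {0: 0.5, 1: 0.5, 2: 0.0}

print(entropy(p_msg))
print(conditional_entropy(p_msg, uniform_key, L))
print(conditional_entropy(p_msg, restricted_key, L))
```

The restricted key has less entropy than a uniformly chosen one, and the ciphertext then leaks information about the plaintext, in line with the necessary condition in Fact 6.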


14.8 SYMMETRIC-KEY SYSTEMS 

Classical cryptosystems such as the Caesar cipher have the property that the decryption 
key can easily be found from the corresponding encryption key. Until recently, all 
cryptosystems had this general property. Such systems are known as symmetric-key 
systems, to distinguish them from cryptosystems that do not have this property, that 
is, where knowing an encryption key does not provide adequate information for finding 
the corresponding decryption key. 


14.8.1 REDUNDANCY 
Definitions: 

Let plaintext messages be composed from an alphabet A of L characters. Let $H_n$ denote
the entropy (§14.1.2) of n-character messages, or equivalently, the expected bit length
of n-character messages under an optimal encoding.

The absolute rate R of a language is the maximum number of bits of information that
each character could encode, assuming all combinations of characters are equiprobable
in the language: $R = \log_2 L$.

The rate $r_n$ for n-character messages is $r_n = H_n / n$.

The rate $r_\infty$ of a language is $r_\infty = \lim_{n \to \infty} r_n$.

The redundancy $D_n$ for n-character messages is $D_n = n(R - r_n)$.

The redundancy D of a language (measured in bits per plaintext character) is
$D = \lim_{n \to \infty} D_n / n = R - r_\infty$.

Let $\mathcal{D}$ be a family of parametrized decryption algorithms, K be a random variable
ranging over keyspace $\mathcal{K}$, and C be a random variable ranging over ciphertext space $\mathcal{C}$.
A random cipher is a cipher such that the decipherment $D_K(C)$ is a random variable
uniformly distributed over all preimages of C.

Facts:

1. Knowing the rate and redundancy of a language allows estimation of the unicity
distance (§14.7.2, Fact 8) for a certain class of “statistically perfect” ciphers known as
random ciphers.

2. A reasonable estimate of unicity distance for a random cipher is $n_0 = H(K)/D$ ciphertext
characters, where $D = R - r_\infty$ is the redundancy of the language and H(K) is the
key entropy.

3. The redundancy in the denominator of this estimate indicates that data compression
prior to encryption increases the unicity distance of a cipher (increasing security).

Examples: 

1. Estimates for the English language with 26-character alphabet indicate $r_1 \approx 4.2$,
$r_2 \approx 3.6$, and $r_3 \approx 3.2$ bits/character; these estimate the number of bits of information
per character in messages of lengths one, two and three characters.

2. The rate $r_1$ differs from $R = \log_2 26 \approx 4.7$ due to the fact that characters in English
messages are not equiprobable; $r_n$ decreases as n grows due to the decreasing likelihood
that random character strings are meaningful messages, effectively due to redundancy
in the language.

3. Estimates suggest that for English, $1 \le r_\infty \le 1.5$, yielding a redundancy of between
3.2 and 3.7 bits per character in long messages, or between 68% and 79%.
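
The figures in Example 3 and the estimate $n_0 = H(K)/D$ can be reproduced with a few lines of arithmetic. The sketch below takes the quoted range for $r_\infty$ as given and, as an illustrative choice, uses the key entropy of a simple substitution cipher (26! equally likely keys); the variable names are not from the Handbook.

```python
from math import factorial, log2

R = log2(26)                      # absolute rate of a 26-letter alphabet
r_inf_low, r_inf_high = 1.0, 1.5  # quoted estimates of the rate of English

D_high = R - r_inf_low            # largest redundancy in the quoted range
D_low = R - r_inf_high            # smallest redundancy in the quoted range
print(round(D_low, 1), round(D_high, 1))
print(round(100 * D_low / R), round(100 * D_high / R))

# Unicity distance estimate n0 = H(K) / D for a simple substitution cipher.
key_entropy = log2(factorial(26))
n0 = key_entropy / D_low
print(round(key_entropy, 1), round(n0, 1))
```

With D taken as about 3.2 bits per character, the estimate comes out just under 28 ciphertext characters.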


14.8.2 SUBSTITUTION AND TRANSPOSITION CIPHERS 


Definitions: 

Let $P = p_1 p_2 \cdots p_n$ represent a plaintext message of n characters; often the $p_i$ are interpreted
as integers 0, 1, ..., 25 corresponding to the characters a, b, ..., z.

A simple substitution cipher S replaces each plaintext character by a fixed substitute
according to a permutation $\pi$ on the source alphabet. This means that S replaces the
string $p_1 p_2 \cdots p_n$ with $\pi(p_1)\pi(p_2) \cdots \pi(p_n)$. This can be written as
$S(p_1 p_2 \cdots p_n) = \pi(p_1)\pi(p_2) \cdots \pi(p_n)$.

An affine cipher replaces the plaintext character x (represented as an integer) by the 
ciphertext character (ax + b) mod 26, where a and b are integers with a relatively prime 
to 26. When a = 1, an affine cipher is called a shift cipher since each letter is shifted 
a fixed number of positions, with wrap around, in the alphabet. The shift cipher where 
each character is shifted three positions, that is, where x is mapped to (x + 3) mod 26,
is known as the Caesar cipher. 
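
A minimal sketch of the affine cipher follows, assuming lowercase messages over a to z mapped to 0 to 25. Decryption multiplies by an inverse of a modulo 26, which exists exactly when a is relatively prime to 26; the function names are illustrative.

```python
from math import gcd


def affine_encrypt(text, a, b):
    """Map each letter x to (a*x + b) mod 26; a must be relatively prime to 26."""
    if gcd(a, 26) != 1:
        raise ValueError("a must be relatively prime to 26")
    return "".join(chr((a * (ord(ch) - 97) + b) % 26 + 97) for ch in text)


def affine_decrypt(text, a, b):
    """Invert the affine map using an inverse of a modulo 26."""
    a_inv = pow(a, -1, 26)  # modular inverse; needs Python 3.8 or later
    return "".join(chr(a_inv * (ord(ch) - 97 - b) % 26 + 97) for ch in text)


caesar = affine_encrypt("xyz", 1, 3)          # shift cipher with shift 3
secret = affine_encrypt("affinecipher", 5, 8)
print(caesar, secret, affine_decrypt(secret, 5, 8))
```

Setting a = 1 and b = 3 gives the Caesar cipher, with the last letters of the alphabet wrapping around to the first.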

A simple transposition cipher T (with fixed period d) divides plaintext into d-character
blocks and rearranges these characters by a permutation $\pi$ on the numbers
1, 2, ..., d. This can be written as $T(p_1 p_2 \cdots p_n) = p_{\pi(1)} p_{\pi(2)} \cdots p_{\pi(d)}\, p_{d+\pi(1)} \cdots$.

A full Vigenère cipher V of period d consists of d simple substitutions defined by
permutations $\pi_0, \ldots, \pi_{d-1}$ used in sequence: $V(p_i) = \pi_{i \bmod d}(p_i)$. Ciphers such as
the full Vigenère are called polyalphabetic, since d different alphabetic substitutions are
used.

The simple Vigenère cipher of period d restricts each permutation $\pi_i$ to a simple
shift, so that the key may be represented by a d-letter sequence $k_0, \ldots, k_{d-1}$;
encryption then consists of simply adding key characters to plaintext:
$V(p_i) = (p_i + k_{i \bmod d}) \bmod 26$. Such ciphers are also called periodic substitution ciphers.
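
The simple Vigenère cipher can be sketched in a few lines. The code below assumes lowercase text and key over a to z and counts positions from 0, so that character i is shifted by key letter i mod d; the function name and the `decrypt` flag are illustrative.

```python
def vigenere(text, key, decrypt=False):
    """Add (or, when decrypting, subtract) key letters to the text modulo 26."""
    sign = -1 if decrypt else 1
    shifts = [ord(k) - 97 for k in key]
    return "".join(
        chr((ord(ch) - 97 + sign * shifts[i % len(key)]) % 26 + 97)
        for i, ch in enumerate(text)
    )


ciphertext = vigenere("attackatdawn", "lemon")
print(ciphertext, vigenere(ciphertext, "lemon", decrypt=True))
print(vigenere("xyz", "d"))   # period 1: an ordinary shift cipher
```

With a one-letter key the cipher reduces to a shift cipher, so the key "d" reproduces the Caesar shift of three positions.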

A Hill cipher is a cipher with an m x m matrix K as its key that encrypts plaintext by
splitting it into blocks of size m and mapping the plaintext block $x = (x_1, x_2, \ldots, x_m)$
to the ciphertext block $(y_1, y_2, \ldots, y_m) = xK$, with arithmetic carried out modulo 26.
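
A small Hill cipher with m = 2 can be sketched as follows. The key matrix K below is an arbitrary illustrative choice with determinant 9, which is relatively prime to 26; the inverse is computed with the 2 x 2 adjugate formula, and the function names are assumptions of this sketch.

```python
def hill_block(x, K):
    """Multiply the row vector x by the matrix K, modulo 26."""
    m = len(K)
    return tuple(sum(x[i] * K[i][j] for i in range(m)) % 26 for j in range(m))


def inverse_2x2_mod26(K):
    """Inverse of a 2 x 2 matrix modulo 26 via the adjugate formula."""
    (a, b), (c, d) = K
    det_inv = pow((a * d - b * c) % 26, -1, 26)  # fails if gcd(det K, 26) != 1
    return ((d * det_inv % 26, -b * det_inv % 26),
            (-c * det_inv % 26, a * det_inv % 26))


K = ((3, 3), (2, 5))
K_inv = inverse_2x2_mod26(K)

plain = (7, 4)                     # the letters h, e
cipher = hill_block(plain, K)
print(cipher, K_inv, hill_block(cipher, K_inv))
```

Decrypting a block is the same matrix multiplication with the inverse matrix in place of K.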

A homophonic substitution is a cipher in which each plaintext character x in the
source language is associated with a set $S_x$ of ciphertext characters, where the sets $S_x$
are pairwise disjoint; each time x is to be encrypted, one element of $S_x$ is chosen at
random. The cardinality of $S_x$ is chosen to be proportional to the frequency
of x in the source language, to flatten out frequency distributions.


Facts: 

1. Simple transposition and substitution ciphers, and related symmetric-key ciphers, 
are often called classical ciphers since they were designed and used in ancient times. 

2. The permutation that serves as key of a substitution cipher can often be represented 
more compactly than by specifying the permutation in full. 

3. The size of the full key space of substitution ciphers provides an upper bound on
security, but is often a poor indication. For example, for a simple substitution, there
are 26! possible keys providing key entropy of 88 bits, or approximately 2^88 keys to
search through if one resorts to exhaustive cryptanalysis.

4. The unicity distance of a simple substitution can be estimated to be 28 characters; 
all simple substitutions applied to English messages can be trivially cryptanalyzed given 
about this many characters. 

5. Periodic substitution ciphers may be cryptanalyzed by first deducing the period of
the cipher (by one of several known techniques, for example, the index-of-coincidence
introduced by W. Friedman, c. 1920, or the Kasiski method), and then solving d simple
substitution ciphers. (See [St95] for details.)

6. The simple Vigenere cipher of period d = 1 is a shift cipher. 

7. Decryption of the affine cipher with encryption function e(x) = (ax + b) mod 26
is carried out using the decryption function d(y) = a^{-1}(y - b) mod 26, where a^{-1} is an
inverse of a modulo 26 and y is the ciphertext character associated with the plaintext
character x.

8. Decryption of the Hill cipher with encryption function e(x) = xK, where x =
(x_1, x_2, ..., x_m) and K is an m x m matrix, is carried out using the decryption function
d(y) = yK^{-1}, where K^{-1} is the inverse of K modulo 26 and y is the ciphertext block
(y_1, y_2, ..., y_m) associated with the plaintext block x. Note that for K^{-1} to exist, it
must be the case that gcd(det K, 26) = 1.

9. The ideas of simple transposition and substitution can be combined and compounded 
in countless ways. While not secure individually, they can be combined to construct the 
powerful class of product ciphers which include DES (§14.8.3). 

10. Data expansion is inherent in homophonic substitutions as the ciphertext character 
set must be larger than the plaintext set. 

11. Codes differ from ciphers in that codes employ a codebook or dictionary which
specifies words or phrases that are used to substitute for plaintext words or phrases
requiring encryption. Decryption is accomplished by a reverse-codebook, indexed by the
entries (rather than index terms) of the encryption codebook. The use of such codes is
not easily automated, and they no longer see much use.


Examples: 

1. The Caesar cipher (shifting every letter by three modulo 26) sends the plaintext 
message ZERO to the ciphertext message CHUR. 


2. The affine cipher with encryption function e(x) = (7x + 10) mod 26 sends the
plaintext message PLEASESENDMONEY to the ciphertext message LJMKGMGMXFQEXMW.
For example, the letter P, which corresponds to the number 15, is sent to
e(15) = (7 · 15 + 10) mod 26 = 11, which corresponds to the letter L. Decryption is
done using the function d(y) = (15y + 6) mod 26.


3. The Vigenere cipher of period 4 with key (1,0,13,3) sends the plaintext message 
RENAISSANCE to the ciphertext message SEADJSFDOCR. 


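4. The Hill cipher that has the matrix K with rows (11, 3) and (8, 7) as its key encrypts
the plaintext JULY (which corresponds to the string 9 20 11 24) to the string DELW
(which corresponds to the string 3 4 11 22). Here each two-letter block is treated as a
column vector x and mapped to Kx mod 26; for instance, the block (9, 20) is sent to
(11 · 9 + 3 · 20, 8 · 9 + 7 · 20) mod 26 = (3, 4). Decryption is carried out using the
matrix with rows (7, 23) and (18, 11), which is an inverse of the encryption matrix
modulo 26.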
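
The affine, simple Vigenere, and Hill examples can be checked with a short Python sketch. The function names and the letter-encoding helpers below are illustrative assumptions rather than anything defined in the handbook. The sketch also assumes upper-case input whose length is a multiple of the block size, and it applies the Hill key to each block as a column vector, which is the convention that reproduces DELW from JULY for the key shown.

```python
from math import gcd

A = ord("A")

def to_nums(text):
    """Map letters A..Z to integers 0..25."""
    return [ord(ch) - A for ch in text]

def to_text(nums):
    """Map integers (reduced mod 26) back to letters A..Z."""
    return "".join(chr(n % 26 + A) for n in nums)

def affine_encrypt(plain, a, b):
    """e(x) = (a*x + b) mod 26; a must be relatively prime to 26."""
    if gcd(a, 26) != 1:
        raise ValueError("a must be relatively prime to 26")
    return to_text((a * x + b) % 26 for x in to_nums(plain))

def affine_decrypt(cipher, a, b):
    """d(y) = a_inv * (y - b) mod 26, with a_inv an inverse of a modulo 26."""
    a_inv = pow(a, -1, 26)  # modular inverse; needs Python 3.8+
    return to_text(a_inv * (y - b) % 26 for y in to_nums(cipher))

def vigenere_encrypt(plain, key):
    """Simple Vigenere: add key[i mod d] to the i-th plaintext letter."""
    d = len(key)
    return to_text((p + key[i % d]) % 26 for i, p in enumerate(to_nums(plain)))

def hill_encrypt(plain, K):
    """Map each block x of size m to K*x mod 26 (x taken as a column vector)."""
    m = len(K)
    nums = to_nums(plain)
    out = []
    for start in range(0, len(nums), m):
        x = nums[start:start + m]
        out.extend(sum(K[r][c] * x[c] for c in range(m)) % 26 for r in range(m))
    return to_text(out)

print(affine_encrypt("ZERO", 1, 3))                    # Caesar cipher: CHUR
print(affine_encrypt("PLEASESENDMONEY", 7, 10))
print(vigenere_encrypt("RENAISSANCE", [1, 0, 13, 3]))  # SEADJSFDOCR
print(hill_encrypt("JULY", [[11, 3], [8, 7]]))         # DELW
print(hill_encrypt("DELW", [[7, 23], [18, 11]]))       # JULY (inverse key)
```

Because the Hill decryption matrix is just another key, the same `hill_encrypt` routine decrypts when it is given the inverse matrix.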


14.8.3 BLOCK CIPHERS 
Definitions: 

A block cipher derives its name from the property that it processes the plaintext 
stream after grouping it into pieces or blocks consisting of a fixed number of characters, 
thereafter operating on the block as a whole. Each block of plaintext is enciphered 
independently of preceding and succeeding plaintext input. 

The U. S. Data Encryption Standard (DES) is a block cipher widely used in commercial
applications. It has been adopted as a standard by the United States government.

The mode of operation of an n-bit block cipher describes how the cipher processes 
messages with more than n bits. 

In Electronic Codebook (ECB) mode the plaintext message m is split into n-bit
blocks m = m_1 m_2 ... m_l. Each message block is encrypted independently using the
same secret key k:

c_i = E_k(m_i), 1 ≤ i ≤ l.

In Cipher Block Chaining (CBC) mode each ciphertext block is dependent on all
previous plaintext blocks. Encryption is performed as follows, given an n-bit
initialization vector IV:

c_1 = E_k(m_1 ⊕ IV); c_i = E_k(m_i ⊕ c_{i-1}), 2 ≤ i ≤ l.

Decryption is performed as follows:

m_1 = D_k(c_1) ⊕ IV; m_i = D_k(c_i) ⊕ c_{i-1}, 2 ≤ i ≤ l.

In Cipher Feedback (CFB) mode the plaintext message is split into t-bit blocks
m = m_1 m_2 ... m_l, where 1 ≤ t ≤ n. An n-bit shift register is initialized to the value
s_0 = IV. Encryption is then performed as follows:

c_i = m_i ⊕ MSB_t(E_k(s_{i-1})), 1 ≤ i ≤ l,

where MSB_t(v) denotes the t most significant bits of v, and where s_i is obtained
from s_{i-1} by shifting the contents of the register t positions to the left and moving c_i
into the rightmost t positions of the register.

In Output Feedback (OFB) mode the plaintext message is split into t-bit blocks
m = m_1 m_2 ... m_l, where 1 ≤ t ≤ n. Encryption is performed as follows:

c_i = m_i ⊕ MSB_t(s_{i-1}), 1 ≤ i ≤ l,

where s_0 = E_k(IV) and s_i = E_k(s_{i-1}) for 1 ≤ i ≤ l - 1, and MSB has the same meaning
as in the definition of CFB mode.
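
The mode definitions can be exercised with a minimal Python sketch. The 8-bit "block cipher" `E`/`D` below is a hypothetical stand-in chosen only because it is an easily inverted keyed bijection; it offers no security and is not DES. OFB is shown for the special case t = n, so the whole register output is used as keystream.

```python
def E(k, x):
    """Toy 8-bit keyed bijection on 0..255 (hypothetical, insecure)."""
    return ((x ^ k) * 5 + 13) % 256

def D(k, y):
    """Inverse of E under the same key (5 is invertible modulo 256)."""
    return (((y - 13) * pow(5, -1, 256)) % 256) ^ k

def ecb_encrypt(k, blocks):
    """c_i = E_k(m_i): every block is encrypted independently."""
    return [E(k, m) for m in blocks]

def cbc_encrypt(k, iv, blocks):
    """c_1 = E_k(m_1 XOR IV); c_i = E_k(m_i XOR c_{i-1})."""
    out, prev = [], iv
    for m in blocks:
        prev = E(k, m ^ prev)
        out.append(prev)
    return out

def cbc_decrypt(k, iv, blocks):
    """m_1 = D_k(c_1) XOR IV; m_i = D_k(c_i) XOR c_{i-1}."""
    out, prev = [], iv
    for c in blocks:
        out.append(D(k, c) ^ prev)
        prev = c
    return out

def ofb_crypt(k, iv, blocks):
    """OFB with t = n; the same routine encrypts and decrypts."""
    out, s = [], E(k, iv)      # s_0 = E_k(IV)
    for b in blocks:
        out.append(b ^ s)      # c_i = m_i XOR s_{i-1}
        s = E(k, s)            # s_i = E_k(s_{i-1})
    return out

key, iv = 0x5A, 0xC3
msg = list(b"AAAA")            # four identical plaintext blocks

print(ecb_encrypt(key, msg))   # identical blocks give identical ciphertext
print(cbc_encrypt(key, iv, msg))
print(bytes(cbc_decrypt(key, iv, cbc_encrypt(key, iv, msg))))
print(bytes(ofb_crypt(key, iv, ofb_crypt(key, iv, msg))))
```

Running ECB on repeated blocks makes the repetition visible in the ciphertext, while chaining through the previous ciphertext block in CBC hides it.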

Facts: 

1. Transposition ciphers are examples of block ciphers. 

2. DES is the most widely used block cipher. 

3. DES was published as a U. S. Federal Information Processing Standard in 1977. It 
resulted from an IBM submission to a 1974 request by the U. S. National Bureau of 
Standards (NBS) (which has now become NIST) soliciting encryption algorithms for 
the protection of computer data. 

4. DES processes plaintext blocks of n = 64 bits, producing ciphertext blocks of 64 
bits. 

5. The encryption mapping E is parametrized by a secret 56-bit key k. Since decryption
requires that the mapping be invertible, E_k is a bijection.

6. The total number of distinct permutations on an n-bit space is (2^n)!; DES implements
only a tiny fraction of these: at most 2^56, corresponding to the number of
distinct DES keys.

7. DES, and in fact all block ciphers, can be viewed as large substitution ciphers. 
For a fixed key k, each 64-bit plaintext “character” is substituted by a fixed 64-bit 
ciphertext “character”. The same techniques that make simple substitution ciphers 
trivial to cryptanalyze do not directly threaten the security of DES or similar ciphers, 
however, due to the large block size. 

8. Encryption of each 64-bit block proceeds in sixteen stages or rounds. 

9. The 56-bit key k is used to create sixteen 48-bit subkeys K_i, one for each round.

10. Within each round, eight fixed, carefully selected 6-to-4 bit substitution mappings
(S-boxes) S_i, collectively denoted S, are used.

11. The initial 64-bit plaintext is divided into two 32-bit halves, L_0 and R_0.

12. Each round is functionally equivalent, taking 32-bit inputs L_{i-1} and R_{i-1} from the
previous round and producing outputs L_i and R_i for 1 ≤ i ≤ 16, as follows:

L_i = R_{i-1}; R_i = L_{i-1} ⊕ F(R_{i-1}, K_i); F(R_{i-1}, K_i) = P(S(E(R_{i-1}) ⊕ K_i)).

Here E is a fixed expansion permutation mapping 32 bits to 48 bits (all bits are used
once, some are taken twice), and P is another fixed permutation on 32 bits.

13. An initial permutation (IP) precedes the first round, and its inverse is applied
following the last round.

14. Decryption makes use of the same algorithm and the same key, except that subkeys 
are applied to the internal rounds in the reverse order. 

15. Each round involves both (bitwise) substitution and transposition. 

16. A complete description of the U. S. Data Encryption Standard (DES) algorithm can 
be found in the U. S. Federal Information Processing Standards Publication 46 (FIPS 
46: Data Encryption Standard). 

17. Given a plaintext-ciphertext pair, exhaustive cryptanalysis of DES is possible:
all 2^56 keys could be checked to determine which maps the given plaintext to the given
ciphertext. With current technologies, such an attack is now feasible in practice for
strongly-motivated adversaries. See [Gi98] for example.

18. Results of experiments indicate that all bits of the ciphertext depend on all bits 
of the key and all bits of the plaintext; changing any single bit of the plaintext or key 
causes each ciphertext bit to change with probability about 0.5. 

19. There are several ways in which a block cipher can be employed. Let E be a block 
cipher parametrized by a key k. Suppose that E processes plaintext blocks of n bits, 
producing ciphertext blocks of n bits. The initial value IV is a randomly chosen n-bit 
block known to the sender and receiver; IV may be exchanged in the clear (except in 
output feedback mode (OFB)) or it may be transmitted by the sender to the receiver 
by encrypting it in ECB mode. 

20. The weakness of ECB mode is that two identical plaintext blocks are always en- 
crypted to the same ciphertext block. An advantage of ECB mode is that transmission 
errors are not propagated from block to block. 

21. If a different IV is selected for each message encrypted in CBC mode, then two 
identical plaintexts will, in general, be encrypted to different ciphertexts. If the integrity 
of the IV is not protected, then an opponent can selectively manipulate the bits of the 
first message block by manipulating the bits of the IV. This situation may be avoided 
by encrypting the IV. 

22. In CBC mode propagation of transmission errors is limited, since a message
block m_i depends on only two ciphertext blocks, c_{i-1} and c_i.

23. Decryption in CFB mode is performed by initializing the shift register to the value
s_0 = IV, and then computing

m_i = c_i ⊕ MSB_t(E_k(s_{i-1})), 1 ≤ i ≤ l.

24. In CFB mode a transmission error may affect several message blocks. Note that 
the block cipher E is operated in encryption mode at both the sending and receiving 
ends. 

25. In OFB mode decryption is performed by computing

m_i = c_i ⊕ MSB_t(s_{i-1}), 1 ≤ i ≤ l.

26. In OFB there is no error propagation. A single bit error in the ciphertext causes
a single bit error in the recovered plaintext. As with CFB mode, the block cipher E is
operated in encryption mode at both the sending and receiving ends.
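
The round structure in Fact 12 and the reversed-subkey decryption in Fact 14 can be illustrated with a toy Feistel network. This is a sketch under loud assumptions: the round function `F` below is a hypothetical stand-in built from SHA-256, not the DES function P(S(E(R) ⊕ K)); the subkeys are made-up constants rather than the output of the DES key schedule; and the initial permutation is omitted.

```python
import hashlib

def F(r, k):
    """Hypothetical round function (NOT the DES F): 32 bits in, 32 bits out."""
    data = r.to_bytes(4, "big") + k.to_bytes(6, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:4], "big")

def feistel(block, subkeys):
    """Apply L_i = R_{i-1}, R_i = L_{i-1} XOR F(R_{i-1}, K_i) to a 64-bit block."""
    left, right = block >> 32, block & 0xFFFFFFFF
    for k in subkeys:
        left, right = right, left ^ F(right, k)
    # Swap the halves on output so that the same routine undoes itself
    # when the subkeys are supplied in reverse order.
    return (right << 32) | left

# Sixteen made-up 48-bit subkeys (a real cipher derives these from the key).
subkeys = [(0x0123456789AB * (i + 1)) % 2**48 for i in range(16)]

plain = 0x0123456789ABCDEF
cipher = feistel(plain, subkeys)
recovered = feistel(cipher, subkeys[::-1])   # same algorithm, reversed subkeys
print(hex(cipher), hex(recovered))
```

Note that inversion never requires inverting `F` itself; the XOR structure of each round is what makes the network a bijection for any round function.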

14.8.4 STREAM CIPHERS 


Definitions: 

A stream cipher is a symmetric-key cipher that encrypts individual characters of a 
plaintext message, or small units. (In contrast, a block cipher tends to encrypt groups 
of characters, or larger units.) 

A synchronous stream cipher is a stream cipher in which the keystream is generated 
independently of the message. 

A self-synchronizing stream cipher is a stream cipher capable of reestablishing 
proper decryption automatically after loss of synchronization, with only a fixed number 
of plaintext characters unrecoverable. 

The one-time pad is a stream cipher with the following encryption function: each bit 
of plaintext is XORed to the next bit of a truly random key, which is never reused for 
encryption and is of bit length equal to that of the plaintext. Decryption is accomplished 
by applying the same process, with the same key, to the ciphertext string. 
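
A minimal sketch of the one-time pad on bytes follows. The function name `otp_xor` is an illustrative choice, and Python's `secrets.token_bytes` (an operating-system random source) is used here as a stand-in for the truly random key the definition calls for.

```python
import secrets

def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the corresponding key byte."""
    if len(key) != len(data):
        raise ValueError("key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"ATTACK AT DAWN"
key = secrets.token_bytes(len(message))   # fresh key, to be used only once
ciphertext = otp_xor(message, key)

# Decryption applies the same process with the same key.
print(otp_xor(ciphertext, key))
```

Since XOR is its own inverse, a single routine serves for both encryption and decryption.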

Facts: 

1. Stream ciphers are more appropriate, and in some cases mandatory (for example, in 
some telecommunications applications), when buffering is limited and characters must 
be individually processed as they are received. 

2. A stream cipher typically consists of a generator which produces a pseudorandom 
bit sequence (the key) which is then XORed (added modulo 2) with the plaintext bits. 

3. In a synchronous stream cipher, both the sender and receiver must be synchronized
(using the same key and operating at the same position (state) within that key)
for proper decryption. If synchronism is lost then decryption fails and
can only be restored through additional techniques for resynchronization (for example,
reinitialization, or the receiver trying possible offsets).

4. The OFB mode of a block cipher is an example of a synchronous stream cipher.

5. The CFB mode of a block cipher is an example of a self-synchronizing stream cipher.

6. For such ciphers, self-synchronization is possible because the encryption/decryption
mappings depend only on a fixed number of preceding ciphertext characters.

7. The one-time pad is the best-known stream cipher. It is also referred to as the
Vernam cipher, originating from work of G. Vernam in 1917.

8. The one-time pad offers unconditional security, at the price of a key of length equal 
to that of the plaintext, which can be used only once. It is an example of a synchronous 
stream cipher. 

9. An extensive study of stream ciphers can be found in [Ru86]. 


14.8.5 KEY DISTRIBUTION PROBLEM 
Definitions: 

The problem of producing secure keys that can be used by each of a group of users to 
be able to communicate in secret with every other user is called the key distribution 
problem. 

A key distribution center (KDC) is a trusted third party that distributes short-term
secret keys for secure communications from a particular party to another.

Facts: 


1. The security of all cryptographic mechanisms depends on the secrecy and/or au- 
thenticity of keying material. 

2. Consider a system that uses only symmetric cryptographic algorithms such as the
U. S. Data Encryption Standard (DES), with n users. If each pair of users is to be
able to communicate privately, then each user must acquire and maintain n-1 secret
keys, one for each other party; overall, this requires n(n-1)/2 keys in the system. These
keys must be distributed by secure means, for example by each pair meeting in person
or by trusted couriers, prior to the commencement of a secure communication. Such
distribution is typically both inconvenient and costly, and increasingly unmanageable
as n grows.

3. A solution to the key distribution problem is obtained by using public-key
cryptography (§14.9.1).

4. Another solution of the key distribution problem is to make use of a trusted third
party T, as follows. Each party A shares a unique long-term secret key K_AT with
T. Any party A may acquire a short-term secret key or session key to communicate
securely with any other party B, using the third party T as a key distribution center in
the following way:

• using K_AT to establish a secure channel with T, A requests from T a new random
secret key to use with B;

• T creates such a key K_AB, transfers one copy of it to A using the secure channel
facilitated by K_AT, and either makes another copy of it available directly to B
using the secure channel facilitated by K_BT, or sends a copy of K_AB encrypted
under K_BT to A over the secure channel to A; A then transfers this encrypted
key to B.

5. The Kerberos protocol, originating from Project Athena at M.I.T. in 1987 and
based on a 1978 protocol of R. Needham and M. Schroeder, is a particular example of
an authenticated key distribution protocol based on symmetric cryptographic techniques
and the use of a KDC.

6. An alternative to a KDC is to use a trusted third party as a key translation center
(KTC); in this case, party A itself creates the key K_AB intended for use with a party B,
transfers it securely to the trusted party under the channel secured by K_AT, and relies
on the trusted party to decrypt the key intended for B, secure it specifically for B by
reencrypting it under the key K_BT, and then make this encrypted version available to B
either directly or via A.

7. The use of both KDCs and KTCs for cryptographic key establishment was popu- 
larized within the U. S. financial community by the ANSI X9.17 standard. 

8. The use of a KDC or KTC to solve the key distribution problem can be pictured
as a spoked wheel, with each user on the perimeter at the end of a spoke, and the
trusted party at the center. A secure channel between any two users A and B on
the perimeter can be established by using the secure channels provided by the spokes
(keys K_AT and K_BT), set up during system initialization, to establish a secure channel
directly (by key K_AB). Each party now needs initially to acquire only a single secret
key (corresponding to a single spoke), rather than n-1 keys as before.


14.9 PUBLIC-KEY SYSTEMS 

The invention of public-key cryptosystems in the mid-1970s has had a profound impact 
on cryptography. In a public-key cryptosystem, there are public encryption keys that 
can be publicly shared and secret decryption keys that cannot be found, using a practical 
amount of computation, from the public encryption keys. 

This fundamental difference between public-key cryptosystems and symmetric-key 
cryptosystems, where knowledge of an encryption key brings knowledge of the corre- 
sponding decryption key, makes public-key cryptography useful for many different types 
of practical applications. 


14.9.1 BASIC INFORMATION 

Definitions: 

Public-key cryptography is the study of codes where each party has a pair of keys 
(a private key and a public key) and only needs to keep the private key secret. 

Facts: 

1. Public-key cryptography was first proposed by W. Diffie and M. Hellman in 1976
[DiHe76].

2. The fundamental concept is that the public key, which can be made known to 
everyone, allows anyone to encrypt messages for A, but the decryption of these messages 
can be carried out only with knowledge of the corresponding private key, which only 
party A knows. 

3. A necessary condition for a public-key cryptosystem to be secure is that it be
infeasible to derive a private key from the corresponding public key.

4. Three advantages offered by public-key systems over private-key systems are the
following:

• They provide a solution to the key distribution problem without using a KDC or
a KTC. Each party requires only one public key and one private key in order to
communicate securely with all other parties, as opposed to a separate private
key to communicate with each. A new prerequisite, however, is that authentic
copies of other parties' public keys be available by some means.

• They provide an elegant solution to the problem of digital signatures.

• They allow public key distribution systems, whereby, surprisingly, secret symmetric
keys can be derived jointly by two remote parties through communications
over unsecured public channels.

5. Diffie and Hellman proposed solutions to the public key distribution problem, and
shortly after they conceived the notion, practical instantiations of both public-key
encryption and public-key signature systems were proposed: knapsack encryption schemes
and both RSA signature and encryption schemes.

6. Many different public-key cryptosystems have been proposed. Among the most
important of these are:

• the RSA cryptosystem (security based on the difficulty of factoring large integers
and the problem of finding eth roots modulo a composite integer where e is an
integer);

• the Rabin cryptosystem (security based on the difficulty of factoring large integers
and the problem of finding square roots modulo a composite integer);

• the El Gamal cryptosystem (security based on the difficulty of finding discrete
logarithms);

• the McEliece cryptosystem (security based on the difficulty of decoding certain
linear codes);

• the Merkle-Hellman cryptosystem (security based on the difficulty of the subset
sum problem);

• the elliptic curve cryptosystem (security based on the theory of elliptic curves);
see the Certicom ECC Tutorials and Whitepapers page at the website:
http://www.certicom.com/ecc/index.htm

• the NTRU cryptosystem (security based on the difficulty of lattice problems); see
the NTRU Public Key Cryptosystem Overview at the website:
http://www.ntru.com/tutorials/techsummary.htm

7. The current status of public-key cryptosystems is that they offer many advantages 
for key establishment and digital signatures, but so far have generally been too compu- 
tationally expensive for bulk encryption. 

8. Typically, in practice symmetric ciphers like DES continue to be used for encryp- 
tion, public-key systems like RSA are used for digital signatures, and a combination of 
symmetric and public-key techniques are used for key establishment. 


14.9.2 KNAPSACK ENCRYPTION SCHEME 
Definitions: 

Given a set of positive integers {a_1, a_2, ..., a_n} and a specified sum s, the knapsack
problem is the problem of finding a 0-1 vector (x_1, x_2, ..., x_n) such that
Σ_{i=1}^{n} a_i x_i = s, or determining that such a vector does not exist.

A super-increasing sequence is a sequence (a_1, a_2, ..., a_n) of positive integers with the
property that a_i > Σ_{j=1}^{i-1} a_j for each i = 2, ..., n.

The knapsack cryptosystem is a cryptosystem in which encryption is carried out 
using a super-increasing sequence of integers. 

Facts: 

1. The knapsack encryption scheme, due to R. Merkle and M. Hellman, was the first
concrete realization of a public-key encryption scheme. Its security is based on the
knapsack problem.

2. Although the knapsack problem, also known as the subset sum problem, is known 
to be NP-hard, there are special instances of the problem which are easy to solve. 

3. For a super-increasing sequence the knapsack problem is very easy to solve. The
Merkle-Hellman knapsack encryption scheme disguises a super-increasing sequence by
modular multiplication and a permutation.

4. Key generation for the knapsack encryption scheme can be carried out as follows.
An integer n is fixed. Each user does the following:

• choose a super-increasing sequence of positive integers a_1, a_2, ..., a_n, and choose
a modulus M with M > a_1 + a_2 + ... + a_n;

• choose an integer W, 1 < W < M - 1, with gcd(W, M) = 1;

• choose a random permutation π of the integers {1, 2, ..., n};

• compute b_i = W a_{π(i)} mod M, for i = 1, 2, ..., n;

• publish the public key (b_1, b_2, ..., b_n); the private key is (π, M, W, a_1, a_2, ..., a_n).

5. The basic Merkle-Hellman knapsack encryption scheme operates as follows, where
person B sends a message to person A:

• encryption: B does the following:

o look up A's public key (b_1, b_2, ..., b_n);
o represent the message m as a binary string of length n, m = m_1 m_2 ... m_n;
if the message is too big, break it into blocks;
o compute c = m_1 b_1 + m_2 b_2 + ... + m_n b_n;
o send c to A.

• decryption: A does the following:

o compute d = W^{-1} c mod M;
o solve the super-increasing knapsack by finding integers r_1, r_2, ..., r_n, r_i ∈
{0, 1}, such that d = r_1 a_1 + r_2 a_2 + ... + r_n a_n;
o conclude that the message bits are m_i = r_{π(i)}, i = 1, 2, ..., n.

6. The Merkle-Hellman scheme, as well as sundry variations of it, have all been shown 
to be insecure. Essentially, this is because the underlying easy knapsack can be recovered 
from the public knapsack with minimal effort. 

Examples: 

1. The sequence 1, 2, 5, 10, 20, 40 is super-increasing, but 1, 2, 6, 10, 18, 30 is not. 

2. The solution to the super-increasing subset problem with super-increasing sequence 
1, 2, 5, 10, 20, 40 and subset sum 27 is 27 = 20 + 5 + 2. 
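
The key generation, encryption, and decryption steps of the basic scheme can be traced on the super-increasing sequence from the examples. This is a toy sketch: the modulus M = 101, multiplier W = 17, the permutation, and all function names are illustrative assumptions, and the permutation is stored with 0-based indices.

```python
def solve_superincreasing(a, s):
    """Greedy solution: 0-1 list r with sum(r[j] * a[j]) == s, or None."""
    r = [0] * len(a)
    for j in reversed(range(len(a))):
        if a[j] <= s:
            r[j] = 1
            s -= a[j]
    return r if s == 0 else None

def make_public_key(a, M, W, perm):
    """b_i = W * a_{perm[i]} mod M."""
    return [W * a[p] % M for p in perm]

def encrypt(bits, b):
    """c = m_1 b_1 + ... + m_n b_n."""
    return sum(m * bi for m, bi in zip(bits, b))

def decrypt(c, a, M, W, perm):
    d = pow(W, -1, M) * c % M            # d = W^{-1} c mod M (Python 3.8+)
    r = solve_superincreasing(a, d)
    return [r[p] for p in perm]          # m_i = r_{perm[i]}

a = [1, 2, 5, 10, 20, 40]                # private super-increasing sequence
M, W = 101, 17                           # M > sum(a) = 78, gcd(W, M) = 1
perm = [3, 0, 5, 1, 4, 2]                # private permutation (0-based)
b = make_public_key(a, M, W, perm)

bits = [1, 0, 1, 1, 0, 0]
c = encrypt(bits, b)
print(b, c, decrypt(c, a, M, W, perm))
print(solve_superincreasing(a, 27))      # [0, 1, 1, 0, 1, 0], i.e. 27 = 2 + 5 + 20
```

The greedy pass from the largest element downward is what makes the super-increasing case easy: each element exceeds the sum of all smaller ones, so it must be used whenever it fits.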


14.9.3 RSA CRYPTOSYSTEM 


Definition: 

The RSA cryptosystem is a public-key cryptosystem that encrypts messages using 
modular exponentiation, where the modulus is the product of two very large primes. 

Facts: 

1. The RSA cryptosystem was invented by R. Rivest, A. Shamir, and L. Adleman in
1978, and is the most widely used public-key cryptosystem today.

2. The RSA cryptosystem supports both secrecy and digital signatures, and its security 
is based on the difficulty of factoring integers. 

3. Keys are generated in RSA when each user does the following:

• pick two large primes p and q, each roughly the same size. Compute n = pq and
φ(n) = (p - 1)(q - 1);

• select a random integer e, 1 < e < φ(n), such that gcd(e, φ(n)) = 1;

• using the extended Euclidean algorithm (§4.2.2), compute the unique integer d,
1 < d < φ(n), such that ed ≡ 1 (mod φ(n));

• publish the public key (n, e); the private key is d.

4. The RSA system is used as follows for B to send a message to A: 

• encryption: B does the following:

o look up A's public key (n, e);
o represent the message as an integer m in the interval [0, n - 1]; if the message
is too big, break it into blocks;
o compute c = m^e mod n;
o send c to A.

• decryption: A does the following:

o use the private key d to recover m = c^d mod n.

5. The RSA system can be used to send signed messages as follows, where A signs a
message for B:

• signature generation: A does the following:

o represent the message m as an integer in the interval [0, n - 1]; if the message
is too big, break it into blocks;
o use his/her private key d to compute s = m^d mod n;
o send the signature s to B (the message m will be recovered from s itself).

• signature verification: B does the following:

o look up A's public key (n, e);
o recover the message m = s^e mod n;
o accept A's signature, provided that m is "meaningful".

6. In practice, one does not select an exponent e at random, but instead chooses some
small value such as 3, 17, or 2^16 + 1.

7. One technique for rendering a message meaningful is to add some prearranged
redundancy to the message, for example by requiring that m begin with a predetermined
64-bit pattern. Another is to use a suitable hash function (§14.9.2) before signing, even
when the message m is short enough to fit in a single block.

8. A common technique used to avoid having to sign each block of a long message m is
to first compute m* = H(m), where H is a public one-way hash function that outputs
integers in the interval [0, n - 1], and then send the message m along with the signature
of the hash value, s = (m*)^d mod n, to B. Person B can verify the signature by
computing s^e mod n and H(m), and checking that these two quantities are the same.

9. Breaking the RSA encryption or signature schemes is widely believed to be as dif- 
ficult as factoring the modulus n in these schemes, although such an equivalence has 
never been proven. 

10. Given the latest progress on the factorization of large integers, a 512-bit modulus n 
will provide only marginal security from concerted attack; as of 1999, a modulus n of 
at least 768 bits is recommended. 

11. More information about the RSA cryptosystem can be obtained on the Internet at
the RSA Laboratories site

www.rsa.com/rsalabs

Example: 

1. Take the modulus in the public key encryption scheme to be 2537 = 43 · 59 and
the exponent to be 13. The plaintext message PUBLICKEYCRYPTOGRAPHYX,
corresponding to 1520 0111 0802 1004 2402 1724 1519 1406 1700 1507 2423 when letters
are replaced with the corresponding integers in the set {0, 1, ..., 25}, is mapped to the
ciphertext message 0095 1648 1410 1299 0811 2333 2132 0370 1185 1457 1084. For
example, the first block 1520 is mapped to 0095 since 1520^13 mod 2537 = 95.
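
The example can be reproduced with Python's built-in modular exponentiation. This is a sketch with the tiny parameters of the example only (far too small for real use); the helper names `encode` and `decode` and the two-letters-per-block packing are written here for illustration, and `pow(e, -1, phi)` stands in for the extended Euclidean algorithm (it needs Python 3.8 or later).

```python
from math import gcd

p, q, e = 43, 59, 13
n = p * q                        # 2537
phi = (p - 1) * (q - 1)          # 2436
assert gcd(e, phi) == 1
d = pow(e, -1, phi)              # private exponent: e*d = 1 (mod phi)

def encode(text):
    """Pack pairs of letters into 4-digit blocks with A=00, ..., Z=25."""
    nums = [ord(ch) - ord("A") for ch in text]
    return [100 * nums[i] + nums[i + 1] for i in range(0, len(nums), 2)]

def decode(blocks):
    """Unpack 4-digit blocks back into letters."""
    return "".join(chr(b // 100 + 65) + chr(b % 100 + 65) for b in blocks)

def rsa_encrypt(blocks):
    return [pow(m, e, n) for m in blocks]      # c = m^e mod n

def rsa_decrypt(blocks):
    return [pow(c, d, n) for c in blocks]      # m = c^d mod n

plain = encode("PUBLICKEYCRYPTOGRAPHYX")
cipher = rsa_encrypt(plain)
print(" ".join(f"{c:04d}" for c in cipher))    # first block prints as 0095
print(decode(rsa_decrypt(cipher)))

# Signing uses the private exponent; verification uses the public one.
s = pow(plain[0], d, n)
print(pow(s, e, n) == plain[0])
```

Every block stays below n = 2537 because the largest possible block is 2525, so decryption recovers each block exactly.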

14.9.4 EL GAMAL CRYPTOSYSTEM 


Definition: 

The El Gamal cryptosystem is a public-key cryptosystem based on the discrete log- 
arithm problem (see Chapter 4). 

Facts: 

1. The El Gamal cryptosystem was proposed in 1985 by T. El Gamal. 

2. The El Gamal cryptosystem supports both secrecy and digital signatures and its 
security is based on the difficulty of the discrete logarithm problem. 

3. Keys for the El Gamal cryptosystem are generated when each user does the following:

• pick a large prime p and a generator α of the multiplicative group Z_p^* of the
integers modulo p;

• select a random integer a, 1 < a < p - 1, and compute α^a mod p;

• publish the public key (p, α, α^a); the private key is a.

4. The El Gamal encryption scheme works as follows where B sends a message to A:

• encryption: B does the following:

o look up A's public key (p, α, α^a);
o represent the message as an integer m in the interval [0, p - 1]; if the message
is too big, break it into blocks;
o select a random integer k, 1 < k < p - 2;
o compute α^k mod p and m · (α^a)^k mod p;
o send (α^k, m α^{ak}) to A;

• decryption: A does the following:

o use the private key a to compute α^{ak} = (α^k)^a mod p and then compute
α^{-ak} mod p;
o recover m by computing (α^{-ak})(m α^{ak}) mod p.

5. The El Gamal signature scheme operates as follows where A signs a message for B:

• signature generation: to sign a message m of arbitrary length, A does the
following:

o select a random integer k, 1 < k < p - 2, with gcd(k, p - 1) = 1;
o use the extended Euclidean algorithm to compute an integer l, 1 < l < p - 2,
such that kl ≡ 1 (mod p - 1);
o compute r = α^k mod p;
o compute H(m), the hash of m, using a one-way hash function H;
o compute s = l · (H(m) - ar) mod (p - 1);
o send the signature (r, s) along with the message m to B.

• signature verification: B does the following:

o look up A's public key (p, α, α^a);
o compute H(m);
o compute u_1 = (α^a)^r · r^s mod p;
o compute u_2 = α^{H(m)} mod p;
o accept the signature only if u_1 = u_2.

6. Breaking the El Gamal encryption or signature scheme is widely believed to be as
difficult as computing logarithms in Z_p^*, although such an equivalence has never been
proven.

7. Given the latest progress on the discrete logarithm problem, a 512-bit modulus p
will provide only marginal security from concerted attack; as of 1999, a modulus p of
at least 768 bits is recommended.

8. The parameters p and α can be common to a group of users, in which case user A's
public key is just α^a.

9. The El Gamal cryptosystem can be generalized to work in any finite cyclic group G
instead of the multiplicative group Z_p^*. Groups that have been proposed for this purpose
for reasons of practical efficiencies include the following: the multiplicative group
F_{2^m}^* of finite fields of characteristic two and the group of points on an elliptic
curve over a finite field.
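
The encryption and signature steps in Facts 4 and 5 can be followed in a small Python sketch. Everything concrete here is an assumption made for illustration: the toy prime p = 2357 with base α = 2, the private key a = 1751, the fixed values of k (which would be chosen at random in practice), and the stand-in hash H, taken to be SHA-256 reduced modulo p - 1. Range checks on r and s that a real verifier would perform are omitted.

```python
import hashlib
from math import gcd

p, alpha = 2357, 2           # toy prime and base (illustrative values)
a = 1751                     # private key
y = pow(alpha, a, p)         # public value alpha^a mod p

def encrypt(m, k):
    """Return (alpha^k mod p, m * (alpha^a)^k mod p)."""
    return pow(alpha, k, p), m * pow(y, k, p) % p

def decrypt(gamma, delta):
    """Recover m = alpha^(-ak) * (m * alpha^(ak)) mod p."""
    shared = pow(gamma, a, p)                 # alpha^(ak) mod p
    return pow(shared, -1, p) * delta % p     # Python 3.8+ modular inverse

def H(msg: bytes) -> int:
    """Stand-in hash (an assumption): SHA-256 reduced modulo p - 1."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % (p - 1)

def sign(msg, k):
    if gcd(k, p - 1) != 1:
        raise ValueError("k must be relatively prime to p - 1")
    k_inv = pow(k, -1, p - 1)                 # the integer l with k*l = 1 (mod p-1)
    r = pow(alpha, k, p)
    s = k_inv * (H(msg) - a * r) % (p - 1)
    return r, s

def verify(msg, r, s):
    u1 = pow(y, r, p) * pow(r, s, p) % p      # (alpha^a)^r * r^s mod p
    u2 = pow(alpha, H(msg), p)                # alpha^H(m) mod p
    return u1 == u2

gamma, delta = encrypt(2035, 1520)
print(decrypt(gamma, delta))                  # 2035

r, s = sign(b"a short message", 1529)
print(verify(b"a short message", r, s))       # True
```

Verification works because (α^a)^r · r^s = α^{ar + ks}, and ks ≡ H(m) - ar (mod p - 1) by the choice of s.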


14.9.5 McELIECE ENCRYPTION SCHEME 
Definition: 

The McEliece encryption scheme is the encryption method that is the foundation 
of a public-key cryptosystem based on linear codes from the theory of error-correcting 
codes. 

Facts: 

1. R. McEliece introduced the McEliece encryption scheme in 1978 as the basis of a 
public-key cryptosystem. 

2. The security of the McEliece encryption scheme is based on the fact that the general 
decoding problem for linear codes is NP-hard. 

3. Keys are generated in the McEliece encryption scheme in the following way: Integers
k and n are first fixed. Each user does the following:

• choose a k x n generator matrix G for a binary (n, k)-code that can correct t
errors, and for which there is an efficient decoding algorithm;

• choose a random k x k binary nonsingular matrix S;

• choose a random n x n permutation matrix P;

• compute the k x n matrix Ĝ = SGP;

• publish the public key (Ĝ, t); the private key is (S, G, P).

4. A party B sends a message to a party A in the McEliece encryption scheme as
follows:

• B does the following to encrypt the message:

o look up A’s public key $(\hat{G}, t)$;

o represent the message m as a binary string of length k; if the message is too
big, break it into blocks;

o choose a random error vector z of length n and Hamming weight at most t;
o compute $c = m\hat{G} + z$;
o send c to A.

• A does the following to decrypt the received message:

o compute $\hat{c} = cP^{-1}$;

o use the decoding algorithm for the code generated by G to decode $\hat{c}$ to $\hat{m}$;
o compute $m = \hat{m}S^{-1}$.

5. McEliece suggested using a special type of error-correcting code called a Goppa
code with parameters n = 1024, k = 524, t = 50, in step 1 of the key generation
procedure. (For each irreducible polynomial g(x) of degree t over the finite field of $2^m$
elements, there exists a binary Goppa code of length $n = 2^m$ and dimension $k \ge n - mt$
capable of correcting any pattern of t or fewer errors. Furthermore, efficient decoding
algorithms are known for Goppa codes. For further information, see [MaS177].) With
these parameters, the McEliece encryption scheme is believed to form the basis of a
secure public-key cryptosystem.

6. Two disadvantages of the scheme are the large size of public keys and the message 
expansion. 
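
The key generation, encryption, and decryption steps of Facts 3 and 4 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a usable cryptosystem: it substitutes the tiny Hamming (7, 4) code (t = 1) for the Goppa code that McEliece suggested, and the function names (`keygen`, `encrypt`, `decrypt`) and the choice to store $S^{-1}$ and a permutation list in the private key are hypothetical conveniences.

```python
import random

# Assumption: Hamming (7, 4) code in systematic form G = [I | A]; corrects t = 1 error.
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[int(i == j) for j in range(4)] + A[i] for i in range(4)]
# Parity-check matrix H = [A^T | I] for the private syndrome decoder.
H = [[A[i][r] for i in range(4)] + [int(r == j) for j in range(3)] for r in range(3)]
K, N, T = 4, 7, 1

def mat_mul(X, Y):
    """Matrix product over GF(2)."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*Y)] for row in X]

def gf2_inverse(M):
    """Gauss-Jordan inverse over GF(2); raises ValueError if M is singular."""
    n = len(M)
    aug = [row[:] + [int(i == j) for j in range(n)] for i, row in enumerate(M)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col]), None)
        if pivot is None:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [a ^ b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def keygen(rng):
    while True:                      # draw S until it is nonsingular
        S = [[rng.randrange(2) for _ in range(K)] for _ in range(K)]
        try:
            S_inv = gf2_inverse(S)
            break
        except ValueError:
            continue
    perm = list(range(N))
    rng.shuffle(perm)                # P: column j of SGP is column perm[j] of SG
    SG = mat_mul(S, G)
    G_hat = [[row[perm[j]] for j in range(N)] for row in SG]
    return (G_hat, T), (S_inv, perm)  # public key, private key (S^-1 kept for convenience)

def encrypt(public, m, rng):
    G_hat, t = public
    z = [0] * N                      # random error vector of weight at most t
    for pos in rng.sample(range(N), rng.randint(0, t)):
        z[pos] = 1
    return [a ^ b for a, b in zip(mat_mul([m], G_hat)[0], z)]

def decrypt(private, c):
    S_inv, perm = private
    c_hat = [0] * N
    for j in range(N):
        c_hat[perm[j]] = c[j]        # c_hat = c P^-1
    syndrome = [sum(h & b for h, b in zip(row, c_hat)) % 2 for row in H]
    if any(syndrome):                # flip the single bit whose H-column matches
        columns = [[H[r][j] for r in range(3)] for j in range(N)]
        c_hat[columns.index(syndrome)] ^= 1
    m_hat = c_hat[:K]                # systematic code: message bits come first
    return mat_mul([m_hat], S_inv)[0]
```

With a seeded `random.Random` instance, `decrypt(private, encrypt(public, m, rng))` returns the 4-bit list `m`. The public matrix here is only 4 × 7; with the parameters in Fact 5 it would have 524 × 1024 entries, which illustrates the key-size disadvantage noted in Fact 6.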


14.9.6 DIGITAL SIGNATURE ALGORITHM 

The digital signature algorithm (DSA) was adopted in 1994 as a signature standard by
the U. S. Government. Its security is based on the difficulty of the discrete logarithm
problem in a large subgroup of the multiplicative group $\mathbb{Z}_p^*$. The scheme can be
viewed as a variant of the ElGamal signature scheme.

Facts: 

1. To generate keys, each user does the following:

• pick a prime p such that p − 1 has a prime factor q, where $2^{159} < q < 2^{160}$;

• select a random integer h, 1 < h < p − 1, such that $h^{(p-1)/q} \bmod p > 1$; let
$g = h^{(p-1)/q} \bmod p$;

• select a random integer x, 0 < x < q, and compute $y = g^x \bmod p$;

• the user’s public key is (p, q, g, y); the user’s private key is x.

2. The following steps make up a digital signature algorithm where A signs a message
that is to be sent to B:

• signature generation: to sign a message m of arbitrary length, A does the following:

o choose a random integer k, 0 < k < q;
o compute $k^{-1} \bmod q$;
o compute $r = (g^k \bmod p) \bmod q$;
o compute H(m), the hash of the message, using a one-way hash function H();
o compute $s = k^{-1} \cdot (H(m) + xr) \bmod q$;
o send the signature (r, s) along with message m to B.

• signature verification: B does the following:

o look up A’s public key (p, q, g, y);
o compute $w = s^{-1} \bmod q$;
o compute H(m);
o compute $u_1 = H(m)w \bmod q$ and $u_2 = rw \bmod q$;
o compute $v = ((g^{u_1} y^{u_2}) \bmod p) \bmod q$;
o accept the signature only if v = r.
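
The key generation, signing, and verification steps above can be sketched in Python (3.8 or later, for modular inverses via `pow`). The parameters p = 607 and q = 101 are toy values chosen only so that q divides p − 1; they are an assumption for illustration and offer no security. The retry when r or s is zero and the range check in `verify` are not stated in the facts above; they are added here as an assumption so that the modular inverses exist.

```python
import hashlib
import secrets

P, Q = 607, 101          # toy primes with Q | P - 1 (assumption; real q has 160 bits)

def H(m):
    """Hash of the message as an integer (SHA-1, as named in the standard)."""
    return int.from_bytes(hashlib.sha1(m).digest(), "big")

def keygen(h=2):
    g = pow(h, (P - 1) // Q, P)          # g = h^((p-1)/q) mod p
    if g <= 1:
        raise ValueError("choose another h")
    x = secrets.randbelow(Q - 1) + 1     # private key, 0 < x < q
    y = pow(g, x, P)
    return (P, Q, g, y), x

def sign(public, x, m):
    p, q, g, _ = public
    while True:
        k = secrets.randbelow(q - 1) + 1         # fresh random k, 0 < k < q
        r = pow(g, k, p) % q
        s = pow(k, -1, q) * (H(m) + x * r) % q
        if r != 0 and s != 0:                    # retry so that s is invertible
            return r, s

def verify(public, m, signature):
    p, q, g, y = public
    r, s = signature
    if not (0 < r < q and 0 < s < q):
        return False
    w = pow(s, -1, q)
    u1 = H(m) * w % q
    u2 = r * w % q
    v = pow(g, u1, p) * pow(y, u2, p) % p % q
    return v == r
```

A signature produced by `sign` always verifies, because $g^{u_1} y^{u_2} = g^{(H(m) + xr)w} = g^k \pmod p$. With a modulus this small, forged signatures would also be accepted with noticeable probability, which is why the sketch is only a walk-through of the arithmetic.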

3. The U. S. Government standard specifies that the prime p must be between 512 and
1024 bits in length; however, it is generally recommended that p be at least 768 bits in
length.

4. The parameters p, q, and g can be common to a group of users, in which case the
public key is just y.

5. The specific hash algorithm specified for use within the DSA standard is the secure
hash algorithm (SHA-1) as specified in the U. S. Federal Information Processing
Standards Publication 180-1 (FIPS 180-1: Secure Hash Standard).


14.9.7 FIAT-SHAMIR SIGNATURE SCHEME 
Definition: 

The Fiat-Shamir signature scheme is a signature scheme based on the difficulty of 
extracting square roots modulo a composite number n. 

Facts: 

1. The Fiat-Shamir signature scheme was introduced in 1986. 

2. The security of the Fiat-Shamir signature scheme is based on the difficulty of ex- 
tracting square roots modulo a composite number n, a problem that is equivalent in 
difficulty to the problem of factoring n. 

3. Key generation is done using the Fiat-Shamir scheme as follows. Integers k and t
are fixed; each user does the following:

• pick two primes p and q, and compute n = pq;

• select k random integers $s_1, s_2, \ldots, s_k$ in the interval [1, n−1], such that for each
i, $\gcd(s_i, n) = 1$;

• compute $v_i = s_i^{-2} \bmod n$, for 1 ≤ i ≤ k;

• the user’s public key is $(v_1, v_2, \ldots, v_k, n)$; the user’s private key is $(s_1, s_2, \ldots, s_k)$.

4. The Fiat-Shamir signature scheme operates as follows, where A signs a message
for B:

• signature generation: to sign a message m of arbitrary length, A does the following:

o choose random integers $r_1, r_2, \ldots, r_t$ in the range [0, n−1], and compute
$x_i = r_i^2 \bmod n$ for each i, 1 ≤ i ≤ t;

o compute $H(m, x_1, x_2, \ldots, x_t)$, where H is a one-way hash function, and use
its first kt bits as entries $e_{ij}$ of a t × k binary matrix E;
o compute $y_i = (r_i \prod_{e_{ij}=1} s_j) \bmod n$ for i = 1, 2, ..., t;

o send the signature $(y_1, y_2, \ldots, y_t, E)$ along with message m to B.

• signature verification: B does the following:
o look up A’s public key $(v_1, v_2, \ldots, v_k, n)$;
o compute $z_i = (y_i^2 \prod_{e_{ij}=1} v_j) \bmod n$ for i = 1, 2, ..., t;
o compute $h = H(m, z_1, z_2, \ldots, z_t)$;

o accept the signature only if the first kt bits of h are the same as the entries
$e_{ij}$ of E.
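
A minimal Python sketch (3.8 or later) of the key generation, signing, and verification steps above follows. Several choices are assumptions made for illustration: SHA-256 stands in for the one-way hash function H, the encoding of $(m, x_1, \ldots, x_t)$ as bytes is arbitrary, kt must not exceed 256 bits, and the caller supplies the primes p and q.

```python
import hashlib
import math
import secrets

def challenge_matrix(m, xs, k, t):
    """First k*t bits of H(m, x_1, ..., x_t), arranged as a t x k binary matrix E."""
    data = m + b"".join(b"|" + str(x).encode() for x in xs)
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    bits = [(digest >> (255 - i)) & 1 for i in range(k * t)]
    return [bits[i * k:(i + 1) * k] for i in range(t)]

def keygen(p, q, k):
    n = p * q
    s = []
    while len(s) < k:                        # k secrets s_i with gcd(s_i, n) = 1
        candidate = secrets.randbelow(n - 1) + 1
        if math.gcd(candidate, n) == 1:
            s.append(candidate)
    v = [pow(s_i, -2, n) for s_i in s]       # v_i = s_i^(-2) mod n
    return (v, n), s

def sign(public, s, m, t):
    v, n = public
    r = [secrets.randbelow(n) for _ in range(t)]
    xs = [r_i * r_i % n for r_i in r]
    E = challenge_matrix(m, xs, len(s), t)
    ys = []
    for r_i, row in zip(r, E):
        y = r_i
        for e_ij, s_j in zip(row, s):
            if e_ij:
                y = y * s_j % n              # multiply in s_j wherever e_ij = 1
        ys.append(y)
    return ys, E

def verify(public, m, signature):
    v, n = public
    ys, E = signature
    zs = []
    for y, row in zip(ys, E):
        z = y * y % n
        for e_ij, v_j in zip(row, v):
            if e_ij:
                z = z * v_j % n              # each s_j^2 * v_j cancels, so z_i = x_i
        zs.append(z)
    return challenge_matrix(m, zs, len(v), len(ys)) == E
```

For example, `keygen(2**31 - 1, 2**61 - 1, 9)` followed by `sign(public, s, m, 8)` uses two Mersenne primes as a toy modulus and gives kt = 72; a real modulus would be far larger.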

5. The Fiat-Shamir scheme is provably secure, provided that factoring is difficult and H 
is a truly random function. 

6. The modulus n should be large enough to withstand the best algorithms known for 
factoring integers; as of 1999, a size of at least 768 bits is recommended. 

7. To avoid forgeries, the parameters k and t should be chosen so that kt is at least 72.


14.9.8 KEY DISTRIBUTION
Definitions: 

Entity (identity) authentication is some form of positive corroboration of the identity
of another party in a protocol.

A key establishment mechanism provides explicit key authentication (key confirma- 
tion) if one party receives some indication that a second party whose identity has been 
corroborated actually knows the secret key established; it provides (only) implicit key 
authentication if no such indication is received, but nonetheless the identified second 
party is the only other party who could feasibly derive that key. 

Facts: 

1. The solution to the key distribution problem (§14.8.5) by symmetric techniques and 
a trusted third party has several disadvantages, two of which are the requirement and 
involvement of an on-line trusted third party, and that compromise of long-term keys 
shared between a user and the trusted party will compromise all other keys established 
for that user based on that key. 

2. If public-key cryptographic techniques are used, even in the case where each party 
computes its own private keys of (public, private) key pairs, two types of keys generally 
need to be distributed between parties: public keys (for use in algorithms such as RSA), 
and symmetric keys (for use in symmetric algorithms such as DES). 

3. Public keys can be delivered in person, but this is costly; other appropriate means 
are generally used, such as public-key certificates (§14.9.9). 

4. Key establishment mechanisms are generally used to make a symmetric key secretly 
available to two authorized parties for subsequent cryptographic use. 

5. Key establishment mechanisms may be divided into key transfer mechanisms, in 
which a key created by one party is securely transmitted to another; and key agreement 
mechanisms, whereby two parties jointly establish a shared secret key which is a function 
of information contributed by each. 

6. Two basic requirements in key establishment are the secrecy of the established key, 
and that each party learn the true identity of the other party sharing the key. 

7. Authentication may be either unilateral or mutual. 

8. The number of passes refers to the number of messages exchanged between the 
parties. 

9. In 1976, W. Diffie and M. Hellman provided the first practical solution to the key
distribution problem by presenting a key agreement protocol with the following
properties: two parties who may possibly have never met before nor shared any information
related to keys are able to establish a shared secret key by exchanging two messages
over an unsecured public channel. To set up the Diffie-Hellman key agreement where
two parties establish a secret key over a public channel, carry out the following steps:

• fix an appropriate prime p and generator $\alpha$ of $\mathbb{Z}_p^*$;

• A chooses a random secret $r_A \in \{1, 2, \ldots, p-2\}$, and sends $\alpha^{r_A} \bmod p$ to B;

• B chooses a random secret $r_B \in \{1, 2, \ldots, p-2\}$, and sends $\alpha^{r_B} \bmod p$ to A;

• B computes the shared key as $k = (\alpha^{r_A})^{r_B} \bmod p$;

• A receives $\alpha^{r_B}$ and computes $k = (\alpha^{r_B})^{r_A} \bmod p$.
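
The exchange above can be sketched in a few lines of Python. The prime p = 23 with generator 5 is a toy choice assumed for illustration only (far below the sizes recommended in this section), and the helper names are hypothetical.

```python
import secrets

P, ALPHA = 23, 5      # toy prime and a generator of Z_23^* (assumption, insecure size)

def is_generator(alpha, p):
    """Brute-force check that alpha has order p - 1; feasible only for tiny p."""
    return len({pow(alpha, e, p) for e in range(1, p)}) == p - 1

def dh_secret(p):
    """Random secret exponent in {1, 2, ..., p - 2}."""
    return secrets.randbelow(p - 2) + 1

def dh_public(secret, alpha, p):
    """Value sent over the public channel: alpha^secret mod p."""
    return pow(alpha, secret, p)

def dh_shared(received, secret, p):
    """Shared key: (received exponential)^secret mod p."""
    return pow(received, secret, p)

# One run of the two-message exchange.
r_a, r_b = dh_secret(P), dh_secret(P)
to_b = dh_public(r_a, ALPHA, P)          # A -> B
to_a = dh_public(r_b, ALPHA, P)          # B -> A
key_at_a = dh_shared(to_a, r_a, P)
key_at_b = dh_shared(to_b, r_b, P)
```

Both parties arrive at the same value because $(\alpha^{r_B})^{r_A} = (\alpha^{r_A})^{r_B} = \alpha^{r_A r_B} \pmod p$. Nothing in this sketch authenticates either party.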

10. As of 1999, it is recommended that p be at least 768 bits in length.

11. The basic mechanism provides secrecy against passive intruders but not
authentication: neither party is assured about the other’s identity, and neither obtains entity
authentication or implicit key authentication.

12. ElGamal key agreement (one-pass; B sends to A information allowing key agreement):
ElGamal’s encryption scheme (§14.9.4) is a variation of the Diffie-Hellman key
agreement protocol, and can be used for a one-pass key transfer protocol with implicit
key authentication of the intended recipient A to the originator B, as follows (assume
the same setup as in §14.9.4):

• B chooses a random integer l, 1 ≤ l ≤ p−2, and sends A the quantity $\alpha^l \bmod p$;

• B looks up A’s public key $\alpha^a$ and computes for himself the key $k = (\alpha^a)^l \bmod p$;

• A computes the same quantity upon receipt of B’s message, as $k = (\alpha^l)^a \bmod p$.

13. The recipient A has no corroboration of whom it shares the secret key with; the 
protocol does not provide entity authentication to either party. 

14. If A independently initiates an analogous protocol simultaneously with B , resulting 
in the key k' , and each party then computes K = kk' mod p, then the combined two- 
pass scheme provides key agreement with mutual implicit key authentication (but still 
provides neither entity authentication nor explicit key authentication). 

15. Both the one-pass and two-pass schemes have the advantage that public keys could 
be exchanged (for example, by including certificates) within the protocol itself without 
additional passes. 

16. The following three-pass variation of the basic Diffie-Hellman protocol allows the 
establishment of a shared secret key between two parties with mutual entity authen- 
tication and mutual explicit key authentication. This technique makes use of digital 
signatures. Set up is the same as in basic Diffie-Hellman key agreement, plus: 

• A has RSA public signature key $(e_A, n_A)$ and private key $d_A$; B has analogous
keys;

• RSA signature generation is done using an appropriate one-way hash function H
prior to exponentiation; A’s signature on m is $S_A(m) = H(m)^{d_A} \bmod n_A$;

• A and B have access to authentic copies of the other’s public signature keys.

17. Diffie-Hellman with explicit authentication (three-pass):

• A generates a secret random number $r_A \in \{1, 2, \ldots, p-2\}$ and sends to B:
$\alpha^{r_A} \bmod p$;

• B generates a secret random number $r_B \in \{1, 2, \ldots, p-2\}$, and computes the
shared key $k = (\alpha^{r_A})^{r_B} \bmod p$. B signs the concatenation of both exponentials
and the computed key, and sends to A: $\alpha^{r_B}$, $S_B(\alpha^{r_B}, \alpha^{r_A}, k)$;

• A computes the shared key $k = (\alpha^{r_B})^{r_A} \bmod p$, and verifies with B’s public key
that the message recovered on signature verification of the received message is
the hash of the following three quantities: the cleartext exponential received,
the exponential sent in the first message, and the computed key k;

• if signature verification fails, A terminates with failure; otherwise, A accepts
that k is actually shared with B, and sends to B the message: $S_A(\alpha^{r_A}, \alpha^{r_B}, k)$;

• B analogously verifies A’s signature on the received message;

• if signature verification fails, B terminates with failure; otherwise B accepts that k
is actually shared with A.
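
The three passes of Fact 17 can be sketched in Python (3.8 or later) as a single function that plays both roles. All concrete choices are assumptions for illustration: the toy Diffie-Hellman group (p = 23, generator 5), SHA-256 reduced modulo the RSA modulus as the hash H, the byte encoding of the signed triple, and the function names. RSA moduli built from Mersenne primes are used only because their primality is well known; they are not secure key sizes.

```python
import hashlib
import secrets

P, ALPHA = 23, 5      # toy Diffie-Hellman parameters (assumption, insecure size)

def rsa_keypair(p, q, e=65537):
    """Return ((e, n), d) for primes p, q; e must be coprime to (p-1)(q-1)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), d

def digest(values, n):
    """Hash of the concatenated values, reduced mod n so it can be exponentiated."""
    data = "|".join(str(v) for v in values).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def rsa_sign(values, d, n):
    return pow(digest(values, n), d, n)        # S(m) = H(m)^d mod n

def rsa_verify(values, signature, public):
    e, n = public
    return pow(signature, e, n) == digest(values, n)

def run_protocol(key_a, key_b):
    (pub_a, d_a), (pub_b, d_b) = key_a, key_b
    # Pass 1 (A -> B): A's exponential.
    r_a = secrets.randbelow(P - 2) + 1
    exp_a = pow(ALPHA, r_a, P)
    # Pass 2 (B -> A): B's exponential and B's signature over both exponentials and k.
    r_b = secrets.randbelow(P - 2) + 1
    exp_b = pow(ALPHA, r_b, P)
    k_b = pow(exp_a, r_b, P)
    sig_b = rsa_sign((exp_b, exp_a, k_b), d_b, pub_b[1])
    # A computes k and checks B's signature against its own view of the run.
    k_a = pow(exp_b, r_a, P)
    if not rsa_verify((exp_b, exp_a, k_a), sig_b, pub_b):
        raise ValueError("A terminates: B's signature did not verify")
    # Pass 3 (A -> B): A's signature; B checks it analogously.
    sig_a = rsa_sign((exp_a, exp_b, k_a), d_a, pub_a[1])
    if not rsa_verify((exp_a, exp_b, k_b), sig_a, pub_a):
        raise ValueError("B terminates: A's signature did not verify")
    return k_a, k_b
```

A run such as `run_protocol(rsa_keypair(2**89 - 1, 2**107 - 1), rsa_keypair(2**31 - 1, 2**61 - 1))` returns the same key for both parties. Because each signature covers the key k, a successful verification also confirms that the signer computed the same key.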

18. Inclusion of the key k within the hashed, signed portion of the second and third
messages provides explicit key authentication.

19. RSA can also be used in a one-pass protocol for key establishment by key transfer.
The basic protocol consists of A using B’s public encryption key, encrypting a randomly
generated key k, and sending it to B. This provides B with no authentication regarding
the source of the key, but can be modified by having the sender RSA-sign the message
using its own private signature key, before RSA-encrypting it with the intended
recipient’s public encryption key.


14.9.9 PUBLIC-KEY CERTIFICATES 
Definitions: 

A public-key certificate consists of a data part and a signature part. 

The data part consists of the name of an entity, the public key corresponding to that 
entity (for example, RSA public key), and possibly additional relevant information (for 
example, entity’s street or network address, validity period for public key, etc.). 

The signature part consists of the signature of a trusted authority, called a central 
authority or certification authority (CA), over the data part. 

Facts: 

1. The distribution of public keys is generally easier than that of symmetric keys, since
secrecy is not required. However, the integrity (authenticity) of public keys is critical. 

2. In 1979 L. Kohnfelder suggested the idea of using public-key certificates to facilitate
the distribution of public keys over unsecured channels, such that their authenticity can 
be verified. 

3. For any party B to verify the authenticity of the public key of any party A, B
must have an authentic copy of the public (signature verification) key of the CA. (For 
simplicity, assume that the authenticity of this public key is provided to party B by 
non-cryptographic means, for example, by having party B obtain it from the CA in 
person.) 

4. Given Fact 3, B can then carry out the following steps:

• acquire the public-key certificate of A over some unsecured channel, either from 

a central database of certificates, from A directly, or otherwise; 

• using the CA’s public key, verify the CA’s signature on A’s certificate; 

• if this signature verifies correctly, accept the public key in the certificate as A’s 

authentic public key; otherwise, assume the public key is invalid. 

5. Before creating a public-key certificate for a party A, the CA must take appropriate
measures to verify the identity of A and the fact that the public key to be certified actually
belongs to that party.

6. One method might be to require that A appear before the CA with a conventional 
government passport as proof of identity, and obtain A’s public key from A in person. 

7. Once the CA creates a certificate for a party, the trust that all other entities have
in the authenticity of the CA’s public key can be used transitively to gain trust in
the authenticity of that party’s public key, through acquisition and verification of the
certificate.


14.9.10 AUTHENTICATION


Definitions: 

Two important types of authentication are entity authentication (also known as
identification), which authenticates the identity of a party, and message authentication,
which authenticates the validity of a message.

Facts: 

1. One authentication method, allowing identification of one party using a 2-pass
challenge-response protocol based on a shared secret key, is called an IFF scheme
(identification, friend or foe). This terminology originates from the original use of this
technique for identifying aircraft during times of war: the challenger is a military radar
station, and the challenged entity is an aircraft, either friendly or foreign.

An entity B (the challenger ), which wishes to be able to identify a second entity A 
(the responder ), distributes a shared secret key to that entity ahead of time. 

• the challenger B sends to an unidentified entity X a time-varying number (the 

challenge ); 

• entity X receives the challenge, and replies with an answer (the response) ex- 

pected by B to be a one-way function of the challenge and the shared secret 
key; 

• if the response is that which was expected from entity A, then the challenger 

accepts X to be A; otherwise X remains unidentified. 

2. Any one-way function of the shared secret key can be used, including encryption 
with a block cipher such as DES, or an appropriate keyed one-way hash function. 
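
The two-pass challenge-response exchange can be sketched in Python using a keyed one-way hash function, one of the options Fact 2 allows. HMAC-SHA256 from the standard library is used here as an assumption; the function names and the 16-byte challenge length are hypothetical.

```python
import hashlib
import hmac
import secrets

def make_challenge():
    """Challenger B: a fresh random (hence time-varying) challenge."""
    return secrets.token_bytes(16)

def respond(shared_key, challenge):
    """Responder: a one-way function of the challenge and the shared secret key."""
    return hmac.new(shared_key, challenge, hashlib.sha256).digest()

def check_response(shared_key, challenge, response):
    """Challenger B: accept X as A only if the response is the expected one."""
    expected = hmac.new(shared_key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

The comparison uses `hmac.compare_digest` so that the check takes the same time wherever the first mismatching byte occurs.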

3. The protocol must be modified in environments where the roles of challenger and 
responder can be reversed, otherwise a challenged party, upon being challenged, can 
initiate a new protocol by reflecting the challenge back to the challenger, extracting 
the correct response from that entity, and then using that response to respond to the 
original challenge. 

4. An authentication scheme proposed by L. Guillou and J.-J. Quisquater in 1988,
known as the GQ scheme, allows identification of one party and is based on public-key
techniques. It is an optimization, with respect to number of messages and memory
requirements, of an earlier scheme of A. Fiat and A. Shamir (see the Fiat-Shamir
signature scheme discussed in §14.9.7). The GQ scheme involves three messages between
entities A and B, where A is the prover (the entity whose identity is to be corroborated)
and B is the verifier (or challenger). It was designed with the specific application in
mind where the prover is a processor such as a “smart card” (an integrated circuit
mounted on a credit card) with limited processing power and memory. This scheme is
set up as follows:

• a trusted authority C randomly selects two appropriate primes p and q as in RSA,
and computes n = pq;

• C defines as the public exponent an integer v coprime to $\varphi(n)$;

• the values n and v are made public; C keeps p and q secret;

• each entity X has a unique identity $I_X$ from which an integer $J_X < n$ is derived
using publicly known redundancy rules;

• for each integer $J = J_X$, C computes $U = J^{1/v} \bmod n$, and gives to entity X
the secret $W = U^{-1} = J^{-(1/v)} \bmod n$. (Note: $J \cdot W^v \equiv 1 \pmod n$.)

The GQ identification scheme operates as follows to provide unilateral identification of
prover A to verifier B:

• entity A with identifier $I_A$ selects a random integer r, 1 ≤ r ≤ n−1, and computes
the initial witness $T = r^v \bmod n$;

• A sends to B the pair of integers $(T, I_A)$;

• B selects a random challenge d, 0 ≤ d ≤ v−1, and sends d to A;

• A computes the response $t = r \cdot W^d \bmod n$, and sends t to B;

• B receives t, computes $J^d \cdot t^v \bmod n$, and accepts A’s identity as authentic if this
quantity is equal to T (which it will be if A carried out the protocol properly).

5. For security reasons: a new random value r must be chosen each time the GQ 
identification scheme is run and the prover must respond to only one challenge d for 
each initial witness T . 

6. A fraudulent prover can defeat the GQ identification scheme with a 1 in v chance
by guessing d correctly a priori (and then forming $T = J^d \cdot t^v \bmod n$ as the verifier
would). Thus the recommended bit length of v depends on the environment under which
attacks could be mounted. For a fraudulent prover who must participate locally and is
subject to being apprehended in person upon failure, 8 to 16 bits may suffice; if remote
attacks are possible, for example, by telecommunications linkups, 30 or more bits may
be required.

7. Extracting vth roots modulo n appears necessary to defeat the GQ identification
scheme, and this is believed to be intractable in the absence of knowledge of the
factorization of n.

8. The security of the GQ identification scheme relies on the fact that a fraudulent
prover has only a 1 in v chance of guessing d correctly a priori (and then forming
$T = J^d \cdot t^v \bmod n$ as the verifier would); and it can be shown that if a fraudulent prover
is able to correctly respond to two different challenges for the same initial witness, then
that prover can recover W, i.e., compute a vth root modulo n, which is believed to be
infeasible unless the factorization of n is known.

9. The following algorithm computes a short quantity called a message authentication 
code (MAC), which can be appended to a message as a data integrity mechanism to 
allow the receiver to verify that the message has not been altered by an unauthorized 
party. In addition, this provides a type of symmetric data origin authentication — the 
identity of the party which originated the message can be implicitly verified. MAC 
algorithms such as this have been used in the financial services industry for over 15 
years. The algorithm is set up as follows: 

• let w be the required bit length of the MAC; 

• select a fixed n-bit block cipher algorithm E (for example, DES, yielding n = 64),
such that w ≤ n;

• the originator of the message and the intended recipient must share a secret key 

k for the block cipher E. 

The CBC-based MAC scheme operates as follows to append a keyed checksum to mes- 
sage m for data integrity. The originator generates the MAC as follows: 

• a single 1-bit is appended to the message m (this allows unambiguous recovery
of the original message even after padding as outlined below);

• the augmented message is broken into n-bit blocks $m_i$; the last block is padded
by zero or more 0-bits as required to fill it completely; label the resulting blocks
$x_1, \ldots, x_t$;

• the MAC is defined to be the leftmost w bits of the value $c_t$ computed by the
following sequence of computations:

$c_1 = E_k(x_1)$; $c_i = E_k(x_i \oplus c_{i-1})$, 2 ≤ i ≤ t;

• the message m, along with the MAC, is sent to the recipient in such a format
that the recipient can separate the MAC from the message; the padding bits
may or may not be sent along with the message (as agreed by the two parties).

The MAC verification by recipient is carried out as follows:

• the recipient receives the message m, adds padding bits as the sender did (if these
were not transmitted along with m), and computes a MAC on the message using
the shared key k, as outlined above;

• the recipient compares the computed MAC to the received MAC; if they agree,
the recipient is satisfied that the message was not altered during transit (and
that it originated from the party with whom the key k is shared).
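
The CBC-based MAC computation can be sketched in Python as follows. Because the standard library contains no block cipher, `toy_encrypt` is a hypothetical 4-round Feistel construction used purely as a stand-in for E with n = 64; it is an assumption of this sketch, it is not DES, and it is not secure. Messages are assumed to be whole bytes, so the appended 1-bit and the following 0-bits are written as the byte 0x80 followed by zero bytes.

```python
import hashlib
import hmac

BLOCK = 8      # n = 64 bits

def toy_encrypt(key, block):
    """Hypothetical stand-in for a 64-bit block cipher E_k (4-round Feistel, NOT secure)."""
    left, right = block[:4], block[4:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:4]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right

def pad(message):
    """Append a single 1-bit, then 0-bits, and split into n-bit blocks x_1 .. x_t."""
    data = message + b"\x80"
    data += b"\x00" * (-len(data) % BLOCK)
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def cbc_mac(key, message, w_bytes=4):
    """Leftmost w bits of c_t, where c_1 = E_k(x_1) and c_i = E_k(x_i XOR c_(i-1))."""
    c = bytes(BLOCK)                       # all-zero start, so c_1 = E_k(x_1)
    for x in pad(message):
        c = toy_encrypt(key, bytes(a ^ b for a, b in zip(x, c)))
    return c[:w_bytes]

def verify_mac(key, message, tag):
    """Recipient: recompute the MAC with the shared key and compare."""
    return hmac.compare_digest(cbc_mac(key, message, len(tag)), tag)
```

The sender transmits the message together with `cbc_mac(key, message)`; the recipient calls `verify_mac` with the same shared key.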

10. The strength of the MAC algorithm depends on the secrecy and bit length of the 
key k, the strength of the block cipher E, and the bit length w of the MAC. 

11. As an option in the MAC construction, the last block $c_t$ can be subjected to
additional processing to make the algorithm more resistant to certain types of attacks.

12. Any digital signature scheme, and, in particular, public key schemes such as RSA, 
can also be used to provide message authentication. 


14.9.11 SECRET SHARING 
Definitions: 

A scheme whereby a secret datum S can be divided up into n pieces, in such a way
that knowledge of any k or more of the n pieces allows S to be easily recovered, but
knowledge of k−1 or fewer pieces provides no information about S whatsoever (that
is, no more information than 0 pieces) is called a (k, n) threshold scheme, and is an
instance of a more general class of techniques known as secret sharing schemes.


Facts: 

1. Threshold schemes were introduced by A. Shamir in 1979. 

2. Shamir’s secret sharing scheme: To set up Shamir’s Secret Sharing Scheme, first 
do the following: 

• define an upper bound $S_{\max}$ on any secret number S to be shared;

• define an upper bound $n_{\max}$ on the number of participants;

• select a prime number p which exceeds both $n_{\max}$ and $S_{\max}$.

To use the scheme to split a secret S so that any k of n users can recover it, the following
steps are used:

• splitting up the secret: a trusted party does the following:

o obtain a secret number S to be shared, $S \le S_{\max}$, and define $a_0 = S$;
o define the number of active participants to be $n \le n_{\max}$ (additional active
participants can easily be added later);
o define a recovery threshold k ≤ n;

o select k−1 random integers $a_i$, 1 ≤ i ≤ k−1, from [0, p−1], and consider the
polynomial $f(x) = \sum_{i=0}^{k-1} a_i x^i$ of degree at most k−1;

o pick n distinct values $x_j$, 1 ≤ j ≤ n (for example, $x_j = j$), and compute
$S_j = f(x_j) \bmod p$ ($S_j$ is a share);

o give the point $(x_j, S_j)$ to participant j; $x_j$ can be made public, but the share
$S_j$ should be a secret revealed only to participant j;

• to recover the secret: any k of n active participants do the following (without
loss of generality, label these as participants 1 through k):
o pool their k shares $(x_j, S_j)$, 1 ≤ j ≤ k, allowing the recovery of the coefficients
of f(x) by polynomial interpolation;
o from f(x), compute the secret S by evaluating f(0).
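
The splitting and recovery steps can be sketched in Python (3.8 or later) as follows. The function names are hypothetical, and the recovery step is a shortcut: rather than recovering every coefficient of f(x), it evaluates the Lagrange interpolation formula directly at x = 0, which yields the same value f(0).

```python
import secrets

def split(secret, k, n, p):
    """Return n shares (x_j, S_j) of `secret`, any k of which suffice; p is a prime."""
    if not (secret < p and k <= n < p):
        raise ValueError("need secret < p and k <= n < p")
    coeffs = [secret] + [secrets.randbelow(p) for _ in range(k - 1)]   # a_0 = S

    def f(x):
        acc = 0
        for a in reversed(coeffs):         # Horner's rule
            acc = (acc * x + a) % p
        return acc

    return [(j, f(j)) for j in range(1, n + 1)]    # x_j = j

def recover(shares, p):
    """Evaluate the interpolating polynomial at 0 from k pooled shares."""
    secret = 0
    for j, (x_j, s_j) in enumerate(shares):
        num, den = 1, 1
        for m, (x_m, _) in enumerate(shares):
            if m != j:
                num = num * (-x_m) % p             # factor (0 - x_m)
                den = den * (x_j - x_m) % p        # factor (x_j - x_m)
        secret = (secret + s_j * num * pow(den, -1, p)) % p
    return secret
```

For example, with the prime `p = 2**61 - 1`, `split(123456789, 3, 5, p)` produces five shares, and `recover` applied to any three of them returns 123456789.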

3. Distributing one piece or share to each of n parties yields a method of distributing
trust in a secret (such as a cryptographic key) jointly among any k-subset of them. The
built-in redundancy of the secret sharing scheme also provides reliability: loss of any
number of shares that leaves k or more shares remaining does not result in overall loss
of the secret.

4. Shamir’s Secret Sharing scheme is based on polynomial interpolation and the fact
that a polynomial y = f(x) of degree k−1 is uniquely defined by any k points $(x_i, y_i)$ for
distinct $x_i$, where $f(x_i) = y_i$. By construction, f(0) = S, that is, S is the y-intercept
of the graph y = f(x). No partial information regarding S is obtained from any k−1
(or fewer) shares because given k−1 shares, a kth point is needed to uniquely determine
f(x), and each of the p candidate points (0, S) for S in {0, 1, ..., p−1} defines a
different (equally probable) polynomial. Polynomial evaluation can be done by Horner’s
rule in k multiplications and k additions. Polynomial interpolation can be done using
either Lagrange’s formula or Newton’s formula, with the greatest computational cost
being $O(k^2)$ multiplications or divisions.


14.9.12 HASH FUNCTIONS 
Definitions: 

A hash function h maps arbitrary length bit strings to small fixed-length (for exam- 
ple, 64 or 128 bit) outputs called hash-values. (See §17.4.1 for a discussion of hash 
functions in a more general setting.) 

A collision is a pair of bit strings mapped to the same output by a hash function. 

The following are common properties a hash function h may have: 

• preimage-resistance: given any y in the range of h (for which a corresponding
input is not known), it should be computationally infeasible to find any
preimage x* such that h(x*) = y;

• weak collision-resistance: given any one input x, it should be computationally
infeasible to find a second preimage $x^* \ne x$ such that h(x) = h(x*);

• strong collision-resistance: it should be computationally infeasible to find
any two distinct inputs, x and x*, such that h(x) = h(x*).

A one-way hash function (OWHF) is a function h that maps arbitrary length
inputs to fixed length outputs, and has the properties of preimage-resistance and weak
collision-resistance.

A collision-resistant hash function (CRHF) is a function h that maps arbitrary
length inputs to fixed length outputs, and has the property of strong collision-resistance.

An n-bit hash function is said to have ideal security if the following properties hold:

• given a hash output, producing a preimage, or producing a second preimage given a
first, requires approximately $2^n$ operations;

• producing a collision requires approximately $2^{n/2}$ operations.


Facts: 

1. The basic idea is that a hash-value serves as a compact representative image
(sometimes called a digital fingerprint, or message digest) of the input string, and can be
used as if it were uniquely identifiable with that string.

2. The problem of checking the integrity of the potentially large original input is
reduced to verifying that of a small, fixed-size hash-value.

3. A hash-value should be uniquely identifiable with a single input in practice, and 
collisions should be computationally difficult to find. 

4. While the utility of hash functions is widespread, the most common cryptographic
uses are with digital signatures and for data integrity. 

5. Regarding digital signatures, long messages are typically hashed first, and then
the hash-value is signed rather than signing individual blocks of the original message. 
Advantages of this over signing the individual blocks of the original message directly 
include efficiency with respect to both time and space. 

6. Regarding data integrity, hash functions together with appropriate additional tech- 
niques can be used to verify the integrity of data. Specific integrity applications include 
virus protection and software distribution. 

7. MACs (§14.9.10) are a special class of hash functions, which take in addition to
message input a secret key as a second input, allowing for the verification of both data 
integrity and data origin authentication. 

8. Given a hash function h and an input x, h(x) should be easy to compute. 

9. The complete specification of h is usually assumed to be publicly available.

10. Collision-resistance is required for applications such as digital signatures and data 
integrity, otherwise an adversary might find two messages, x and x' , that have the same 
hash-value, obtain a signature on x, and claim it as a signature on x' . 

11. Depending on the intended application and the susceptibility of the environment
to certain attacks, weak or strong collision-resistance may be required. 

12. There are no known instances of functions that have been proven to be one-way,
that is, for which it can be proven (without assumptions) that finding a preimage is
difficult. However, it would be most surprising if such functions indeed did not exist.
All instances of “one-way functions” given to date should thus properly be qualified as
“conjectured” or “candidate” one-way functions.

13. Most hash functions process fixed-size blocks of the input iteratively as follows:

• A prespecified starting value or initializing value (IV) is defined.

• The hash input $x = x_1 x_2 \ldots x_t$ of arbitrary finite length is divided into
fixed-length n-bit blocks $x_i$. This preprocessing typically involves appending extra
bits (padding) as necessary to extend the input to an overall bit length that
is a multiple of the block length n. The padding also often includes a partial
block indicating the bit length of the unpadded input.

• Each block $x_i$ is then used as input to a simpler hash function f called an m-bit
compression function, which computes a new intermediate result of some fixed
bit length m as a function of the previous m-bit intermediate result (initially
the IV) and the block $x_i$. Letting $H_i$ denote the partial result after the ith
stage, the hash h(x) of an input $x = x_1 x_2 \ldots x_t$ is defined as follows:

$H_0 = IV$; $H_i = f(H_{i-1}, x_i)$, 1 ≤ i ≤ t; $h(x) = H_t$.

• $H_{i-1}$ serves as the chaining variable between stages i−1 and i.

14. Particular hash functions differ in the nature of the compression function and 
preprocessing of the input. 


Examples: 

1. A typical usage for data integrity is as follows: 

• the hash-value corresponding to a particular input is computed at some point in
time;

• the integrity of this hash-value is then protected in some manner;

• at a subsequent point in time, to verify the input data has not been altered, the
hash-value is recomputed, using purportedly the same input, and compared for
equality with the original hash-value.

2. Matyas-Meyer-Oseas hash function: Let $E$ be an $n$-bit block cipher, such as DES,
parametrized by a symmetric key $k$, and let $g$ be a function that maps an $n$-bit string
to a key $k$ suitable for $E$. Fix an initial value IV. The following algorithm is then an
$n$-bit hash function which, given any input string $x$, outputs an $n$-bit hash $h(x)$:

• divide $x$ into $n$-bit blocks and pad if necessary by some method such that all
blocks are complete, yielding a padded message of $t$ $n$-bit blocks $x_1 x_2 \cdots x_t$;

• define $h(x) = H_t$ where:

    $H_0 = IV$;  $H_i = E_{g(H_{i-1})}(x_i) \oplus x_i$, $1 \le i \le t$.

This is believed to be a one-way hash function requiring $2^n$ operations to find a preimage,
and $2^{n/2}$ operations to find a collision. For underlying ciphers, such as DES, which have
relatively small blocklength (for example, with blocks of no more than 64 bits), this
is not a collision-resistant hash function since $2^{32}$ operations is well within current
computational capability.
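
The Matyas-Meyer-Oseas recurrence can be illustrated with a small sketch. Everything cipher-specific here is an assumption made for illustration: `toy_encrypt` is a hypothetical 32-bit Feistel cipher (not DES, and not secure), `g` is taken to be the identity map, and the input is assumed to be already split into complete 32-bit blocks.

```python
MASK16 = 0xFFFF

def toy_encrypt(key: int, block: int) -> int:
    """Hypothetical 4-round Feistel cipher on 32-bit blocks (NOT secure)."""
    left, right = block >> 16, block & MASK16
    for r in range(4):
        round_key = (key >> (8 * r)) & MASK16
        f_out = ((right ^ round_key) * 0x9E37 + r) & MASK16
        left, right = right, left ^ f_out
    return (left << 16) | right

def g(chaining_value: int) -> int:
    """Map an n-bit chaining value to a key; the identity is assumed here."""
    return chaining_value

def mmo_hash(blocks: list[int], iv: int) -> int:
    """H_0 = IV, H_i = E_{g(H_{i-1})}(x_i) XOR x_i; returns H_t."""
    h = iv
    for x in blocks:
        h = toy_encrypt(g(h), x) ^ x
    return h

blocks = [0x00010203, 0x04050607, 0x08090A0B]
print(hex(mmo_hash(blocks, iv=0x12345678)))
```

With a 32-bit block, as in this toy, a collision is expected after roughly $2^{16}$ trials, which shows in miniature why a small blocklength rules out collision resistance.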


© 2000 by CRC Press LLC 



REFERENCES 


Printed Resources: 

[Ba97] J. Baylis, Error- Correcting Codes: A Mathematical Introduction , Chapman & 
Hall, 1997. 

[BePi82] H. Beker and F. Piper, Cipher Systems: The Protection of Communications, 
Wiley, 1982. 

[Be84] E. R. Berlekamp, Algebraic Coding Theory (Revised 1984 ed.), Aegean Press, 
1984. 

[B183] R. E. Blahut, Theory and Practice of Error Control Codes , Addison- Wesley, 1983. 

[Br88] G. Brassard, Modern Cryptology: a Tutorial , Springer- Verlag, 1988. 

[DaPr89] D. W. Davies and W. L. Price, Security for Computer Networks, Wiley, 1989. 

[De83] D. E. Denning, Cryptography and Data Security, Addison- Wesley, 1983. 

[DiHe76] W. Diffie and M. E. Heilman, “Multiuser cryptographic techniques”, Proceed- 
ings of AFIPS National Computer Conference, 109-112, 1976. 

[Gi98] J. Gilmore, editor, Cracking DES: Secrets of Encryption Research, Wiretap Pol- 
itics & Chip Design, O’Reilly, 1998. 

[GoPi91] C. M. Goldie and R. G. E. Pinch, Communication Theory, Cambridge Uni- 
versity Press, 1991. 

[Ha80] R. W. Hamming, Coding and Information Theory, Prentice-Hall, 1980. 

[HaHaJo97] D. Hankerson, G. A. Harris, and P. D. Johnson, Jr., Introduction to Infor- 
mation Theory and Data Compression, CRC Press, 1997. 

[HeWi99] C. Heegard and S. B. Wicker, Turbo Coding, Kluwer, 1999. 

[Hi86] R. Hill, A First Course in Coding Theory, Oxford, 1986. 

[HoEtal92] D. G. Hoffman, D. A. Leonard, C. C. Lindner, K. T. Phelps, C. A. Rodger 
and J. R. Wall, Coding Theory: The Essentials, Marcel Dekker, 1992. 

[Ko94] N. Koblitz, A Course in Number Theory and Cryptography , second edition, 
Springer- Verlag, 1987. 

[LiCo83] S. Lin and D. J. Costello, Jr., Error Control Coding: Fundamentals and Ap- 
plications, Prentice-Hall, 1983. 

[MaS177] F. J. Mac Williams and N. J. A. Sloane, The Theory of Error-Correcting Codes, 
North-Holland, 1977. 

[Mc77] R. J. McEliece, The Theory of Information and Coding, Addison- Wesley, 1977. 

[MevaVa96] A. J. Menezes, P. C. van Oorschot and S. A. Vanstone, Handbook of Applied 
Cryptography, CRC Press, 1996. 

[PeWe72] W. W. Peterson and E. J. Weldon, Jr., Error-Correcting Codes, MIT Press, 
1972. 


© 2000 by CRC Press LLC 



[Pi88] P. Piret, Convolutional Codes , MIT Press, 1988. 

[P189] V. Pless, Introduction to the Theory of Error-Correcting Codes, Wiley, 1989. 

[PlHuBr98] V. Pless, W. C. Huffman, and R. A. Brualdi, Handbook of Coding Theory , 
Elsevier, 1998. 

[Pr92] O. Pretzel, Error-Correcting Codes and Finite Fields , Clarendon Press, 1992. 

[Re94] F. M. Reza An Introduction to Information Theory, Dover, 1994. 

[Ro96] S. Roman, Introduction to Coding and Information Theory, Springer-Verlag, 
1996. 

[Ru86] R. A. Rueppel, Analysis and Design of Stream Ciphers, Springer-Verlag, 1986. 

[Sa90] A. Salomaa, Public-Key Cryptography, Springer-Verlag, 1990. 

[Sc96] B. Schneier, Applied Cryptography: Protocols, Algorithms, and Source Code 
in C, 2nd ed., Wiley, 1996. 

[SePi89] J. Seberry and J. Pieprzyk, Cryptography: An Introduction to Data Security, 
Prentice-Hall, 1989. 

[Si93] G. J. Simmons, ed., Contemporary Cryptology: the Science of Information In- 
tegrity, IEEE Press, 1992. 

[St95] D. Stinson, Cryptography: Theory and Practice, CRC Press, 1995. 

[va90] J. H. van Lint, “Algebraic geometric codes”, Coding Theory and Design Theory, 
Part 1, Springer-Verlag, 1990, 137-162. 

[va99] J. H. van Lint, Introduction to Coding Theory, third edition, Springer-Verlag, 
1999. 

[Vava89] S. A. Vanstone and P. C. van Oorschot, An Introduction to Error Correcting 
Codes with Applications, Kluwer Academic Publishers, 1989. 

[We98] R. B. Wells, Applied Coding and Information Theory for Engineers, Prentice- 
Hall, 1998. 

[We88] D. Welsh, Codes and Cryptography, Clarendon Press, 1988. 

Web Resources: 

ftp://ftp.funet.fi/pub/crypt (Cryptographic Software Archive.) 

http://imailab.iis.u-tokyo.ac.jp/~robert/codes.html (The Error Correcting 
Codes (ECC) Home Page.) 

http ://rschp2. anu.edu.au: 8080/cipher. html (Some Classic Ciphers and Their 
Weaknesses.) 

http : //www. achiever . com/freehmpg/ cryptology/crypto .html (A-Z Cryptology.) 

http://www.cacr.math.uwaterloo.ca/hac/ (Handbook of Applied Cryptography 
Site.) 

http://www.certicom.com (Certicom Corporation.) 


© 2000 by CRC Press LLC 


http://www.cs.hut.fi/ssh/crypto/intro.html (Introduction to Cryptography.) 
http : / / www . entrust . com (Entrust Technologies.) 
http://www.ntru.com (NTRU Cryptosystems.) 
http://www.rsa.com/rsalabs/ (RSA Laboratories.) 

http://www331.jpl.nasa.gov/public/JPLtcodes.html (JPL Turbo Codes Page.) 





DISCRETE OPTIMIZATION


15.1 Linear Programming                         Beth Novick
15.1.1 Basic Concepts
15.1.2 Tableaus
15.1.3 Simplex Method
15.1.4 Interior Point Methods
15.1.5 Duality
15.1.6 Sensitivity Analysis
15.1.7 Goal Programming
15.1.8 Integer Programming

15.2 Location Theory                            S. Louis Hakimi
15.2.1 p-Median and p-Center Problems
15.2.2 p-Medians and p-Centers on Networks
15.2.3 Algorithms for Location on Networks
15.2.4 Capacitated Location Problems
15.2.5 Facilities in the Plane
15.2.6 Obnoxious Facilities
15.2.7 Equitable Locations

15.3 Packing and Covering                       Sunil Chopra and David Simchi-Levi
15.3.1 Knapsacks
15.3.2 Bin Packing
15.3.3 Set Covering and Partitioning

15.4 Activity Nets                              S. E. Elmaghraby
15.4.1 Deterministic Activity Nets
15.4.2 Probabilistic Activity Nets
15.4.3 Complexity Issues

15.5 Game Theory                                Michael Mesterton-Gibbons
15.5.1 Noncooperative Games
15.5.2 Matrix and Bimatrix Games
15.5.3 Characteristic-Function Games
15.5.4 Applications

15.6 Sperner’s Lemma and Fixed Points           Joseph R. Barr
15.6.1 Sperner’s Lemma
15.6.2 Fixed-Point Theorems




INTRODUCTION 


This chapter discusses various topics in discrete optimization, especially those that arise 
in applying operations research techniques to applied problems. Linear programming 
provides a fundamental operations research tool for studying, formulating, and solving 
a number of combinatorial optimization problems — either exactly or approximately. 
For example, linear programming is an important tool in solving packing and covering 
problems, in which a given resource must be optimally utilized subject to constraints. 
Location theory studies the optimal placement of facilities in order to service a finite 
number of customers on a network or in the plane. Activity networks are commonly used 
in the planning and scheduling of interrelated activities to complete a project; in this 
case the completion time, resources used, and total cost are important considerations. 
Game theory is a discipline with applications to many areas, in which several agents 
compete or cooperate to maximize their respective gains. Fixed-point theorems have 
applications to economics, nonlinear optimization, and game theory. 


GLOSSARY 

active constraint: an inequality satisfied with equality by a given vector.

balanced matrix: a 0-1 matrix having no square submatrix of odd order with exactly
two 1s in each row and column.

basic feasible solution (of an LP): a basic solution that is also a feasible solution. 

basic solution (of an LP): a solution obtained by setting certain nonbasic variables 
to zero and solving for the remaining basic variables. 

bin packing problem: an optimization problem in which a given set of items is to
be packed using the fewest bins.

bounded LP: a linear programming problem having a finite optimal solution. 

capacitated location problem: a location problem in which bounds are placed on 
the amount of demand that can be handled by individual facilities. 

p-center: a set of p locations for facilities that minimizes the maximum distance from 
any demand point to its closest facility. 

characteristic function: a mapping from the set of all coalitions to the nonnegative 
real numbers. 

characteristic-function game: a model for distributing a cooperative benefit fairly
among players when the concept of fairness is based on the bargaining strengths of 
coalitions that could form if the players had not already agreed to cooperate. 

coalition: any subset of the players in a game. 

complete information: a situation arising when a game’s structure is known to all 
players. 

convex hull: the smallest convex set containing a given set of points. 

convex set: a set containing the line segment joining any two of its points. 

CPM model: a deterministic activity net with strict precedence among the activities. 

critical path: a sequence of activities that determines the completion time of a project. 

criticality index: the probability that a given path (activity) is (lies on) a critical 
path of a project. 





cutting plane: a constraint that can be added to an existing set of constraints without
excluding any feasible integer solution.

decision variables: the unknowns in an optimization problem. 

demand point: a point in a metric space that is a source of demand for the service 
provided by the facilities. 

deterministic activity net: a directed network in which all the parameters (such as 
duration, resource requirements, precedence) are known deterministically. 

dual LP: a minimization LP problem associated with a given maximization LP prob- 
lem. 

equilibrium: a strategy combination from which no player has a unilateral incentive 
to depart. 

facility: a place where a service (or product) is provided. 

facility location: a point in a metric space where a facility is located. 

feasible direction: a direction that preserves feasibility in a sufficiently small neigh- 
borhood of a given feasible solution. 

feasible LP: an LP with a nonempty feasible region.

feasible region: the set of all feasible solutions to a given LP.

feasible solution: a vector that satisfies the given set of constraints.

fixed point (of a function): given a function $f$, a point $x$ such that $f(x) = x$.

float: in a deterministic activity net, a measure of the flexibility available in scheduling 
an activity without delaying the project completion time. 

c-game: a characteristic-function game. 

GAN model: a probabilistic activity net with conditional progress and probabilistic 
realization of activities. 

general position: a set of points $x_1, x_2, \dots, x_{p+1} \in \mathcal{R}^n$ such that the vectors
$x_2 - x_1, x_3 - x_1, \dots, x_{p+1} - x_1$ are linearly independent.

GERT model: a probabilistic activity net with exclusive-or branching.

goal programming (GP) problem: an LP having multiple objective functions.

improving direction: a feasible direction that improves the objective function value.

imputation: a distribution among players of the cooperative benefit in a c-game.

infeasible LP: an LP with an empty feasible region.

integer programming (IP) problem: a linear programming problem in which some 
of the decision variables are required to be integers. 

interior point method: a technique for solving an LP that iteratively moves through 
the interior of the feasible region. 

knapsack problem: an optimization problem in which items are to be selected to
maximize the total benefit without exceeding the capacity of the knapsack.

linear programming (LP) problem: an optimization problem involving the selection
of decision variables that optimize a given linear function and that satisfy linear
inequality constraints.

location problem: an optimization problem in which p facilities are to be established
to minimize the cost of meeting known demands arising from n locations.

LP relaxation: the linear programming problem obtained by dropping the integrality
requirements of an IP.

p-median: a set of p locations for facilities that minimizes the total (transportation)
cost of satisfying all demands.

metric space: a set of points on which a distance function has been defined. 

mixed strategy: a probability distribution over a set of pure strategies.

noncooperative game: a mathematical model of strategic behavior in the absence 
of binding agreements. 

normalized characteristic function: a mapping from the set of all coalitions to
[0, 1].

nucleolus: a c-game solution concept based on minimizing the dissatisfaction of the 
most dissatisfied coalitions. 

objective function: the function associated with a given optimization problem that 
is to be maximized or minimized. 

optimal solution: a feasible solution to an optimization problem achieving the largest 
(or perhaps smallest) value of the objective function. 

packing: a subset of items from a given list that can be placed in a bin of specified 
capacity. 

payoff function: a mapping from the set of feasible strategy combinations to $\mathcal{R}^n$,
where $n$ is the number of players.

perfect information: a situation arising when the history of a game is known to all 
players. 

PERT model: a probabilistic activity net with strict precedence and activity dura- 
tions that are known only in probability. 

pivot: a move from a given basic solution of an LP to one differing in only one active 
constraint. 

players: a collection of interacting decisionmakers. 

polyhedron: the set of points satisfying a given finite set of linear inequalities. 

probabilistic activity net: a directed network in which some or all of the parameters, 
including the realization of the activities, are probabilistically known. 

pure strategy: a plan of action available to a player. 

reduced cost: the unit change in the objective function incurred by increasing the 
value of a given decision variable. 

redundant constraint: a constraint that can be removed from a given set of constraints
without changing the set of feasible solutions.

set cover: a family of subsets such that each of a specified list of elements is contained 
in at least one subset. 

set covering problem: an optimization problem in which a minimum cost set cover 
is needed. 

set partition: a family of subsets such that each of a specified list of elements is 
contained in exactly one subset. 

set partitioning problem: an optimization problem in which a minimum cost set
partition is needed.

Shapley value: a c-game solution concept based on players’ marginal worths to coalitions
on joining, assuming all orders of formation are equally likely.

p-simplex: the convex hull of a collection of p + 1 points in general position.

simplex method: a technique for solving an LP that moves from vertex to neighboring 
vertex along the boundary of the feasible region. 

simplicial subdivision (of a simplex): a decomposition of the simplex into a collection 
of simplices that intersect only along entire common faces. 

slack variables: the components of $b - Ax^*$ where $x^*$ is a feasible solution to an LP
with constraints $Ax \le b$, $x \ge 0$.

solution: an equilibrium or set of equilibria in a noncooperative game, or an imputa- 
tion or set of imputations in a c-game. 

strategic behavior: behavior such that the outcome of an individual’s actions de- 
pends on actions yet to be taken by others. 

strategy combination: a vector of strategies, one for each player. 

tableau: a table storing all information pertinent to a given basic solution for an LP. 

totally unimodular matrix: a 0-1 matrix such that every square submatrix has
determinant 0, +1, or -1.

unbounded LP: a linear programming problem that is not bounded. 

vertex (of a feasible region): given a feasible region $S$, a point $x \in S \subseteq \mathcal{R}^n$ defined by
the intersection of exactly $n$ linearly independent constraints.


15.1 LINEAR PROGRAMMING 

Linear programming involves the optimization of a linear function under linear inequal- 
ity constraints. Applications of this model are widespread, including problems arising in 
marketing, finance, inventory, capital budgeting, computer science, transportation, and 
production. Algorithms are available that, in practice, solve LP problems efficiently. 


15.1.1 BASIC CONCEPTS 
Definitions: 

A linear programming (LP) problem is an optimization problem that can be written

    maximize:    $cx$
    subject to:  $Ax \le b$                    (1)

where $A$ is a given $q \times n$ matrix, $c$ is a given row vector of length $n$, and $b$ is a given
column vector of length $q$. The decision variables of problem (1) are represented by
the column vector $x$ of length $n$.

A feasible solution is a vector $x$ satisfying $Ax \le b$. The feasible region is the subset
of all feasible solutions in $\mathcal{R}^n$. If no feasible solution exists (so that the feasible region
is empty), the LP problem is infeasible; otherwise it is feasible.

Each of the $q$ inequalities in $Ax \le b$ is a constraint. A constraint is redundant if
removing it from (1) doesn’t change the feasible region.

For a feasible solution $x$, the function $z = cx$ is the objective function, with $cx$ the
objective value of $x$. When the objective value $z^* = cx^*$ is also maximum, then the
feasible $x^*$ is an optimal solution. If the objective value can be made arbitrarily large
over the feasible region, the LP problem is unbounded. Otherwise it is bounded.

A vector $y$ is a feasible direction at $x^*$ if there is some $r > 0$ such that $A(x^* + \lambda y) \le b$
for all $0 \le \lambda \le r$. If $cy > 0$ also holds, then $y$ is an improving direction.

A constraint of the system $Ax \le b$ that is satisfied with equality by a feasible solution $x^*$
is active at $x^*$.

A set of constraints $\{ a_i x \le b_i \mid i = 1, 2, \dots, k \}$ is linearly independent if the vectors
$\{ a_1, a_2, \dots, a_k \}$ are linearly independent (see §6.1.3).

A vertex is a feasible solution with n linearly independent active constraints. A vertex 
with more than n active constraints is degenerate. An LP problem with a degenerate 
vertex is degenerate. 

A set $S$ is convex if the line segment joining any two of its points is contained in $S$:
i.e., for all $x, y \in S$ and $0 \le \lambda \le 1$, $\lambda x + (1 - \lambda) y \in S$.

Let $L$ be the line segment connecting the two vertices $x^1$ and $x^2$. Then $x^1$ and $x^2$ are
adjacent if for all points $y \ne x^1, x^2$ on $L$ and all feasible $y^1$ and $y^2$, the only way $y$ can
equal $\frac{1}{2} y^1 + \frac{1}{2} y^2$ is if $y^1$ and $y^2$ are also on $L$. In this case, $L$ is an edge.


Facts: 

1. Linear programming models arise in a wide variety of applications, which typically 
involve the allocation of scarce resources in the best possible way. A sample of such 
application areas, with reference sources, is given in the following table. 


application                                    references
production scheduling and inventory control    [Ch83], [Ga85]
tanker scheduling                              [BaJaSh90]
airline scheduling                             [Ga85]
cutting stock problems                         [BaJaSh90], [Ch83]
workforce planning                             [Ga85]
approximation of data                          [Ch83]
matrix games                                   [Ch83]
blending problems                              [BaJaSh90]
petroleum refining                             [Ga85]
capital budgeting                              [BaJaSh90]
military operations                            [Ga85]
land use planning                              [Ga85]
agriculture                                    [Ga85]
banking and finance                            [Ga85]
environmental economics                        [Ga85]
health care                                    [Ga85]
marketing                                      [Ga85]
public policy                                  [Ga85]

2. The general concepts of linear programming were first developed by G. B. Dantzig
in 1947 in connection with military planning problems for the U. S. Air Force. Earlier,
in 1939, L. V. Kantorovich formulated and solved a particular type of LP problem in
production planning.

3. The term “linear programming” conveys its historical origins and purpose: it is a
mathematical model involving linear constraints and a linear objective function, used
for the optimal planning (programming) of operations.

4. Form (1) of an LP naturally occurs in the selection of levels for production activities 
that maximize profit subject to constraints on the utilization of the given resources. 

5. These transformations on an LP do not change feasible (or optimal) solutions:

• constraints: change the sense of an inequality by multiplying both sides by $-1$;
or replace $a_i x = b_i$ with $a_i x \le b_i$ and $-a_i x \le -b_i$; or replace $a_i x \le b_i$ with
$a_i x + s_i = b_i$ and $s_i \ge 0$;

• variables: for $x_j$ unrestricted, set $x_j := x'_j - x''_j$ with $x'_j, x''_j \ge 0$; or for $x_j \le 0$,
set $x_j := -x'_j$ with $x'_j \ge 0$;

• objective function: change a minimization (maximization) problem to a maximization
(minimization) problem by setting $c := -c$.

6. Farkas’ lemma: Suppose $A$ is a $q \times n$ matrix and $c$ is an $n$-row vector. Then the
following are equivalent:

• $cy \ge 0$ for all $y \in \mathcal{R}^n$ such that $Ay \ge 0$;

• there exists some $u \in \mathcal{R}^q$ such that $u \ge 0$, $c = uA$.

This result is important in establishing the optimality conditions for linear programming
problems; it can also be applied to show the existence (and uniqueness) of solutions
to linear models of economic exchange and stationary distributions in finite Markov
chains (§7.7). (J. Farkas, 1847-1930; the lemma was published in 1902.)

7. A feasible solution with an improving direction cannot be optimal for (1).

8. A feasible solution with no improving direction is always optimal for (1).

9. If a feasible solution to (1) has an improving direction $y$ and if $Ay \le 0$ then the LP
problem is unbounded.

10. Each LP problem is either infeasible, unbounded, or has an optimal solution. This
need not be the case for nonlinear optimization problems.

11. Form (1) of an LP is helpful for understanding the geometric properties of an LP.

12. For algorithmic purposes the following form, form (2), of an LP is preferred:

    maximize:    $cx$
    subject to:  $Ax \le b$                    (2)
                 $x \ge 0$

Here $A$ is an $m \times n$ matrix.

13. The most general form of an LP problem is:

    maximize (or minimize):  $d x^1 + e x^2 + f x^3$
    subject to:              $A x^1 + B x^2 + C x^3 \le a$
                             $D x^1 + E x^2 + F x^3 \ge b$
                             $G x^1 + H x^2 + K x^3 = c$
                             $x^1 \ge 0$, $x^2 \le 0$, $x^3$ unrestricted.

In this formulation $A$, $B$, $C$, $D$, $E$, $F$, $G$, $H$, and $K$ are matrices; $a$, $b$, and $c$ are column
vectors; and $d$, $e$, and $f$ are row vectors.

14. The feasible region of an LP problem is a convex set.

15. Equivalence of forms: The general form in Fact 13 is equivalent to both form (1)
and form (2) in the following sense: any of these three forms can be transformed into
another using the operations of Fact 5. Each form possesses the same set of feasible (or
optimal) solutions.

16. An excellent glossary of linear programming terms, as well as concepts in general
mathematical optimization, can be found at the site:

• http://www-math.cudenver.edu/~hgreenbe/glossary/glossary.html


Examples: 

1. Feed mix: A manufacturer produces a special feed for farm animals. To ensure 
that the feed is nutritionally balanced, each bag of feed must supply at least 1250 mg 
of Vitamin A, 250 mg of Vitamin B, 900 mg of Vitamin C, and 232.5 mg of Vitamin D. 
Three different grains (1,2,3) are blended to create the final product. Each ounce of 
Grain 1 supplies 2, 1, 5, 0.6 mg of Vitamins A, B, C, D, respectively. Each ounce of 
Grain 2 provides 3, 1, 3, 0.25 mg of Vitamins A, B, C, D, while each ounce of Grain 3 
provides 7, 1 mg of Vitamins A, D. The costs (per ounce) of the constituent grains are 
41, 35, and 96 cents for Grains 1, 2, and 3, respectively. 

The manufacturer wants to determine the minimum cost mix of grains that satisfies
all four nutritional requirements. If $x_i$ is the number of ounces of Grain $i$ that
are blended in the final product, then the manufacturer’s problem is modeled by the
following LP:

    minimize:    $0.41x_1 + 0.35x_2 + 0.96x_3$
    subject to:  $2x_1 + 3x_2 + 7x_3 \ge 1250$
                 $x_1 + x_2 \ge 250$
                 $5x_1 + 3x_2 \ge 900$
                 $0.6x_1 + 0.25x_2 + x_3 \ge 232.5$
                 $x_1, x_2, x_3 \ge 0$.

Each constraint in this LP corresponds to a nutritional requirement. It turns out that
the optimal solution to the LP is $x_1^* = 200$, $x_2^* = 50$, $x_3^* = 100$ with $z^* = 195.5$.
Note that the amount of Vitamin C supplied by this solution (1150 mg) is in excess of 900 mg,
while the other vitamins are supplied in exactly the minimum amounts.

2. The LP in Example 1 is not in either form (1) or form (2). However, using Fact 5
it can be transformed into form (2), giving the equivalent representation:

    maximize:    $-0.41x_1 - 0.35x_2 - 0.96x_3$
    subject to:  $-2x_1 - 3x_2 - 7x_3 \le -1250$
                 $-x_1 - x_2 \le -250$
                 $-5x_1 - 3x_2 \le -900$
                 $-0.6x_1 - 0.25x_2 - x_3 \le -232.5$
                 $x_1, x_2, x_3 \ge 0$.
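
The feed-mix model of Examples 1 and 2 can be checked with exact rational arithmetic. The sketch below rests on two assumptions that are not part of the text: the candidate point $x = (200, 50, 100)$, at which the Vitamin A, B, and D constraints hold with equality, and the multipliers `y`, which are hypothetical dual values used only as a certificate that no cheaper feasible mix exists.

```python
from fractions import Fraction as F

# Data of the feed-mix LP: minimize cost . x subject to A x >= b, x >= 0.
cost = [F("0.41"), F("0.35"), F("0.96")]
A = [
    [F(2), F(3), F(7)],            # Vitamin A
    [F(1), F(1), F(0)],            # Vitamin B
    [F(5), F(3), F(0)],            # Vitamin C
    [F("0.6"), F("0.25"), F(1)],   # Vitamin D
]
b = [F(1250), F(250), F(900), F("232.5")]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

# Candidate point (an assumption of this sketch).
x = [F(200), F(50), F(100)]
supplied = [dot(row, x) for row in A]
primal_value = dot(cost, x)

# Hypothetical multipliers y >= 0, one per vitamin constraint. If y A <= cost
# componentwise, then y . b is a lower bound on the cost of every feasible mix.
y = [F("0.08"), F("0.01"), F(0), F("0.4")]
dual_feasible = all(dot(y, col) <= cj for col, cj in zip(zip(*A), cost))
dual_value = dot(y, b)

# Fact 5: multiplying constraints and objective by -1 gives form (2).
neg_A = [[-a for a in row] for row in A]
neg_b = [-bi for bi in b]
feasible_in_form_2 = all(dot(row, x) <= bi for row, bi in zip(neg_A, neg_b))

print([float(s) for s in supplied], float(primal_value), float(dual_value))
```

Because the lower bound `dual_value` equals `primal_value`, the candidate point is optimal; the same point satisfies the negated constraints of form (2), as Fact 5 predicts.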
3. The following figure shows the feasible region of the LP problem:

    maximize:    $-x_1$
    subject to:  $-x_2 \le 0$               (A)
                 $-x_1 - x_2 \le -4$        (B)
                 $-x_1 + x_2 \le 4$         (C)
                 $-3x_1 + 5x_2 \le 30$      (D)
                 $-x_1 + 3x_2 \le 22$       (E)

This LP has $n = 2$ decision variables $x_1, x_2$ and it has the vertices (4, 0), (0, 4), and (5, 9).
Vertex $(x_1, x_2) = (0, 4)$ is the optimal solution, achieving the maximum objective value
$z = 0$. Thus, the LP is bounded, even though its feasible region is not bounded.
Constraint (D) is redundant, since dropping it doesn’t change the feasible region. Vertex
(5, 9) is degenerate, since $3 > n$ constraints are active at this vertex. All vectors are
feasible directions at (6, 5). At vertex (5, 9), the direction (1, -1) is feasible, but the
direction (1, 1) is not. Vertices (0, 4) and (5, 9) are adjacent, as are (4, 0) and (0, 4).
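
The claims of Example 3 about active constraints and feasible directions can be verified mechanically. The helper names below are illustrative, and the direction test relies on a standard criterion assumed here: at a feasible point of $\{x \mid Ax \le b\}$, a vector $y$ is a feasible direction exactly when $a_i y \le 0$ for every active constraint $i$.

```python
# Constraints (A)-(E) of Example 3, stored as (a_i, b_i) meaning a_i . x <= b_i.
CONSTRAINTS = {
    "A": ((0, -1), 0),
    "B": ((-1, -1), -4),
    "C": ((-1, 1), 4),
    "D": ((-3, 5), 30),
    "E": ((-1, 3), 22),
}

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def is_feasible(x):
    return all(dot(a, x) <= b for a, b in CONSTRAINTS.values())

def active(x):
    """Labels of the constraints satisfied with equality at x."""
    return [name for name, (a, b) in CONSTRAINTS.items() if dot(a, x) == b]

def is_feasible_direction(x, y):
    """For feasible x: y is feasible iff a_i . y <= 0 on every active constraint."""
    return all(dot(CONSTRAINTS[name][0], y) <= 0 for name in active(x))

for point in [(4, 0), (0, 4), (5, 9), (6, 5)]:
    print(point, is_feasible(point), active(point))
print(is_feasible_direction((5, 9), (1, -1)), is_feasible_direction((5, 9), (1, 1)))
```

Three active constraints at (5, 9) in a two-variable problem is what makes that vertex degenerate, and the empty active set at (6, 5) is why every vector is a feasible direction there.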



The set $Y = \{ y \mid Ay \ge 0 \}$ is the region bounded by the two dashed lines. Notice that
if $c = uA$ for some $u \ge 0$ then $c$ must lie in the cone $C$ bounded by the vectors $a_1$
and $a_2$. Geometrically, any $c \in C$ makes an acute angle with every $y \in Y$, hence $cy \ge 0$.
Conversely, any $c$ making an acute angle with every $y \in Y$ must be in $C$.


5. Fact 10 is illustrated using the following LP problem:

    maximize:    $-x_1 - x_2$
    subject to:  $-2x_1 + x_2 \le -1$
                 $-x_1 - 2x_2 \le -2$
                 $x_1, x_2 \ge 0$.

This LP has the optimal solution $(x_1^*, x_2^*) = (\frac{4}{5}, \frac{3}{5})$. Suppose the objective function is
changed to $z = x_1 - x_2$. Then $(x_1, x_2) = (a, 0)$ is feasible for $a \ge 2$ with objective
value $a$. Thus $z$ can be made arbitrarily large and the LP is unbounded. On the other
hand, if the second constraint is changed to $4x_1 - x_2 \le -1$, the feasible region is empty
and the LP is infeasible.
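
For a two-variable LP such as the one in Example 5, the vertices can be enumerated directly by intersecting constraints in pairs. The sketch below is a brute-force illustration, not a practical algorithm; the function names are hypothetical, the nonnegativity conditions are written as ordinary rows of $Ax \le b$, and picking the best vertex is valid only when the LP is known to be bounded.

```python
from fractions import Fraction as F
from itertools import combinations

def vertices(A, b):
    """Feasible intersection points of pairs of constraint lines a_i . x = b_i."""
    points = set()
    for i, j in combinations(range(len(A)), 2):
        (a11, a12), (a21, a22) = A[i], A[j]
        det = a11 * a22 - a12 * a21
        if det == 0:
            continue  # parallel constraints do not define a point
        x1 = F(b[i] * a22 - a12 * b[j], det)   # Cramer's rule
        x2 = F(a11 * b[j] - b[i] * a21, det)
        if all(r[0] * x1 + r[1] * x2 <= bi for r, bi in zip(A, b)):
            points.add((x1, x2))
    return points

def best_vertex(A, b, c):
    """Vertex maximizing c . x, or None if there is no vertex at all."""
    points = vertices(A, b)
    if not points:
        return None
    return max(points, key=lambda p: c[0] * p[0] + c[1] * p[1])

# Example 5: the last two rows encode x1 >= 0 and x2 >= 0.
A = [(-2, 1), (-1, -2), (-1, 0), (0, -1)]
b = [-1, -2, 0, 0]
print(best_vertex(A, b, c=(-1, -1)))

# Variant with the second constraint replaced by 4 x1 - x2 <= -1.
A_infeasible = [(-2, 1), (4, -1), (-1, 0), (0, -1)]
b_infeasible = [-1, -1, 0, 0]
print(vertices(A_infeasible, b_infeasible))
```

The first call returns the point $(\frac{4}{5}, \frac{3}{5})$; the second returns an empty set, which for a region inside the nonnegative quadrant means that the region itself is empty.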

6. Product mix: A company manufactures $n$ types of a product, using $m$ shops. Type $j$
requires $a_{ij}$ machine-hours in shop $i$. There is a limitation of $b_i$ machine-hours for
shop $i$ and the sale of each type $j$ unit brings the company a profit $c_j$. The optimization
problem facing the company is given by an LP problem in form (2). Namely, if $x_j$ is
the number of units produced of type $j$, then the optimization problem is

    maximize:    $\sum_{j=1}^{n} c_j x_j$
    subject to:  $\sum_{j=1}^{n} a_{ij} x_j \le b_i$,  $i = 1, \dots, m$
                 $x_j \ge 0$,  $j = 1, \dots, n$.

7. Transportation: A product stored at $m$ warehouses needs to be shipped to satisfy
demands at $n$ markets. Warehouse $i$ has a supply of $s_i$ units of the product, and market $j$
has a demand of $d_j$ units. The cost of shipping a unit of product from warehouse $i$ to
market $j$ is $c_{ij}$. The problem is to determine the number of units $x_{ij}$ to ship from
warehouse $i$ to market $j$ in order to satisfy all demands while minimizing cost:

    minimize:    $\sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}$
    subject to:  $\sum_{j=1}^{n} x_{ij} \le s_i$,  $i = 1, \dots, m$
                 $\sum_{i=1}^{m} x_{ij} \ge d_j$,  $j = 1, \dots, n$
                 $x_{ij} \ge 0$,  $i = 1, \dots, m$,  $j = 1, \dots, n$.

This is an LP in the form specified by Fact 13. Using the transformations in Fact 5,
this optimization problem can alternatively be expressed as an LP in the form (1).


15.1.2 TABLEAUS


Definitions: 

Suppose that an LP is expressed in form (2) of §15.1.1, with $A$ an $m \times n$ matrix. A
tableau is any table

    u | z
    --+--
    D | f

with the following properties:

• $D$ is an $m \times (m+n)$ matrix with entries $d_{tj}$; $z$ is a real number; $u$ is an $(m+n)$-row
vector; and $f$ is an $m$-column vector.

• Associated with the tableau is a partition $E_B$, $E_N$ of the integers $1, \dots, m+n$.
The set $E_B$, with cardinality $m$, is the basic set and $E_N$ is the nonbasic set.

• For every row index $t = 1, \dots, m$, there is a column of $D$ equal to zero in all
coordinates except for the $t$th coordinate, which equals 1. The index of this
column is $\varphi(t)$ where $\varphi$ is a function from $\{1, \dots, m\}$ to $E_B$ associated with the
tableau.

• The tableau can be obtained from the initial tableau

    -c  0 | 0
    ------+--
     A  I | b

(where $E_B = \{n+1, \dots, m+n\}$ and $\varphi(t) = n + t$, $t = 1, \dots, m$) by performing the following
pivot operation a finite number of times:

P1. choose a row index $t^* \in \{1, \dots, m\}$ and a column index $j^* \in E_N$ with
$d_{t^*j^*} \ne 0$;

P2. multiply row $t^*$ by $1/d_{t^*j^*}$;

P3. add appropriate multiples of row $t^*$ to all other rows to make $u_{j^*} = 0$ and
to make $d_{tj^*} = 0$ for all $t \ne t^*$;

P4. remove $j^*$ from $E_N$ and place it in $E_B$; remove $\varphi(t^*)$ from $E_B$ and place
it in $E_N$; set $\varphi(t^*) = j^*$.

In the pivot operation, $\varphi(t^*)$, before replacement, is the index of the leaving variable
and $j^*$ is the index of the entering variable.

The set of variables $\{ x_i \mid i \in E_B \}$ are the basic variables and the remaining variables
are the nonbasic variables.

A basic solution is a vector $x^*$ with its basic variables defined by $x_i^* = f_t$ where
$t = \varphi^{-1}(i)$; its nonbasic variables have $x_i^* = 0$. If $f \ge 0$ then $x^*$ is a basic feasible
solution (BFS).

The basis matrix $B$ is the $m \times m$ matrix consisting of the columns of $[A\ I]$ corresponding
to the basic variables; the nonbasis matrix $N$ is the $m \times n$ matrix corresponding
to the nonbasic variables. Let $c_B$ [$c_N$] denote the vector of basic [nonbasic]
components of $c$. Let $x_B$ [$x_N$] denote the vector of basic [nonbasic] components of $x$.

The reduced cost of nonbasic variable $x_j$ is the negative of $u_j$ in the associated tableau.

The slack variables are given by $(x_{n+1}, x_{n+2}, \dots, x_{n+m}) = b - Ax$.
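
The pivot operation P1-P4 can be sketched with exact rational arithmetic. The function names and the list-based layout (a top row holding $u$ and $z$, followed by rows holding $D$ and $f$, with 1-based indices $t^*$ and $j^*$) are illustrative choices, not notation from the text. The data used are those of the three-constraint LP in Example 1 of this section.

```python
from fractions import Fraction as F

def initial_tableau(A, b, c):
    """Build the starting tableau [-c 0 | 0; A I | b] with phi(t) = n + t."""
    m, n = len(A), len(c)
    top = [F(-cj) for cj in c] + [F(0)] * (m + 1)
    rows = [
        [F(a) for a in A[t]] + [F(int(s == t)) for s in range(m)] + [F(b[t])]
        for t in range(m)
    ]
    basis = [n + t for t in range(1, m + 1)]  # basis[t-1] = phi(t)
    return top, rows, basis

def pivot(top, rows, basis, t_star, j_star):
    """Return the tableau after pivoting on row t_star, column j_star (1-based)."""
    r, col = t_star - 1, j_star - 1
    if j_star in basis or rows[r][col] == 0:                      # P1
        raise ValueError("pivot needs a nonbasic column with a nonzero entry")
    pivot_row = [v / rows[r][col] for v in rows[r]]               # P2
    new_rows = [
        pivot_row if t == r
        else [v - row[col] * w for v, w in zip(row, pivot_row)]   # P3
        for t, row in enumerate(rows)
    ]
    new_top = [v - top[col] * w for v, w in zip(top, pivot_row)]  # P3
    new_basis = basis[:r] + [j_star] + basis[r + 1:]              # P4
    return new_top, new_rows, new_basis

A = [[3, 0, -1], [-9, 4, 3], [-6, 2, 4]]
b = [5, 12, 2]
c = [3, 4, 4]
start = initial_tableau(A, b, c)
after_first = pivot(*start, t_star=3, j_star=2)
after_second = pivot(*after_first, t_star=1, j_star=1)
print(after_second[0][-1], after_second[2])
```

Each call returns a new tableau rather than modifying its input, so the intermediate tableaus of a pivot sequence remain available for inspection.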


Facts: 

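1. Every BFS of (2) corresponds to a vertex of (1), where $q = m + n$, the constraint
matrix of (1) is $\begin{bmatrix} A \\ -I \end{bmatrix}$, and the right-hand side of (1) is $\begin{bmatrix} b \\ 0 \end{bmatrix}$.

2. In the absence of degeneracy the correspondence in Fact 1 is one-to-one; otherwise
it is many-to-one.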

3. Every LP problem (2) with an optimal solution has an optimal solution that is a
vertex. Since the number of vertices is finite, LP problems are combinatorial in nature;
that is, an LP can be solved in theory by enumerating its vertices and then selecting
one with maximum objective function value.

4. Let $x^*$ be a BFS of (2). All information pertinent to $x^*$ is contained in its tableau
which (after possibly permuting the first $m + n$ columns) is

    0   $c_B B^{-1} N - c_N$ | $c_B B^{-1} b$
    -------------------------+---------------
    I   $B^{-1} N$           | $B^{-1} b$

Here $c_B B^{-1} b$ is the objective value $z$ of $x^*$. The value of the basic variable $x_i^*$ is the
$t$th component of $B^{-1} b$, where $i = \varphi(t)$. Every nonbasic variable has value 0.

5. A tableau expresses the set of equations below, called a dictionary [Ch83]: 

Xb = B~ x b — B~ x Nxn 
z = csB~ 1 b+ (cjv — CbB~ 1 N)xn- 

6 . The reduced costs of the nonbasic variables are given by the vector cn — CbB~ 1 N. 
Basic variables have zero reduced cost. 

7. Column i of B is identical to d m+l , the column of D associated with the slack 
variable x n +j. 
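
The formulas in Facts 4 to 6 can be checked numerically. The following is a minimal sketch, not the handbook's own procedure: the function name `tableau_from_basis` and its argument layout are assumptions made for illustration. It builds $B^{-1}[A \; I]$ and $B^{-1}b$ by Gauss-Jordan elimination in exact rational arithmetic and then forms row 0 as $c_B B^{-1} a^j - c_j$.

```python
from fractions import Fraction as F

def tableau_from_basis(A, b, c, basis):
    """Return (u, z, D, f) for: maximize cx, Ax <= b, x >= 0, and a given basis.

    `basis` lists the 1-based indices phi(1), ..., phi(m) of the basic
    variables; slack variables are numbered n+1, ..., n+m.
    """
    m, n = len(A), len(A[0])
    # Full constraint matrix [A I] and cost vector (slack costs are 0).
    full = [[F(v) for v in A[i]] + [F(int(i == k)) for k in range(m)]
            for i in range(m)]
    cost = [F(v) for v in c] + [F(0)] * m
    # Gauss-Jordan on [B | A I | b] yields B^{-1}[A I] and B^{-1} b.
    rows = [[full[i][j - 1] for j in basis] + full[i] + [F(b[i])]
            for i in range(m)]
    for col in range(m):
        piv = next(r for r in range(col, m) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        p = rows[col][col]
        rows[col] = [v / p for v in rows[col]]
        for r in range(m):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    D = [row[m:m + n + m] for row in rows]
    f = [row[-1] for row in rows]
    c_B = [cost[j - 1] for j in basis]
    # Row 0: u_j = c_B B^{-1} a^j - c_j, and z = c_B B^{-1} b.
    u = [sum(c_B[t] * D[t][j] for t in range(m)) - cost[j]
         for j in range(n + m)]
    z = sum(c_B[t] * f[t] for t in range(m))
    return u, z, D, f

# Data of the LP used in the examples of this section, with basis (1, 5, 3).
u, z, D, f = tableau_from_basis(
    [[3, 0, -1], [-9, 4, 3], [-6, 2, 4]], [5, 12, 2], [3, 4, 4], [1, 5, 3])
```

The basic columns of the result have zero entries in row 0, as stated in Fact 6, and the last three columns of `D` equal $B^{-1}$, as stated in Fact 7.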


Examples: 

1. When slack variables $x_4, x_5, x_6$ are added to the LP

maximize:   $3x_1 + 4x_2 + 4x_3$
subject to: $3x_1 - x_3 \le 5$
            $-9x_1 + 4x_2 + 3x_3 \le 12$
            $-6x_1 + 2x_2 + 4x_3 \le 2$
            $x_1, x_2, x_3 \ge 0$

the following equivalent LP is formed:

maximize:   $3x_1 + 4x_2 + 4x_3 + 0x_4 + 0x_5 + 0x_6$
subject to: $3x_1 - x_3 + x_4 = 5$
            $-9x_1 + 4x_2 + 3x_3 + x_5 = 12$
            $-6x_1 + 2x_2 + 4x_3 + x_6 = 2$
            $x_1, x_2, \ldots, x_6 \ge 0$.

The associated tableau, with $\Sigma_B = \{4,5,6\}$ and $\Sigma_N = \{1,2,3\}$, is

$$\begin{array}{rrrrrr|r}
-3 & -4 & -4 & 0 & 0 & 0 & 0 \\ \hline
3 & 0 & -1 & 1 & 0 & 0 & 5 \\
-9 & 4 & 3 & 0 & 1 & 0 & 12 \\
-6 & 2 & 4 & 0 & 0 & 1 & 2
\end{array}$$

Here $\varphi(1) = 4$, $\varphi(2) = 5$, $\varphi(3) = 6$. The basic variables are $x_4 = 5$, $x_5 = 12$, $x_6 = 2$
and the nonbasic variables are $x_1 = 0$, $x_2 = 0$, $x_3 = 0$. The basic feasible solution
associated with this tableau is $x = (0, 0, 0, 5, 12, 2)^T$ with objective value $z = 0$. The
nonbasic variables $x_1, x_2, x_3$ have reduced costs 3, 4, 4, respectively.

2. A pivot is now performed on the tableau in Example 1 using $t^* = 3$ and $j^* = 2$,
so the entering variable is $x_2$ and the leaving variable is $x_6$. The resulting tableau (a)
follows, where $\Sigma_B = \{2,4,5\}$ and $\varphi(1) = 4$, $\varphi(2) = 5$, $\varphi(3) = 2$. The corresponding
BFS is $x = (0, 1, 0, 5, 8, 0)^T$ with objective value $z = 4$. If a pivot is performed on (a)
using $t^* = 1$ and $j^* = 1$, then tableau (b) results. Here $\varphi(1) = 1$, $\varphi(2) = 5$, $\varphi(3) = 2$
and the new BFS is $x = (\frac{5}{3}, 6, 0, 0, 3, 0)^T$ with objective value $z = 29$.

tableau (a)

$$\begin{array}{rrrrrr|r}
-15 & 0 & 4 & 0 & 0 & 2 & 4 \\ \hline
3 & 0 & -1 & 1 & 0 & 0 & 5 \\
3 & 0 & -5 & 0 & 1 & -2 & 8 \\
-3 & 1 & 2 & 0 & 0 & \frac{1}{2} & 1
\end{array}$$

tableau (b)

$$\begin{array}{rrrrrr|r}
0 & 0 & -1 & 5 & 0 & 2 & 29 \\ \hline
1 & 0 & -\frac{1}{3} & \frac{1}{3} & 0 & 0 & \frac{5}{3} \\
0 & 0 & -4 & -1 & 1 & -2 & 3 \\
0 & 1 & 1 & 1 & 0 & \frac{1}{2} & 6
\end{array}$$

For tableau (b), the basis matrix $B$ corresponds to columns 1, 5, 2 of $[A \;\; I]$, namely

$$B = \begin{pmatrix} 3 & 0 & 0 \\ -9 & 1 & 4 \\ -6 & 0 & 2 \end{pmatrix}.$$

From Fact 7, the inverse matrix

$$B^{-1} = \begin{pmatrix} \frac{1}{3} & 0 & 0 \\ -1 & 1 & -2 \\ 1 & 0 & \frac{1}{2} \end{pmatrix}$$

consists of columns 4, 5, 6 in tableau (b).
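
The two pivots of Example 2 can be reproduced with a few lines of code. This is a minimal sketch of the pivot operation under the tableau layout used above (row 0 is the objective row, the last column is the right-hand side); the function name `pivot` and the list-of-lists representation are choices made here for illustration.

```python
from fractions import Fraction as F

def pivot(T, t, j):
    """Pivot tableau T on row t (1..m; row 0 is the objective row)
    and column j (1-based). Returns a new tableau."""
    T = [[F(v) for v in row] for row in T]
    p = T[t][j - 1]
    T[t] = [v / p for v in T[t]]          # scale the pivot row
    for r in range(len(T)):
        if r != t:                        # clear column j in every other row
            factor = T[r][j - 1]
            T[r] = [v - factor * w for v, w in zip(T[r], T[t])]
    return T

# Initial tableau of Example 1.
T0 = [[-3, -4, -4, 0, 0, 0, 0],
      [3, 0, -1, 1, 0, 0, 5],
      [-9, 4, 3, 0, 1, 0, 12],
      [-6, 2, 4, 0, 0, 1, 2]]
Ta = pivot(T0, 3, 2)   # t* = 3, j* = 2 gives tableau (a)
Tb = pivot(Ta, 1, 1)   # t* = 1, j* = 1 gives tableau (b)
```

Exact fractions are used so that entries such as $\frac{5}{3}$ are reproduced without rounding.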


15.1.3 SIMPLEX METHOD 

The simplex method is in practice remarkably efficient and it is widely used for solving 
LP problems. The solution idea dates back to J. B. J. Fourier (1768-1830); it was devel- 
oped and popularized in 1947 by G. B. Dantzig (born 1914). This section presents two 
descriptions of the same algorithm — the first geometrically intuitive, the second closer 
to its actual implementation. 

Facts: 

1. Simplex algorithm I: This method (Algorithm 1) solves a linear programming problem
in form (1) of §15.1.1. Assuming that an initial vertex is known, this algorithm
travels from vertex to vertex along improving edges until an optimal vertex is reached
or an unboundedness condition is detected. In Algorithm 1, the rows of $A$ are denoted
$a_1, a_2, \ldots, a_q$ and $b$ has the corresponding components $b_1, b_2, \ldots, b_q$.

2. Simplex algorithm II: This method (Algorithm 2) solves a linear programming
problem in form (2) of §15.1.1, assuming that $b \ge 0$. It proceeds by successively
identifying nonbasic variables having positive reduced cost and pivoting them into the current
basis in a way that maintains a basic feasible solution (BFS).

3. There are examples for which Algorithm 1 requires exponential running time, and
similarly for Algorithm 2.

4. In practice the number of iterations of Algorithms 1 and 2 is proportional to the
number of constraints $m$ and grows slowly with the number of variables $n$.

5. There is a one-to-one correspondence between the vertex $x_k$ of Algorithm 1 and the
BFS $x_k$ of Algorithm 2, when $A$ in (1) is set to $\begin{pmatrix} A \\ -I \end{pmatrix}$ and $b$ in (1) is set to
$\begin{pmatrix} b \\ 0 \end{pmatrix}$.

6. Interchanging a basic and a nonbasic variable in Algorithm 2 corresponds to
interchanging a nonactive and an active constraint in Algorithm 1.

7. In the absence of degeneracy, the objective value strictly increases at each step (in
both algorithms). The method of breaking ties by choosing the smallest index prevents
cycling and ensures termination in finite time. In practice, though, cycling is rare and
other rules are used.

8. When a vertex is not known in Algorithm 1 (when $b \not\ge 0$ in Algorithm 2) a preliminary
LP problem, Phase I, can be solved to get an initial vertex (a starting tableau).

© 2000 by CRC Press LLC 



Algorithm 1: Simplex algorithm, form (1).

input: LP in form (1), initial vertex $x_0$
output: an optimal vertex or an indication of unboundedness

$k := 0$
find a subsystem $Bx \le r$ of (1) consisting of $n$ linearly independent constraints
    active at $x_k$
$S$ := list containing the indices of these active constraints

{Main loop}
if $u = cB^{-1} \ge 0$ then $x_k$ is an optimal solution: stop
else {an improving direction}
    $i^*$ := the smallest index such that $u_{i^*} < 0$
    $y$ := column $i^*$ of $-B^{-1}$
    if $Ay \le 0$ then the LP problem is unbounded: stop
    else {move to next vertex (possibly same as last)}
        $j^*$ := smallest index $j$ attaining the minimum
            $\lambda = \min\{\, (b_j - a_j x_k)/(a_j y) \mid j \notin S,\ a_j y > 0 \,\}$
        $x_{k+1} := x_k + \lambda y$
        $S[i^*] := j^*$; update $B$
        $k := k + 1$
{Continue with next iteration of main loop}


Algorithm 2: Simplex algorithm, form (2).

input: LP in form (2), with $b \ge 0$
output: an optimal BFS or an indication of unboundedness

begin with the initial tableau

$$\begin{array}{cc|c} -c & 0 & 0 \\ \hline A & I & b \end{array}$$

where $\Sigma_B = \{n + 1, \ldots, m + n\}$ and $\varphi(t) = n + t$, $t = 1, \ldots, m$
$x_0 := (x_B, x_N)$ where $x_B = b \ge 0$ and $x_N = 0$
$k := 0$

{Main loop}
if $u_j \ge 0$ for all $j \in \Sigma_N$ then $x_k$ is an optimal solution: stop
else {select entering variable}
    $j^*$ := the smallest index with $u_{j^*} < 0$
    if $d_{tj^*} \le 0$ for $t = 1, \ldots, m$ then the LP is unbounded: stop
    else
        $t^*$ := an index $t$ achieving the minimum
            $\min\{\, f_t / d_{tj^*} \mid t = 1, \ldots, m;\ d_{tj^*} > 0 \,\}$
        (if there are several such $t^*$, make $\varphi(t^*)$ as small as possible)
        do a pivot with entering index $j^*$, leaving index $\varphi(t^*)$
        set component $\varphi(t)$ of $x_{k+1}$ to $f_t$ for $t = 1, \ldots, m$ and the remaining
            components of $x_{k+1}$ to zero
        $k := k + 1$
{Continue with next iteration of main loop}
9. The revised simplex method is a variation of Algorithm 2. Instead of maintaining
the entire tableau at each step, only $B^{-1}$ is kept. Columns of $[A \;\; I]$ are brought in
from storage as needed to find $j^*$ and $t^*$. This method is good for sparse matrices $A$
with many columns.

10. A survey of 39 software packages for solving linear programming problems is given
in [Fo97]. Virtually all of these products run on PCs. In many cases, the LP
solvers are linked to more general modeling packages that provide a single environment
for carrying out the formulation, solution, and analysis of LP problems.

11. Many software packages are available to solve LP problems on mainframes and
personal computers. Commercial packages include LINDO, CPLEX, OSL, C-WHIZ,
and MINOS. Most use a version of the revised simplex method:

• http://www.lindo.com/

• http://www.cplex.com/

• http://www.research.ibm.com/osl/

• http://www.ketronms.com/products.html

• http://www-leland.Stanford.edu/~saunders/brochure/brochure.html

12. An extensive tabulation of software packages to solve LP problems is found at the
site:

• http://www-c.mcs.anl.gov/home/otc/Guide/SoftwareGuide/Categories/linearprog.html

13. Computer codes (in C, Pascal, and Fortran) that implement the simplex method
are catalogued at the sites:

• http://plato.la.asu.edu/guide.html#LP

• http://ucsu.Colorado.edu/~xu/software/lp/

• http://www.wior.uni-karlsruhe.de/Bibliothek/Title_Pagel.html


Examples: 

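1. The LP in Example 1 of §15.1.2 can be placed in the form (1) with

$$A = \begin{pmatrix} 3 & 0 & -1 \\ -9 & 4 & 3 \\ -6 & 2 & 4 \\ -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad
b = \begin{pmatrix} 5 \\ 12 \\ 2 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad
c = (3 \;\; 4 \;\; 4).$$

If $x_0 = (0, 1, 0)^T$ then constraints 3, 4, 6 are active at $x_0$ and $S = [3, 4, 6]$. Thus

$$B = \begin{pmatrix} -6 & 2 & 4 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \quad
B^{-1} = \begin{pmatrix} 0 & -1 & 0 \\ \frac{1}{2} & -3 & 2 \\ 0 & 0 & -1 \end{pmatrix}, \quad
u = cB^{-1} = (2 \;\; -15 \;\; 4).$$

Here $i^* = 2$, $y = (1, 3, 0)^T$, and $Ay = (3, 3, 0, -1, -3, 0)^T$. Then
$\lambda = \min\{\frac{5}{3}, \frac{8}{3}\} = \frac{5}{3}$ and $j^* = 1$. The new vertex is
$x_1 = x_0 + \lambda y = (\frac{5}{3}, 6, 0)^T$, and $S$ is updated to $S = [3, 1, 6]$, so $B$ now
contains rows 3, 1, 6 of $A$. Additional iterations of Algorithm 1 can then be carried out
using the updated $S$ and $B$.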
2. The same LP can alternatively be solved using Algorithm 2. For illustration, suppose
that the tableau (a) from Example 2 (§15.1.2) is given, corresponding to the BFS
$x_1 = (0, 1, 0, 5, 8, 0)^T$ and $\Sigma_B = \{2,4,5\}$. Here $u = (-15, 0, 4, 0, 0, 2)$ and $j^* = 1$ is chosen.
The minimum ratio test gives $\min\{\frac{5}{3}, \frac{8}{3}\} = \frac{5}{3}$ and $t^* = 1$. The next pivot produces
tableau (b) in Example 2 (§15.1.2), with $\Sigma_B = \{1,2,5\}$ and $x_2 = (\frac{5}{3}, 6, 0, 0, 3, 0)^T$. Here
$u = (0, 0, -1, 5, 0, 2)$ so a further pivot is performed using $j^* = 3$ and $t^* = 3$, giving the
tableau below. Since $u \ge 0$ the BFS $x_3 = (\frac{11}{3}, 0, 6, 0, 27, 0)^T$ is an optimal solution to
the LP, with optimal objective value $z^* = 35$.

$$\begin{array}{rrrrrr|r}
0 & 1 & 0 & 6 & 0 & \frac{5}{2} & 35 \\ \hline
1 & \frac{1}{3} & 0 & \frac{2}{3} & 0 & \frac{1}{6} & \frac{11}{3} \\
0 & 4 & 0 & 3 & 1 & 0 & 27 \\
0 & 1 & 1 & 1 & 0 & \frac{1}{2} & 6
\end{array}$$
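
Algorithm 2 translates fairly directly into code. The following is a minimal sketch under the assumptions of this section ($b \ge 0$, smallest-index rules for ties); the function name `simplex` and its return convention are choices made here, and no attempt is made at the efficiency of the revised simplex method.

```python
from fractions import Fraction as F

def simplex(A, b, c):
    """Maximize cx subject to Ax <= b, x >= 0, assuming b >= 0.
    Returns (x, z), where x includes the slack variables,
    or None if the LP is unbounded."""
    m, n = len(A), len(A[0])
    # Initial tableau: row 0 is (-c, 0 | 0); rows 1..m are (A, I | b).
    T = [[F(-v) for v in c] + [F(0)] * (m + 1)]
    for i in range(m):
        T.append([F(v) for v in A[i]]
                 + [F(int(i == k)) for k in range(m)] + [F(b[i])])
    phi = [n + t for t in range(1, m + 1)]   # phi[t-1] = basic index of row t
    while True:
        negative = [j for j in range(n + m) if T[0][j] < 0]
        if not negative:
            break                            # u >= 0: optimal
        j = negative[0]                      # smallest index with u_j < 0
        rows = [t for t in range(1, m + 1) if T[t][j] > 0]
        if not rows:
            return None                      # unbounded
        # Minimum ratio test; ties broken by the smallest phi(t).
        t = min(rows, key=lambda r: (T[r][-1] / T[r][j], phi[r - 1]))
        p = T[t][j]
        T[t] = [v / p for v in T[t]]
        for r in range(m + 1):
            if r != t:
                factor = T[r][j]
                T[r] = [v - factor * w for v, w in zip(T[r], T[t])]
        phi[t - 1] = j + 1
    x = [F(0)] * (n + m)
    for t in range(1, m + 1):
        x[phi[t - 1] - 1] = T[t][-1]
    return x, T[0][-1]

x_opt, z_opt = simplex([[3, 0, -1], [-9, 4, 3], [-6, 2, 4]],
                       [5, 12, 2], [3, 4, 4])
```

Starting from the initial tableau, this sketch enters $x_1$ first (the smallest index with $u_j < 0$), so it visits different intermediate tableaux from Example 2, which starts from tableau (a) for illustration, but it ends at the same optimal BFS.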


15.1.4 INTERIOR POINT METHODS 

There are numerous interior point methods for solving LP problems. In contrast to 
the simplex method, which proceeds from vertex to vertex along edges of the feasible 
region, these methods move through the interior of the feasible region. In particular 
this section discusses N. Karmarkar’s “projective scaling” algorithm (1984). 

Definitions: 

The norm of $x \in \mathcal{R}^n$ is given by $\|x\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$. (See §6.1.4.)

Let $e$ denote the row vector of $n$ 1s.

The LP problem

minimize:   $z = cx$
subject to: $Ax = 0$
            $ex = 1$                                   (3)
            $x \ge 0$

is in standard form for Karmarkar's method if $\frac{1}{n} e$ is a feasible vector and if the
optimal objective value is $z^* = 0$.

The $n \times n$ diagonal matrix $\mathrm{diag}(x_1, x_2, \ldots, x_n)$ has diagonal entries $x_1, x_2, \ldots, x_n$. (See
§6.3.1.)

The unit simplex in $n$ dimensions is $S_n = \{\, x \in \mathcal{R}^n \mid ex = 1,\ x \ge 0 \,\}$.

If $x$ is feasible to (3), Karmarkar's centering transformation $T_x : S_n \to S_n$ is

$$T_x(y) = \frac{\mathrm{diag}(x)^{-1}\, y}{e\, \mathrm{diag}(x)^{-1}\, y}.$$

The projection of a vector $v$ onto the subspace $X = \{\, x \in \mathcal{R}^n \mid Ax = 0 \,\}$ is the unique
vector $p \in X$ for which $(v - p)^T x = 0$ for all $x \in X$. (See §6.1.4.)

Karmarkar's potential function for (3) is $f(x) = \sum_{j=1}^{n} \ln\!\left(\frac{cx}{x_j}\right)$.
Algorithm 3: Karmarkar's method.

input: LP in form (3)
output: an optimal solution to (3)

$k := 0$
$x_0 := \frac{1}{n} e$

{Main loop}
{test for optimality within $\epsilon$}
if $c x_k < \epsilon$ then stop
else {find new point $y_k$ in transformed unit simplex}
    $P := \begin{pmatrix} A\, \mathrm{diag}(x_k) \\ 1 \;\; 1 \;\; \cdots \;\; 1 \end{pmatrix}$
    $c_P := [I - P^T (P P^T)^{-1} P]\, \mathrm{diag}(x_k)\, c^T$
    $y_k := \left(\frac{1}{n}, \ldots, \frac{1}{n}\right)^T - \frac{\theta}{\sqrt{n(n-1)}} \cdot \frac{c_P}{\|c_P\|}$
    {find new feasible point in the original space}
    $x_{k+1} := T_{x_k}^{-1}(y_k)$
    $k := k + 1$
{Continue with next iteration of main loop}


Facts: 

1. Any LP problem can be transformed into form (3); see [Sc86], [BaJaSh90] for details.

2. The centering transformation is 1-1 and onto.

3. The inverse of the centering transformation is

$$T_x^{-1}(y) = \frac{\mathrm{diag}(x)\, y}{e\, \mathrm{diag}(x)\, y}.$$

4. The transformation places $x$ at the center of the transformed unit simplex:
$T_x(x) = \frac{1}{n} e$.

5. The transformation $T_x$ maps the feasible region of (3) to

$Y = \{\, y \in \mathcal{R}^n \mid A\, \mathrm{diag}(x)\, y = 0,\ ey = 1,\ y \ge 0 \,\}$.

6. $W = \{\, w \in \mathcal{R}^n \mid A\, \mathrm{diag}(x)\, w = 0,\ ew = 0 \,\}$ is the set of all feasible directions
for $Y$.

7. The projection of $v$ onto $W$ is $[I - P^T (P P^T)^{-1} P]\, v$, where
$P = \begin{pmatrix} A\, \mathrm{diag}(x) \\ 1 \;\; \cdots \;\; 1 \end{pmatrix}$.

8. Karmarkar's algorithm: This method (Algorithm 3) moves through the interior of
the feasible region of (3), transforming the problem at each iteration to place the current
point at the "center" of the transformed region.

9. In Algorithm 3, $\epsilon > 0$ is a fixed tolerance chosen arbitrarily small. The parameter $\theta$
is a constant, $0 < \theta < 1$, associated with convergence of the algorithm. The value $\theta = \frac{1}{4}$
ensures the convergence of Algorithm 3.

10. There is a positive constant $\delta$ with $f(x_k) - f(x_{k+1}) \ge \delta$ for all iterations $k$ of
Karmarkar's method. To ensure this inequality $\mathrm{diag}(x_k)\, c^T$, rather than $c^T$, is projected
onto the space of feasible directions $W$.
11. For large problems, Karmarkar's method requires many fewer iterations than does
the simplex method.

12. Letting $L$ be the maximum number of bits needed to represent any number
associated with the LP problem, the running time of Karmarkar's algorithm is polynomial,
namely $O(n^{3.5} L^2)$.

13. The earliest polynomial-time algorithm for LP problems is the ellipsoid method,
proposed by L. G. Khachian in 1979. (See [Ch83] or [Sc86].)

14. The ellipsoid method has worst-case complexity $O(n^6 L^2)$, where $L$ is defined in
Fact 12. Because its calculations require high precision, this method is very inefficient
in practice.

15. Karmarkar's polynomial-time algorithm was announced in 1984 and it has proven
to be seriously competitive with the simplex method. Typically, Karmarkar's algorithm
reduces the objective function by fairly significant amounts at the early iterations, often
converging within 50 iterations regardless of the problem size.

16. Other versions of Karmarkar's algorithm are faster than the one described here,
but are more complicated to explain. Efficient implementations of these faster versions
solve some classes of large LP problems over 50 times faster than the simplex method.

17. Computer codes (in C, Pascal, and Fortran) that implement interior point methods
are catalogued at the sites:

• http://plato.la.asu.edu/guide.html#LP

• http://ucsu.Colorado.edu/~xu/software/lp/

• http://www.wior.uni-karlsruhe.de/Bibliothek/Title_Pagel.html

18. LP problems can be submitted online for solution by different interior point
algorithms using the NEOS home page:

• http://www-c.mcs.anl.gov/home/otc/Server/

19. An archive of technical papers and other information on interior point algorithms
is available at the site:

• http://www-c.mcs.anl.gov/home/otc/InteriorPoint/index.html

Example: 

1. In the following LP the vector $x = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})^T$ is feasible and the problem has the
optimal objective value $z^* = 0$, achieved for $x^* = (0, \frac{2}{3}, \frac{1}{3})^T$.

minimize:   $x_1$
subject to: $x_1 + x_2 - 2x_3 = 0$
            $x_1 + x_2 + x_3 = 1$
            $x_1, x_2, x_3 \ge 0$

Karmarkar's algorithm is started with $x_0 = (\frac{1}{3}, \frac{1}{3}, \frac{1}{3})^T$, yielding $c x_0 = \frac{1}{3}$. For illustrative
purposes the value $\theta = 0.9$ is used throughout. Since $A = (1 \;\; 1 \;\; -2)$ the matrix
$P = \begin{pmatrix} \frac{1}{3} & \frac{1}{3} & -\frac{2}{3} \\ 1 & 1 & 1 \end{pmatrix}$, giving $c_P = (\frac{1}{6}, -\frac{1}{6}, 0)^T$ and
$y_0 = (0.0735, 0.5931, 0.3333)^T = x_1$. The new objective value is $c x_1 = 0.0735$.
Additional iterations of Algorithm 3 are tabulated in the following table, showing
convergence to the optimal $x^* = (0, \frac{2}{3}, \frac{1}{3})^T$ after just a few iterations.

k    x_k                           y_k                           c x_k
0    (0.3333, 0.3333, 0.3333)^T    (0.0735, 0.5931, 0.3333)^T    0.3333
1    (0.0735, 0.5931, 0.3333)^T    (0.0349, 0.5087, 0.4564)^T    0.0735
2    (0.0056, 0.6611, 0.3333)^T    (0.0333, 0.4852, 0.4814)^T    0.0056
3    (0.0004, 0.6663, 0.3333)^T    (0.0333, 0.4835, 0.4832)^T    0.0004
4    (0.0000, 0.6666, 0.3333)^T    (0.0333, 0.4833, 0.4833)^T    0.0000
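
The iterations in the table can be reproduced with a short program. This is a minimal sketch of one step of Algorithm 3 in plain floating-point arithmetic; the helper names `solve` and `karmarkar_step` are assumptions made here, and the sketch does not guard against numerical difficulties (for example, a zero projected gradient at an optimal point).

```python
import math

def solve(M, rhs):
    """Solve M s = rhs by Gauss-Jordan elimination with partial pivoting."""
    k = len(M)
    aug = [list(map(float, M[i])) + [float(rhs[i])] for i in range(k)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def karmarkar_step(A, c, x, theta):
    """One iteration of Algorithm 3: returns (x_next, y)."""
    n = len(x)
    # P stacks A diag(x) on top of a row of ones.
    P = [[A[i][j] * x[j] for j in range(n)] for i in range(len(A))]
    P.append([1.0] * n)
    v = [x[j] * c[j] for j in range(n)]                   # diag(x) c^T
    Pv = [sum(row[j] * v[j] for j in range(n)) for row in P]
    PPt = [[sum(r1[j] * r2[j] for j in range(n)) for r2 in P] for r1 in P]
    s = solve(PPt, Pv)
    # c_P = v - P^T (P P^T)^{-1} P v, the projection of v onto W.
    c_p = [v[j] - sum(P[i][j] * s[i] for i in range(len(P)))
           for j in range(n)]
    norm = math.sqrt(sum(t * t for t in c_p))
    y = [1.0 / n - theta * c_p[j] / (math.sqrt(n * (n - 1)) * norm)
         for j in range(n)]
    # Inverse centering transformation: diag(x) y / (e diag(x) y).
    w = [x[j] * y[j] for j in range(n)]
    total = sum(w)
    return [t / total for t in w], y

# Data of Example 1, with theta = 0.9 as in the text.
A = [[1.0, 1.0, -2.0]]
c = [1.0, 0.0, 0.0]
x = [1 / 3] * 3
history = [x]
for _ in range(4):
    x, y = karmarkar_step(A, c, x, theta=0.9)
    history.append(x)
```

After the loop, `history[k]` holds $x_k$ for $k = 0, \ldots, 4$, and the first component of each vector is the objective value $c x_k$.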


15.1.5 DUALITY 

Associated with every LP problem is its dual problem, which is important in devis- 
ing alternative solution procedures for the original LP. The dual also provides useful 
information for conducting postoptimality analyses on the given LP. 


Definitions: 


Associated with every LP problem is another LP problem, its dual. The original
problem is called the primal.

The dual of an LP in form (2)

maximize:   $cx$
subject to: $Ax \le b$                                 (2)
            $x \ge 0$

is defined to be the LP

minimize:   $ub$
subject to: $uA \ge c$                                 (4)
            $u \ge 0$.

The components $u_1, u_2, \ldots, u_m$ of $u$ are the dual variables.


Facts: 

1. To find the dual of an arbitrary LP problem either transform it (§15.1.1 Fact 5) into
form (2) or use the following table:

primal (maximization problem)    dual (minimization problem)
unrestricted variable            equality constraint
nonnegative variable             $\ge$ constraint
nonpositive variable             $\le$ constraint
equality constraint              unrestricted variable
$\le$ constraint                 nonnegative variable
$\ge$ constraint                 nonpositive variable
2. The dual of the dual LP is the primal LP.

3. Weak duality theorem: For any feasible solution $x$ to the primal and any feasible
solution $u$ to the dual, $cx \le ub$.

4. Strong duality theorem: If $x^*$ is an optimal solution to (2) then there exists an
optimal solution $u^*$ for (4) and $cx^* = u^* b$.

5. A given primal LP and its associated dual LP can only produce certain combinations
of outcomes, as specified in the following table. For example, if one problem is
unbounded then the other must be infeasible.

primal        dual
optimal       optimal
infeasible    unbounded
unbounded     infeasible
infeasible    infeasible


6. Let $x^*$ be an optimal BFS of the primal LP (2), with the corresponding tableau

$$\begin{array}{c|c} u & z \\ \hline D & f \end{array}$$

Then $u$ is an optimal BFS of the dual LP (4).

7. Complementary slackness: An optimal dual (primal) variable $u_i^*$ ($x_j^*$) can be
nonzero only if it corresponds to a primal (dual) constraint active at $x^*$ ($u^*$).

8. Economic interpretation: Suppose in the LP (2) that $b_i$ is the amount of resource $i$
available to a firm maximizing its profit. Then the optimal dual variable $u_i^*$ is the price
the firm should be willing to pay (over and above its market price) for an extra unit of
resource $i$.

9. Dual simplex algorithm: This approach (Algorithm 4) can be used when a basic
solution for (2) is known that is not necessarily feasible but which has nonnegative
reduced costs (i.e., it is a dual feasible basic solution). The main idea of the algorithm
is to start with the dual feasible basic solution and to maintain dual feasibility at each
pivot. An optimal BFS is found once primal feasibility is achieved.

10. The dual simplex method was devised in 1954 by C. E. Lemke.

11. Computer code (in C) that implements the dual simplex algorithm can be found
at the site:

• http://ucsu.Colorado.edu/~xu/software/lp/minit.html
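
The table of Fact 1 is mechanical enough to encode directly. The sketch below is illustrative only: the function name `dual_of` and the string encodings `'<='`, `'>='`, `'='`, `'>=0'`, `'<=0'`, `'free'` are assumptions made here, not notation from this section. It takes a maximization primal and returns the data of the minimization dual.

```python
def dual_of(c, A, b, con_sense, var_sign):
    """Dual of: maximize cx s.t. (row i of A) x  con_sense[i]  b[i],
    with variable j restricted according to var_sign[j].

    con_sense entries: '<=', '>=' or '='.
    var_sign entries:  '>=0', '<=0' or 'free'.
    Returns (objective, A_transposed, rhs, dual_con_sense, dual_var_sign)
    describing: minimize ub s.t. uA (sense) c.
    """
    # Primal variable type -> type of the matching dual constraint.
    var_to_con = {'free': '=', '>=0': '>=', '<=0': '<='}
    # Primal constraint type -> sign restriction of the matching dual variable.
    con_to_var = {'=': 'free', '<=': '>=0', '>=': '<=0'}
    A_t = [list(col) for col in zip(*A)]
    return (list(b), A_t, list(c),
            [var_to_con[s] for s in var_sign],
            [con_to_var[s] for s in con_sense])

# Primal of Example 1 in this subsection.
objective, A_t, rhs, senses, signs = dual_of(
    c=[5, -7, 0, 0],
    A=[[1, 3, -1, 1], [2, 1, -4, -1], [1, 1, -3, 2]],
    b=[-1, 3, 2],
    con_sense=['<=', '>=', '='],
    var_sign=['free', '>=0', 'free', '<=0'])
```

Row $j$ of `A_t` together with `senses[j]` and `rhs[j]` gives the $j$th dual constraint, so the output can be compared line by line with the dual written out in Example 1.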


Algorithm 4: Dual simplex algorithm.

input: LP in form (2), dual feasible basic solution $x_0$
output: an optimal BFS or an indication of infeasibility

associate with $x_0 = (x_B, x_N) = (f, 0)$ the tableau

$$\begin{array}{c|c} u & z \\ \hline D & f \end{array}$$

where $u \ge 0$
$k := 0$

{Main loop}
{optimality test}
if $f \ge 0$ then $x_k$ is an optimal solution: stop
else
    $t^*$ := the smallest index with $f_{t^*} < 0$
    if $d_{t^* j} \ge 0$ for all $j$ then the LP is infeasible: stop
    else
        $j^*$ := the smallest index attaining the maximum
            $\max\{\, u_j / d_{t^* j} \mid j = 1, \ldots, m + n;\ d_{t^* j} < 0 \,\}$
        do a pivot with entering index $j^*$, leaving index $\varphi(t^*)$
        set component $\varphi(t)$ of $x_{k+1}$ to $f_t$ for $t = 1, \ldots, m$ and the remaining
            components of $x_{k+1}$ to zero
        $k := k + 1$
{Continue with next iteration of main loop}


Examples:

1. Using the table of Fact 1, the dual of

maximize:   $5x_1 - 7x_2$
subject to: $x_1 + 3x_2 - x_3 + x_4 \le -1$
            $2x_1 + x_2 - 4x_3 - x_4 \ge 3$
            $x_1 + x_2 - 3x_3 + 2x_4 = 2$
            $x_2 \ge 0$, $x_4 \le 0$, $x_1, x_3$ unrestricted

is

minimize:   $-u_1 + 3u_2 + 2u_3$
subject to: $u_1 + 2u_2 + u_3 = 5$
            $3u_1 + u_2 + u_3 \ge -7$
            $-u_1 - 4u_2 - 3u_3 = 0$
            $u_1 - u_2 + 2u_3 \le 0$
            $u_1 \ge 0$, $u_2 \le 0$, $u_3$ unrestricted.

2. The LP of §15.1.2 Example 1 has the dual

minimize:   $5u_1 + 12u_2 + 2u_3$
subject to: $3u_1 - 9u_2 - 6u_3 \ge 3$
            $4u_2 + 2u_3 \ge 4$
            $-u_1 + 3u_2 + 4u_3 \ge 4$
            $u_1, u_2, u_3 \ge 0$

The optimal solution to the primal LP (see §15.1.3 Example 2) is $x^* = (\frac{11}{3}, 0, 6)^T$ with
optimal objective value $z^* = 35$. The associated tableau has $u = (0, 1, 0, 6, 0, \frac{5}{2})$. The
optimal dual variables for (4) are recovered from the reduced costs of the slack variables
$x_4$, $x_5$, and $x_6$, so that $u^* = (6, 0, \frac{5}{2})$. As guaranteed by Fact 4, the optimal dual
objective value is $5u_1^* + 12u_2^* + 2u_3^* = 30 + 5 = 35 = z^*$. The complementary slackness
conditions in Fact 7 hold here: the second primal constraint holds with strict inequality
($x_5 = 27 > 0$), so the second dual variable $u_2^* = 0$; also, the second dual constraint
holds with strict inequality (the tableau entry $u_2 = 1 > 0$), so the second primal variable $x_2^* = 0$.
3. Using the transformations of §15.1.1 Fact 5, the LP problem

minimize:   $2x_1 + 3x_2 + 4x_3$
subject to: $2x_1 - x_2 + 3x_3 \ge 4$
            $x_1 + 2x_2 + x_3 \ge 3$
            $x_1, x_2, x_3 \ge 0$

can be written in form (2), with the corresponding tableau (a) below. Since $u \ge 0$ the
current solution $x_4 = -4$, $x_5 = -3$ is dual feasible but not primal feasible. Algorithm 4
can then be applied, giving $t^* = 1$ and $j^* = 1$. The variable $x_4$ leaves the basis and
the variable $x_1$ enters, giving tableau (b) and the new basic but not feasible solution
$x = (2, 0, 0, 0, -1)^T$ with $z = 4$. One additional dual simplex pivot achieves primal
feasibility and produces the optimal solution $x^* = (\frac{11}{5}, \frac{2}{5}, 0, 0, 0)^T$ with $z^* = \frac{28}{5}$.

tableau (a)

$$\begin{array}{rrrrr|r}
2 & 3 & 4 & 0 & 0 & 0 \\ \hline
-2 & 1 & -3 & 1 & 0 & -4 \\
-1 & -2 & -1 & 0 & 1 & -3
\end{array}$$

tableau (b)

$$\begin{array}{rrrrr|r}
0 & 4 & 1 & 1 & 0 & -4 \\ \hline
1 & -\frac{1}{2} & \frac{3}{2} & -\frac{1}{2} & 0 & 2 \\
0 & -\frac{5}{2} & \frac{1}{2} & -\frac{1}{2} & 1 & -1
\end{array}$$
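
Algorithm 4 can be sketched in code in the same style as the primal simplex method. The following is a minimal illustration, assuming the tableau is supplied as a list of rows (row 0 is $u \mid z$, rows $1, \ldots, m$ are $D \mid f$) together with the basis list $\varphi$; the function name `dual_simplex` is an assumption made here.

```python
from fractions import Fraction as F

def dual_simplex(T, phi):
    """Run the dual simplex method on tableau T with u >= 0.
    phi[t-1] is the 1-based index of the basic variable in row t.
    Returns (x, z_entry), or None if the LP is infeasible."""
    T = [[F(v) for v in row] for row in T]
    phi = list(phi)
    m, ncols = len(T) - 1, len(T[0]) - 1
    while True:
        infeasible_rows = [t for t in range(1, m + 1) if T[t][-1] < 0]
        if not infeasible_rows:
            break                          # f >= 0: optimal
        t = infeasible_rows[0]             # smallest index with f_t < 0
        cols = [j for j in range(ncols) if T[t][j] < 0]
        if not cols:
            return None                    # no negative entry: infeasible
        # Maximum ratio u_j / d_tj; ties broken by the smallest index.
        j = max(cols, key=lambda k: (T[0][k] / T[t][k], -k))
        p = T[t][j]
        T[t] = [v / p for v in T[t]]
        for r in range(m + 1):
            if r != t:
                factor = T[r][j]
                T[r] = [v - factor * w for v, w in zip(T[r], T[t])]
        phi[t - 1] = j + 1
    x = [F(0)] * ncols
    for t in range(1, m + 1):
        x[phi[t - 1] - 1] = T[t][-1]
    return x, T[0][-1]

# Tableau (a) of Example 3, with basic variables x4 and x5.
x_opt, z_entry = dual_simplex(
    [[2, 3, 4, 0, 0, 0],
     [-2, 1, -3, 1, 0, -4],
     [-1, -2, -1, 0, 1, -3]],
    [4, 5])
```

Because the minimization problem was rewritten as a maximization in form (2), the corner entry `z_entry` is the negative of the minimum value of $2x_1 + 3x_2 + 4x_3$.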


15.1.6 SENSITIVITY ANALYSIS 


Since the data to an LP are often estimates or can vary over time, the analysis of many 
problems requires studying the behavior of the optimal LP solution to changes in the 
input data. This form of sensitivity analysis typically uses the solution of the original 
LP as a starting point for solving the altered LP. 


Definitions: 

The original tableau for the LP problem (2) is

$$\begin{array}{cc|c} -c & 0 & 0 \\ \hline A & I & b \end{array}$$

The final tableau for the optimal basic solution $x^*$ (possibly after a permutation of
the columns $1, \ldots, m + n$) is

$$\begin{array}{c|c} u & z \\ \hline D & f \end{array}
\;=\;
\begin{array}{cc|c}
0 & c_B B^{-1} N - c_N & c_B B^{-1} b \\ \hline
I & B^{-1} N & B^{-1} b
\end{array}$$

Row 0 of a tableau refers to the row $u$ of associated dual variables.

A tableau is suboptimal if some entries of row 0 are negative. A tableau is infeasible
if some entries of column $f$ are negative.

Let $a^j$ be the column of $[A \;\; I]$ associated with variable $x_j$ and let $d^j$ be the column
of $D$ associated with variable $x_j$.


Facts: 


1. The formulas in Table 1 show how to construct an updated tableau $T'$ from the
final tableau $T$ of an LP problem:

• if $T'$ is suboptimal, reoptimize using the simplex method starting with $T'$;

• if $T'$ is infeasible, reoptimize using the dual simplex method starting with $T'$;

• otherwise, $T'$ corresponds to an optimal BFS for the altered problem.
Table 1 Formulas for constructing the updated tableau $T'$.

change in LP data                 possible changes in tableau       tableau updates

change in $c_s$, $s$ nonbasic:    only entry $s$ of row 0           $u_s := u_s - \Delta$
  $c_s := c_s + \Delta$           can change

change in $c_s$, $s$ basic:       row 0 and $z$ can change          update $c_B$
  $c_s := c_s + \Delta$                                             $u_j := c_B B^{-1} a^j - c_j$, $j$ nonbasic
                                                                    $z := c_B B^{-1} b$

change in $b_r$:                  decision variables and $z$        $f := f + \Delta\,(d^{n+r})$
  $b_r := b_r + \Delta$           can change                        update $b$
                                                                    $z := c_B B^{-1} b$

change nonbasic column $s$:       tableau column $s$ and $u_s$      update $a^s$ and $c_s$
  $a^s := \hat{a}^s$              can change                        $u_s := c_B B^{-1} a^s - c_s$
  $c_s := \hat{c}_s$                                                $d^s := B^{-1} a^s$

add a new column $a^\ell$         new tableau column $\ell$         $u_\ell := c_B B^{-1} a^\ell - c_\ell$
  with cost $c_\ell$              and new $u_\ell$                  $d^\ell := B^{-1} a^\ell$

2. Ranging: Table 2 shows how to calculate the (maximal) ranges over which the
current basis $B$ remains optimal. In the "range" column of Table 2, $b$ and $c_B$ refer to
entries of $T$, rather than $T'$.

3. When $b_i$ is changed by $\Delta$ within the allowable range (Table 2), the change in the objective
value is $-\Delta$ times the reduced cost of the slack variable associated with row $i$.

4. To add a new constraint $a^\ell x \le b_\ell$ to the original LP do the following:

• add a new (identity) column to the tableau corresponding to the slack variable
of the new constraint;

• add a new row $\ell$ to the tableau corresponding to the new constraint;

• for each basic $j$ with $d_{\ell j} \ne 0$, multiply row $t = \varphi^{-1}(j)$ by $-d_{\ell j}$ and add to row $\ell$;

• if the updated $f_\ell < 0$ use the dual simplex method to reoptimize.

5. For changes in more than one component of $c$, or in more than one right-hand side $b$,
use the "100% rule":

• objective function changes: If all changes occur in variables $j$ with $u_j > 0$, the
current solution remains optimal as long as each $c_j$ is within its allowable range
(Table 2). Otherwise, let $\Delta c_j$ be the change to $c_j$. If $\Delta c_j \ge 0$ set $r_j := \Delta c_j / \Delta_U$,
else set $r_j := \Delta c_j / \Delta_L$, where $\Delta_U, \Delta_L$ are computed from Table 2. If $\sum_j r_j \le 1$,
the current solution remains optimal (if not, the rule tells nothing).

• right-hand side changes: If all changes are in constraints not active at $x^*$, the
current basis remains optimal as long as each $b_i$ is within its allowable range
(Table 2). Otherwise, let $\Delta b_i$ be the change to $b_i$. If $\Delta b_i \ge 0$ set $r_i := \Delta b_i / \Delta_U$, else
set $r_i := \Delta b_i / \Delta_L$, where $\Delta_U, \Delta_L$ are computed from Table 2. If $\sum_i r_i \le 1$, the
current basis remains optimal (if not, the rule tells nothing).




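Table 2 Ranges over which current basis is optimal.

change in LP data                 range

change in $c_s$, $s$ nonbasic:    $\Delta \le u_s$
  $c_s := c_s + \Delta$

change in $c_s$, $s$ basic:       $\Delta_L \le \Delta \le \Delta_U$, where $p$ = row of $B^{-1}$ for basic variable $s$ (row $\varphi^{-1}(s)$)
  $c_s := c_s + \Delta$           $\Delta_L = \max\left\{ \frac{c_j - c_B B^{-1} a^j}{p\, a^j} \;\middle|\; p\, a^j > 0,\ j \text{ nonbasic} \right\}$
                                  $\Delta_U = \min\left\{ \frac{c_j - c_B B^{-1} a^j}{p\, a^j} \;\middle|\; p\, a^j < 0,\ j \text{ nonbasic} \right\}$

change in $b_r$:                  $\Delta_L \le \Delta \le \Delta_U$, where $q$ = $r$th column of $B^{-1}$
  $b_r := b_r + \Delta$           $\Delta_L = \max\left\{ -\frac{(B^{-1} b)_i}{q_i} \;\middle|\; q_i > 0 \right\}$
                                  $\Delta_U = \min\left\{ -\frac{(B^{-1} b)_i}{q_i} \;\middle|\; q_i < 0 \right\}$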
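
The ranging formulas of Table 2 can be evaluated directly from the final tableau. The following is a minimal sketch; the function names `rhs_range` and `basic_cost_range` are assumptions made here, and `None` is used to mean that a bound is absent (the range is unbounded on that side). It relies on two identities that follow from the definitions above: $c_j - c_B B^{-1} a^j = -u_j$, and $p\, a^j$ is the entry of $D$ in the row of the basic variable and column $j$.

```python
from fractions import Fraction as F

def rhs_range(B_inv, f, r):
    """Allowable (Delta_L, Delta_U) for b_r := b_r + Delta.
    B_inv is B^{-1}, f = B^{-1} b, and r is 1-based."""
    q = [row[r - 1] for row in B_inv]           # r-th column of B^{-1}
    lows = [-fi / qi for fi, qi in zip(f, q) if qi > 0]
    highs = [-fi / qi for fi, qi in zip(f, q) if qi < 0]
    return (max(lows) if lows else None, min(highs) if highs else None)

def basic_cost_range(u, D, t, nonbasic):
    """Allowable (Delta_L, Delta_U) for the cost of the basic variable
    in row t (1-based). u is row 0 of the final tableau, D its body,
    and `nonbasic` lists the 1-based nonbasic indices."""
    row = D[t - 1]
    lows = [-u[j - 1] / row[j - 1] for j in nonbasic if row[j - 1] > 0]
    highs = [-u[j - 1] / row[j - 1] for j in nonbasic if row[j - 1] < 0]
    return (max(lows) if lows else None, min(highs) if highs else None)

# Final tableau of the LP used in the examples of this section.
u = [0, 1, 0, 6, 0, F(5, 2)]
D = [[1, F(1, 3), 0, F(2, 3), 0, F(1, 6)],
     [0, 4, 0, 3, 1, 0],
     [0, 1, 1, 1, 0, F(1, 2)]]
f = [F(11, 3), F(27), F(6)]
B_inv = [row[3:] for row in D]                  # columns 4, 5, 6 of D

b1_range = rhs_range(B_inv, f, 1)
c1_range = basic_cost_range(u, D, 1, [2, 4, 6])
```

For this data both ranges are bounded only from below, which agrees with the one-sided ranges stated in the examples that follow.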


Examples: 

1. The LP problem

maximize:   $3x_1 + 4x_2 + 4x_3$
subject to: $3x_1 - x_3 \le 5$
            $-9x_1 + 4x_2 + 3x_3 \le 12$
            $-6x_1 + 2x_2 + 4x_3 \le 2$
            $x_1, x_2, x_3 \ge 0$

has the final tableau $T$

$$\begin{array}{rrrrrr|r}
0 & 1 & 0 & 6 & 0 & \frac{5}{2} & 35 \\ \hline
1 & \frac{1}{3} & 0 & \frac{2}{3} & 0 & \frac{1}{6} & \frac{11}{3} \\
0 & 4 & 0 & 3 & 1 & 0 & 27 \\
0 & 1 & 1 & 1 & 0 & \frac{1}{2} & 6
\end{array}$$

corresponding to the optimal BFS $x^* = (\frac{11}{3}, 0, 6, 0, 27, 0)^T$ with $z^* = 35$. The associated
basis matrix $B$ contains columns 1, 5, 3 and the inverse basis matrix is $B^{-1}$, where

$$B = \begin{pmatrix} 3 & 0 & -1 \\ -9 & 1 & 3 \\ -6 & 0 & 4 \end{pmatrix}, \quad
B^{-1} = \begin{pmatrix} \frac{2}{3} & 0 & \frac{1}{6} \\ 3 & 1 & 0 \\ 1 & 0 & \frac{1}{2} \end{pmatrix}.$$

If the nonbasic objective coefficient $c_2$ is changed to $4 + \Delta$, the current BFS remains
optimal for $\Delta \le u_2 = 1$, that is for $c_2 \le 5$. If the basic objective coefficient $c_1$ is changed
to $3 + \Delta$, then $p = (\frac{2}{3}, 0, \frac{1}{6})$ and
$\Delta_L = \max\left\{ \frac{-1}{1/3}, \frac{-6}{2/3}, \frac{-5/2}{1/6} \right\} = \max\{-3, -9, -15\} = -3$. This gives $-3 \le \Delta$,
so the current BFS remains optimal over the range $c_1 \ge 0$. If however $c_3$ is changed to
the value 2, meaning $\Delta = -2$, the current basis with $\Sigma_B = \{1,5,3\}$ will no longer be
optimal. Using Table 1 the vector $c_B$ is updated to $c_B = (3, 0, 2)$ and the nonbasic $u_j$
are computed as $u_2 = -1$, $u_4 = 4$, $u_6 = \frac{3}{2}$. The updated $u = (0, -1, 0, 4, 0, \frac{3}{2})$ and
$z = 23$ are inserted in tableau $T$. Since $u_2 < 0$ a simplex pivot with $j^* = 2$ and $t^* = 3$
is performed, leading to the new optimal solution $x^* = (\frac{5}{3}, 6, 0, 0, 3, 0)^T$ with $z^* = 29$.
2. Suppose that the right-hand side $b_1$ in the original LP of Example 1 is changed
to $b_1 = 3$, corresponding to the change $\Delta = -2$. From Table 1, $f$ is updated to
$(\frac{11}{3}, 27, 6)^T - 2\,(\frac{2}{3}, 3, 1)^T = (\frac{7}{3}, 21, 4)^T$, giving the optimal BFS $x^* = (\frac{7}{3}, 0, 4, 0, 21, 0)^T$.
Since $b = (3, 12, 2)^T$ the objective value found from Table 1 is $z = 23$. Notice that the
change in objective value is $\Delta z = 23 - 35 = -12$, which is the same as $-\Delta$ times the
reduced cost $-u_4$ of $x_4$: namely, $-12 = 2 \cdot (-6)$. To determine the range of variation
of $b_1$ so that the basis defined by $\Sigma_B = \{1,5,3\}$ remains unchanged, Table 2 is used.
Here $q = (\frac{2}{3}, 3, 1)^T$ and $\Delta_L = \max\left\{ -\frac{11/3}{2/3}, -\frac{27}{3}, -\frac{6}{1} \right\} = -\frac{11}{2}$. Thus $-\frac{11}{2} \le \Delta$ so that
the current basis is optimal for $b_1 \ge -\frac{1}{2}$.

3. If the new constraint $3x_1 + 2x_2 - x_3 \le 4$ is added to the LP in Example 1, the
(previous) optimal solution $x^* = (\frac{11}{3}, 0, 6, 0, 27, 0)^T$ is no longer feasible. Using Fact 4,
a new row and column are added to the tableau $T$, giving tableau (a) below. By
adding $(-3)$ times row 1 and $+1$ times row 3 to the last row, a new tableau (b) is
produced corresponding to the basic set $\Sigma_B = \{1, 5, 3, 7\}$. Since $f_4 < 0$ the dual simplex
algorithm is then used with $t^* = 4$ and $j^* = 4$, producing a new tableau that is primal
feasible, with the new optimal BFS $(3, 0, 5, 1, 24, 0, 0)^T$ and objective value 29.

tableau (a)

$$\begin{array}{rrrrrrr|r}
0 & 1 & 0 & 6 & 0 & \frac{5}{2} & 0 & 35 \\ \hline
1 & \frac{1}{3} & 0 & \frac{2}{3} & 0 & \frac{1}{6} & 0 & \frac{11}{3} \\
0 & 4 & 0 & 3 & 1 & 0 & 0 & 27 \\
0 & 1 & 1 & 1 & 0 & \frac{1}{2} & 0 & 6 \\
3 & 2 & -1 & 0 & 0 & 0 & 1 & 4
\end{array}$$

tableau (b)

$$\begin{array}{rrrrrrr|r}
0 & 1 & 0 & 6 & 0 & \frac{5}{2} & 0 & 35 \\ \hline
1 & \frac{1}{3} & 0 & \frac{2}{3} & 0 & \frac{1}{6} & 0 & \frac{11}{3} \\
0 & 4 & 0 & 3 & 1 & 0 & 0 & 27 \\
0 & 1 & 1 & 1 & 0 & \frac{1}{2} & 0 & 6 \\
0 & 2 & 0 & -1 & 0 & 0 & 1 & -1
\end{array}$$


4 . One example of the practical use of sensitivity analysis occurred in the airline indus- 
try. When the price of aviation fuel was relatively high, and varied by airport location, 
a linear programming model was successfully used to determine an optimal strategy for 
refueling aircraft. The key idea is that it might be more economical to take on extra fuel 
at an enroute stop if the fuel cost savings for the remainder of the flight are greater than 
the extra fuel burned because of the excess weight of additional fuel. A linear program- 
ming model of this situation ended up saving millions of dollars annually. An important 
feature was providing pilots with ranges of fuel prices for each airport location, with 
associated optimal policies for taking on extra fuel based on the cost range. 

5 . Another example of the beneficial use of sensitivity analysis occurred in a 1997 study 
to assess the effectiveness of mandatory minimum-length sentences for reducing drug 
use. One finding of the study was that the longer sentences become more effective than 
conventional enforcement only when it costs more than $30,000 to arrest a drug dealer. 
Thus, rather than producing a single optimal policy, this study identified conditions 
(parameter ranges) under which each alternative policy is to be preferred. 
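
The procedure of Fact 4 for adding a constraint, as carried out by hand in Example 3 above, can be sketched as follows. The function name `add_constraint` and the tableau representation (row 0 is $u \mid z$, rows $1, \ldots, m$ are $D \mid f$) are assumptions made for illustration.

```python
from fractions import Fraction as F

def add_constraint(T, phi, a_new, b_new):
    """Append the constraint a_new x <= b_new to a final tableau T.
    phi[t-1] is the 1-based index of the basic variable in row t.
    Returns the enlarged tableau and the enlarged basis list."""
    T = [[F(v) for v in row] for row in T]
    ncols = len(T[0]) - 1
    # New slack column (zero in the old rows), placed before the last column.
    T = [row[:-1] + [F(0)] + row[-1:] for row in T]
    # New row: coefficients of a_new, zeros for old slacks, 1 for new slack.
    coeffs = [F(v) for v in a_new] + [F(0)] * (ncols - len(a_new))
    new_row = coeffs + [F(1), F(b_new)]
    # Eliminate every basic variable from the new row.
    for t, j in enumerate(phi, start=1):
        d = new_row[j - 1]
        if d != 0:
            new_row = [v - d * w for v, w in zip(new_row, T[t])]
    return T + [new_row], list(phi) + [ncols + 1]

T_final = [[0, 1, 0, 6, 0, F(5, 2), 35],
           [1, F(1, 3), 0, F(2, 3), 0, F(1, 6), F(11, 3)],
           [0, 4, 0, 3, 1, 0, 27],
           [0, 1, 1, 1, 0, F(1, 2), 6]]
T_b, phi_b = add_constraint(T_final, [1, 5, 3], [3, 2, -1], 4)
```

The last row of `T_b` is the new row of tableau (b) in Example 3; its negative right-hand side signals that the dual simplex method is needed to reoptimize.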


15.1.7 GOAL PROGRAMMING 

Goal programming refers to a multicriteria decision-making problem in which a given 
LP problem can have multiple objectives or goals. This technique is useful when it is 
impossible to satisfy all goals simultaneously. For example, a model for optimizing the 
operation of an oil refinery might seek not only to minimize production cost, but also 
to reduce the amount of imported crude oil and the amount of oil having a high sulfur 
content. In another instance, the routing of hazardous waste might consider minimizing 
not only the total distance traveled but also the number of residents living within ten 
miles of the selected route. 
Definitions:

A goal programming (GP) problem has linear constraints that can be written

$Ax \le b$
$Hx + \underline{x} - \overline{x} = h$
$x \ge 0$, $\underline{x} \ge 0$, $\overline{x} \ge 0$

and objective functions

$G_1$: minimize $z_1 = c_1 \underline{x}_1 + d_1 \overline{x}_1$
$G_2$: minimize $z_2 = c_2 \underline{x}_2 + d_2 \overline{x}_2$
  $\vdots$
$G_\ell$: minimize $z_\ell = c_\ell \underline{x}_\ell + d_\ell \overline{x}_\ell$

where $A$ is an $m \times n$ matrix and $H$ is an $\ell \times n$ matrix.

The value $h_k$ is the target value of the $k$th goal. Goal $k$ is satisfied if $(Hx)_k = h_k$
holds for a given vector $x$ of decision variables.

The variables $\underline{x}$ are the underachievement variables while the variables $\overline{x}$ are the
overachievement variables.

Facts: 

1. In a GP problem, the aim is to find decision variables that approximately satisfy 
the given goals, which is achieved by jointly minimizing the magnitudes of the under- 
achievement and overachievement variables. 

2. Assuming $c_k > 0$ and $d_k > 0$, goal $k$ is satisfied by making $z_k = 0$.

3. If all $c_k$ and $d_k$ are positive, then for each $k = 1, \ldots, \ell$ at most one of $x_k^-, x_k^+$ will be
positive in an optimal solution.

4. One important case of a GP problem has $c_k = d_k = 1$ for $k = 1, \ldots, \ell$, making the
objective to (approximately) satisfy all constraints $Hx = h$.

5. When the relative importance of $G_1, \ldots, G_\ell$ is known precisely, an ordinary LP can
be used with the objective function being a weighted sum of $z_1, \ldots, z_\ell$.

6. Preemptive goal programming: Here the goals are prioritized $G_1 \succ G_2 \succ \cdots \succ G_\ell$,
meaning that goal $G_1$ is the most important and goal $G_\ell$ is the least important. Solutions
are sought that satisfy the most important goal. Among all such solutions, those are
retained that best satisfy the second highest goal, and so forth.

7. Goal programming simplex method: The simplex method (§15.1.3, Algorithm 2) 
can be extended to preemptive GP (minimization) problems, with the following modi- 
fications: 

• $\ell$ “objective rows” are maintained in the tableau instead of just one.

• Let $i^*$ be the highest-priority index with $z_{i^*} > 0$ for which there exists a nonbasic
column $j^*$ whose entry in objective row $i^*$ is positive and whose entry in every
(higher-priority) objective row $i < i^*$ is nonnegative. If there is no such $i^*$, stop;
else pick the $j^*$ whose entry in row $i^*$ is most positive.

• All $\ell$ objective rows are updated when a pivot is performed.

• At completion, if the solution fails to satisfy all goals, then every nonbasic variable
that would decrease the objective value $z_i$ if it entered the basis would
increase $z_{i'}$ for some higher-priority goal $G_{i'}$, $i' < i$.

8. Computer codes (in C, Pascal, and Fortran) that implement goal programming are
available at the sites:

• ftp://garbo.uwasa.fi/pc/ts/tslin35c.zip

• http://www.iiasa.ac.at/~marek/soft/descr.html#MCMA
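
The difference between the weighted approach of Fact 5 and the preemptive approach of Fact 6 can be seen on a toy instance. The following is a minimal brute-force sketch, not the simplex extension of Fact 7: it assumes a hypothetical problem with two decision variables, restricts attention to integer grid points so that every candidate can be enumerated, and uses made-up data ($H$, $h$, the constraint $x_1 + 2x_2 \le 8$, and the weights 1 and 2).

```python
from itertools import product

# Hypothetical toy data: two decision variables, one hard constraint, two goals.
H = [(3, 2), (0, 1)]   # goal rows: 3*x1 + 2*x2 and x2
h = [18, 3]            # target values h_k
c = [1, 1]             # penalties on underachievement x_k^-
d = [1, 1]             # penalties on overachievement x_k^+


def feasible(x):
    """Hard constraint Ax <= b, here x1 + 2*x2 <= 8 (x >= 0 holds on the grid)."""
    return x[0] + 2 * x[1] <= 8


def goal_values(x):
    """Return (z_1, ..., z_l) with z_k = c_k * x_k^- + d_k * x_k^+."""
    zs = []
    for row, target, ck, dk in zip(H, h, c, d):
        level = sum(a * xi for a, xi in zip(row, x))
        under = max(0, target - level)   # x_k^-
        over = max(0, level - target)    # x_k^+, never positive together with x_k^-
        zs.append(ck * under + dk * over)
    return tuple(zs)


candidates = [x for x in product(range(9), repeat=2) if feasible(x)]

# Preemptive GP: tuples compare lexicographically, so G_1 dominates G_2.
preemptive = min(candidates, key=goal_values)

# Weighted GP (Fact 5) with hypothetical weights 1 and 2 on z_1 and z_2.
weights = (1, 2)
weighted = min(
    candidates,
    key=lambda x: sum(w * z for w, z in zip(weights, goal_values(x))),
)

print(preemptive, goal_values(preemptive))
print(weighted, goal_values(weighted))
```

On this data the preemptive ordering selects $x = (6, 0)$, which meets the first goal exactly and misses the second by 3, whereas the weighted objective selects $x = (4, 2)$, which misses both goals slightly.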


15.1.8 INTEGER PROGRAMMING 

Integer programming problems are LPs in which some of the variables are constrained 
to be integers. Such problems more accurately model a wide range of application areas, 
including capital budgeting, facility location, manufacturing, scheduling, logical infer- 
ence, physics, engineering design, environmental economics, and VLSI circuit design. 
However, integer programming problems are much more difficult to solve than LPs. 

Definitions: 

Let $\mathcal{Z}^n$ [$\mathcal{Z}^n_+$] denote the set of all $n$-vectors with all components integers [nonnegative
integers], and let $\mathcal{R}^n$ [$\mathcal{R}^n_+$] denote the set of all $n$-vectors with all components real
numbers [nonnegative real numbers].

A pure integer programming (IP) problem is an optimization problem of the form

maximize: $z_{IP} = cx$
subject to: $Ax \le b$ (5)
$x \in \mathcal{Z}^n_+$

where $A$ is an $m \times n$ matrix, $b$ is an $m$-column vector, and $c$ is an $n$-row vector.

A 0-1 IP problem is an IP with each $x_j \in \{0, 1\}$.

A mixed integer programming (MIP) problem is of the form

maximize: $z_{MIP} = cx + hy$
subject to: $Ax + Gy \le b$
$x \in \mathcal{Z}^n_+$
$y \in \mathcal{R}^p_+$

where $A$ is an $m \times n$ matrix, $G$ is an $m \times p$ matrix, $b$ is an $m$-column vector, $c$ is an
$n$-row vector, and $h$ is a $p$-row vector.

For IP problem (5), the feasible region is $S = \{ x \in \mathcal{Z}^n_+ \mid Ax \le b \}$.

A polyhedron is a set of points in $\mathcal{R}^n$ satisfying a finite set of linear inequalities.

If $X$ is a finite set of points in $\mathcal{R}^n$, the convex hull of $X$ is

$\mathrm{conv}(X) = \{ \sum_i \lambda_i x_i \mid x_i \in X,\ \sum_i \lambda_i = 1,\ \lambda_i \ge 0 \}$.

The LP relaxation of (5) is the linear programming problem

maximize: $z_{LP} = cx$
subject to: $Ax \le b$ (6)
$x \ge 0$.

More generally, a relaxation of (5) is any problem $\max \{ cx \mid x \in T \}$, where $S \subseteq T$.

The problem $\max \{ cx \mid A'x \le b',\ x \in \mathcal{Z}^n_+ \}$ is a formulation for (5) if it contains exactly
the same set of feasible integral points as (5). If the feasible region of the LP relaxation
of one formulation is strictly contained in that of another, the first is a tighter formulation.

Algorithm 5: Cutting plane algorithm for (5).

input: IP in form (5)
output: an optimal solution $x^*$ with objective value $z^*$

let $R$ be the LP relaxation: $\max \{ cx \mid Ax \le b,\ x \ge 0 \}$
{Main loop}
optimally solve problem $R$, obtaining $\bar{x}$
if $\bar{x} \in \mathcal{Z}^n_+$ then stop with $x^* := \bar{x}$ and $z^* := c\bar{x}$
else
    find a cutting plane $\pi x \le \pi_0$ with $\pi \bar{x} > \pi_0$ and $\pi x \le \pi_0$ for all feasible solutions
    of (5)
    modify $R$ by adding the constraint $\pi x \le \pi_0$
{Continue with next iteration of main loop}


Suppose $\bar{x}$ is a feasible solution to (6) but not to (5). A cutting plane is any inequality
$\pi x \le \pi_0$ satisfied by all points in conv(S) but not by $\bar{x}$.

A family $\mathcal{S}$ of subsets of $S$ is a separation of $S$ if $\bigcup_{S_k \in \mathcal{S}} S_k = S$; a separation is usually
a partition of the set $S$.

A lower bound $\underline{z}$ for $z_{IP}$ is an underestimate of $z_{IP}$.

Facts: 

1. $z_{IP} \le z_{LP}$. More generally, any relaxation of (5) has an optimal objective value at
least as large as $z_{IP}$.

2. If $x'$ is feasible to (5) then $z' = cx'$ satisfies $z' \le z_{IP}$.

3. The feasible region of an LP problem is a polyhedron, and every polyhedron is the
feasible region of some LP problem.

4. The set conv(S) is a polyhedron, so there is an LP problem $\max \{ cx \mid A'x \le b',\ x \ge 0 \}$,
for a suitable matrix $A'$ and vector $b'$, with the feasible region conv(S).

5. An optimal solution to the LP in Fact 4 is an optimal solution to (5). However, 
finding all necessary constraints, called facets, of this LP is extremely difficult. 

6. IP is an NP-hard optimization problem (§16.5.2). Consequently, such problems are 
harder to solve in practice than LPs. The inherent complexity of solving IPs stems from 
the nonconvexity of their feasible region, which makes it difficult to verify the optimality 
of a proposed optimal solution in an efficient manner. 

7. Formulation of an IP is critical: achieving problem tightness is more important than 
reducing the number of constraints or variables appearing in the formulation. 

8. Solution techniques for (5) usually involve some preliminary operations that improve 
the formulation, called preprocessing, followed by an iterative use of heuristics (§10.7) 
to quickly find feasible solutions. 

9. Popular solution techniques for solving (5) include cutting plane methods (Fact 10), 
branch and bound techniques (Fact 12), and (hybrid) branch and cut methods. 

10. Cutting plane method: This approach (Algorithm 5) proceeds by first finding an
optimal solution $\bar{x}$ to a relaxation $R$ of the original problem (5). If $\bar{x}$ is not optimal for (5), a
cutting plane is added to the constraints of the current relaxation and the new LP is
then solved. This process is repeated until an optimal solution is found.

11. General methods for finding cutting planes for IP or MIP problems are relatively
slow. Cutting plane algorithms using facets for specific classes of IP problems are better,
since facets make the “deepest” cuts.

12. Branch and bound method: This approach (Algorithm 6) decomposes the original
problem $P$ into subproblems or nodes by breaking $S$ into subsets. Each subproblem $P_j$
is implicitly investigated (and possibly discarded) until an optimal one is found. In
this algorithm $z_j^*$ is the optimal value of problem $P_j$, $\bar{z}_j$ is the optimal value of the
relaxation $R_j$ of $P_j$, and $\underline{z}_j$ is the best known lower bound for $z_j^*$.

13. In Algorithm 6 the optimal value $\bar{z}_j$ of relaxation $R_j$ is an upper bound for $z_j^*$.
Also $\underline{z}_0$ is the objective function value for the best known feasible solution to (5).

14. LP relaxations are often used in the bounding portion of Algorithm 6.

15. There are specializations of Algorithm 6 for 0-1 IP problems and for MIP problems.

16. Branch and bound tends to be a computationally expensive solution method. Usually
it is applied only when other methods appear to be stalling.

17. In the survey [Fo97] of linear programming software, several of the packages listed 
will handle IP problems as well. When available, these extensions to binary and/or 
integer-valued variables are indicated by the survey. 

18. There are several commercial software packages that solve IP and MIP problems,
such as CPLEX, OSL, MIPIII, XPRESS-MP, XA, and LINDO:

• http://www.cplex.com/

• http://www.research.ibm.com/osl/

• http://www.ketronms.com/mipiii.html

• http://www.dash.co.uk/

• http://www.sunsetsoft.com/

• http://www.lindo.com/

19. An extensive tabulation of software packages to solve IP and MIP problems is
found at the site:

• http://www-c.mcs.anl.gov/home/otc/Guide/SoftwareGuide/Categories/intprog.html

20. Computer codes (in C and Pascal) for solving IP and MIP problems are available
at the sites:

• http://www.mpi-sb.mpg.de/~barth/opbdp/opbdp.html

• http://www.iiasa.ac.at/~marek/soft/descr.html#MOMIP

• http://www.netcologne.de/~nc-weidenma/readme.htm
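
The branch and bound idea of Fact 12 can be sketched on a 0-1 IP with a single constraint, the knapsack problem, whose LP relaxation can be solved greedily without an LP code. This is an illustrative sketch under stated assumptions, not the handbook's Algorithm 6 verbatim: the function name, the depth-first node selection, and the sample data are all hypothetical, and item weights are assumed positive.

```python
def knapsack_branch_and_bound(values, weights, capacity):
    """Maximize sum(values[i] * x_i) subject to sum(weights[i] * x_i) <= capacity, x_i in {0, 1}."""
    # Sort by value density so the LP relaxation can be solved greedily.
    order = sorted(range(len(values)), key=lambda i: values[i] / weights[i], reverse=True)
    v = [values[i] for i in order]
    w = [weights[i] for i in order]
    n = len(v)

    def relaxation_bound(k, value, room):
        """Upper bound: LP relaxation of the node in which items 0..k-1 are already fixed."""
        bound = value
        for i in range(k, n):
            if w[i] <= room:
                room -= w[i]
                bound += v[i]
            else:
                bound += v[i] * room / w[i]   # one fractional item
                break
        return bound

    best_value, best_choice = 0, []           # incumbent: the empty knapsack
    live = [(0, 0, capacity, [])]             # (next item, value, remaining room, chosen items)
    while live:                               # {branching}
        k, value, room, chosen = live.pop()
        if value > best_value:                # better feasible solution found
            best_value, best_choice = value, chosen
        # {bounding}: discard the node if its relaxation cannot beat the incumbent
        if k == n or relaxation_bound(k, value, room) <= best_value:
            continue
        # {separation}: split on x_k = 0 and x_k = 1
        live.append((k + 1, value, room, chosen))
        if w[k] <= room:
            live.append((k + 1, value + v[k], room - w[k], chosen + [order[k]]))
    return best_value, sorted(best_choice)


best_value, best_items = knapsack_branch_and_bound([60, 100, 120], [10, 20, 30], 50)
print(best_value, best_items)
```

The relaxation value plays the role of the upper bound $\bar{z}_j$ and `best_value` the role of the incumbent $\underline{z}_0$; a node is discarded as soon as its upper bound cannot exceed the incumbent.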

Examples: 

1. The following figure shows the convex hull of feasible solutions to the IP

maximize: $-x_1 + \frac{1}{2} x_2$
subject to: $x_2 \le 4$
$-x_1 - x_2 \le -\frac{5}{2}$
$8x_1 + x_2 \le 24$
$-3x_1 + 4x_2 \le 10$
$x_1, x_2 \ge 0$
$x_1, x_2$ integers.
Algorithm 6: Branch and bound algorithm for (5).

input: IP in form (5)
output: an optimal solution $x_0$ with objective value $\underline{z}_0$

let $P$ be the problem: $\max \{ cx \mid x \in S \}$
$P_0 := P$; $S_0 := S$; $\underline{z}_0 := -\infty$; $\bar{z}_0 := +\infty$
put $P_0$ on the list of live nodes
{branching}
    if no live node exists then go to {termination}
    else select a live node $P_j$
{bounding}
    solve a relaxation $R_j$ of $P_j$
    if $R_j$ is infeasible then discard $P_j$ and go to {branching}
    if $\bar{z}_j = +\infty$ then go to {separation}
    {$\bar{z}_j$ is finite}
    if $\bar{z}_j \le \underline{z}_0$ then discard node $P_j$ and go to {branching}
    if $\underline{z}_j = \bar{z}_j$ then update $\underline{z}_0 := \max \{ \underline{z}_0, \underline{z}_j \}$ and discard any node $P_i$ for
    which $\bar{z}_i \le \underline{z}_0$
{separation}
    choose a separation $S_j^*$ of $S_j$ forming new live nodes and go to {branching}
{termination}
    if $\underline{z}_0 = -\infty$ then problem (5) is infeasible
    if $\underline{z}_0$ is finite then $\underline{z}_0$ is the optimal objective value and the associated $x_0$ is
    an optimal solution


[Figure 3: feasible region of the LP relaxation and convex hull of feasible solutions to the IP.]


Here $S = \{(1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (2, 4), (3, 0)\}$. The optimal solution occurs
at $(1, 3)$, with $z_{IP} = \frac{1}{2}$. The feasible region of the LP relaxation is also shown in
Figure 3, with the optimal LP value $z_{LP} = \frac{5}{4}$ attained at $(0, \frac{5}{2})$; a cutting plane is shown
as the dashed line. For this problem, conv(S) is defined by the following constraints:

$-x_1 - x_2 \le -3$
$-x_1 + x_2 \le 2$
$-x_1 \le -1$
$4x_1 + x_2 \le 12$
$x_2 \le 4$
$x_1, x_2 \ge 0$.

All of these constraints are facets except for $x_2 \le 4$ and the nonnegativity constraints,
which are redundant.

2. The following IP has the feasible region $S = \{(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)\}$ and
the optimal solution occurs at $(4, 4)$ with $z_{IP} = 8$:

maximize: $x_1 + x_2$
subject to: $2x_1 - 2x_2 \le 1$
$-7x_1 + 8x_2 \le 4$
$x_1, x_2 \ge 0$
$x_1, x_2$ integers.

The LP relaxation has a feasible region defined by vertices $(0, 0)$, $(\frac{1}{2}, 0)$, $(0, \frac{1}{2})$, $(8, \frac{15}{2})$,
so its optimal solution occurs at $(8, \frac{15}{2})$ with $z_{LP} = \frac{31}{2}$. Consequently, the LP solution
is a poor approximation to the optimal IP solution. Moreover, simply rounding the LP
solution gives either $(8, 7)$ or $(8, 8)$, both of which are infeasible to the given IP problem.

3. The following IP can be solved using Algorithm 6:

maximize: $3x_1 + 3x_2 - 8x_3$
subject to: $-3x_1 + 6x_2 + 7x_3 \le 8$
$6x_1 - 3x_2 + 7x_3 \le 8$
$x_1, x_2, x_3 \ge 0$
$x_1, x_2, x_3$ integers.

The initial problem $P_0$ has an LP relaxation $R_0$ that is obtained by removing the integer
restrictions; solving this LP gives $x = (2.667, 2.667, 0)$ with $z = 16$. A separation is
achieved by creating the two subproblems $P_1$ and $P_2$; the constraint $x_1 \le 2$ is appended
to $P_0$ creating $P_1$ while the constraint $x_1 \ge 3$ is appended to $P_0$ creating $P_2$. Now
the live nodes are $P_1$ and $P_2$. Solving the LP relaxation $R_1$ gives $x = (2, 2.333, 0)$ with
$z = 13$. New subproblems $P_3$ and $P_4$ are obtained from $P_1$ by appending the constraints
$x_2 \le 2$ and $x_2 \ge 3$, respectively. Now the live nodes are subproblems $P_2, P_3, P_4$. The
LP relaxation $R_2$ of $P_2$ is infeasible, as is the LP relaxation $R_4$ of $P_4$. Solving the LP
relaxation $R_3$ gives the feasible integer solution $x = (2, 2, 0)$ with $z = 12$. Since there
are no more live nodes, this represents the optimal solution to the stated problem.

4. Fixed-charge problems: Find optimal levels of $n$ activities to satisfy $m$ constraints
while minimizing total cost. Each activity $j$ has per unit cost $c_j$. In addition, there is
a startup cost $d_j$ for certain undertaken activities $j$.

This problem can be modeled as a MIP problem, with a real variable $x_j$ for the
level of each activity $j$. If activity $j$ has a startup cost, introduce the additional 0-1
variable $y_j$, equal to 1 when $x_j > 0$ and 0 otherwise. For example, this condition can
be enforced by imposing the constraints $M_j y_j \ge x_j$, $y_j \in \{0, 1\}$, where $M_j$ is a known
upper bound on the value of $x_j$. The objective is then to minimize $z = cx + dy$.

5. Queens problem: On an $n \times n$ chessboard, the task is to place as many nontaking
queens as possible.

This problem can be formulated as a 0-1 IP problem, having binary variables $x_{ij}$.
Here $x_{ij} = 1$ if and only if a queen is placed in row $i$ and column $j$ of the chessboard.
The objective function is to maximize $z = \sum_i \sum_j x_{ij}$ and there is a constraint for
each row, column, and diagonal of the chessboard. Such a constraint has the form
$\sum_{(i,j) \in S} x_{ij} \le 1$, where $S$ is the set of entries in the row, column, or diagonal. For
example, one optimal solution of this IP for the $7 \times 7$ chessboard is the assignment
$x_{16} = x_{24} = x_{37} = x_{41} = x_{53} = x_{65} = x_{72} = 1$, with all other $x_{ij} = 0$.
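
The row, column, and diagonal constraints of the queens formulation can be checked mechanically for a proposed 0-1 assignment. The sketch below is only a feasibility check of the $7 \times 7$ assignment quoted above, with hypothetical function and variable names; it does not solve the IP. Optimality then follows because no placement can use more than one queen per row.

```python
def queens_constraints_ok(n, placed):
    """placed: set of 1-based (row, column) pairs with x_ij = 1."""
    counts = {}
    for i, j in placed:
        # Each queen contributes to one row, one column, and two diagonal constraints.
        for key in (("row", i), ("col", j), ("diag", i - j), ("anti", i + j)):
            counts[key] = counts.get(key, 0) + 1
    on_board = all(1 <= i <= n and 1 <= j <= n for i, j in placed)
    # Every constraint has the form: sum of x_ij over the line <= 1.
    return on_board and all(count <= 1 for count in counts.values())


solution = {(1, 6), (2, 4), (3, 7), (4, 1), (5, 3), (6, 5), (7, 2)}
objective = len(solution)   # z = sum of all x_ij
print(queens_constraints_ok(7, solution), objective)
```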
15.2 LOCATION THEORY


Location theory is concerned with locating a fixed number of facilities at points in some 
space. The facilities provide a service (or product) to the customers whose locations 
and levels of demand (for the service) are known. The object is to find locations for 
the facilities to optimize some specified criterion, e.g., the cost of providing the service. 
Interest in location theory has grown very rapidly because of its variety of applications 
to such fields as operations research, city planning, geography, economics, electrical 
engineering, and computer science. 


15.2.1 p-MEDIAN AND p-CENTER PROBLEMS 
Definitions: 

A metric space is a space $S$ consisting of a set of points with a real-valued function
$d(x, y)$ defined on all pairs of points $x, y \in S$ with the following properties (§6.1.4):

• $d(x, y) = d(y, x) \ge 0$ for all $x, y \in S$

• $d(x, y) = 0$ if and only if $x = y$

• $d(x, z) \le d(x, y) + d(y, z)$ for all $x, y, z \in S$.

The value $d(x, y)$ is called the distance between points $x, y \in S$.

There are $p$ facilities that are to be located at some set $X_p = \{x_1, x_2, \ldots, x_p\}$ of $p$ points
in the (metric) space $S$. The elements of $X_p$ are the facility locations.

The facilities are to provide a service to the customers whose positions are given by a
set $V = \{v_1, v_2, \ldots, v_n\}$ of $n$ points in $S$. The points in $V$ are the demand points and
the level of demand at $v_i \in V$ is given by $w(v_i) \ge 0$.

For $x \in S$ and $X_p \subseteq S$, let $d(x, X_p)$ be the minimum distance from $x$ to a point of $X_p$:

$d(x, X_p) = \min_{x_i \in X_p} \{ d(x, x_i) \}$.

Suppose $X_p$ is a candidate set of points in $S$ for locating the $p$ facilities. The following
two objective functions are defined on $X_p \subseteq S$:

• $F(X_p) = \sum_{i=1}^{n} w(v_i) d(v_i, X_p)$;

• $H(X_p) = \max_{1 \le i \le n} \{ w(v_i) d(v_i, X_p) \}$.

$X_p^m \subseteq S$ is a p-median if $F(X_p^m) \le F(X_p)$ for all possible $X_p \subseteq S$.

$X_p^c \subseteq S$ is a p-center if $H(X_p^c) \le H(X_p)$ for all possible $X_p \subseteq S$.

Facts: 

1. It is customary to assume that the demand at $v_i$ is satisfied by its closest facility.
Then $w(v_i)d(v_i, X_p)$ indicates the total (transportation) cost associated with having the
demand at $v_i \in V$ satisfied by its closest facility in the candidate set $X_p$.

2. $F(X_p)$ represents the total transportation cost of satisfying the demands if the
facilities are located at $X_p$.

3. $H(X_p)$ represents the cost (or unfairness) associated with a farthest demand point
not being in close proximity to any facility.

4. A p-median formulation is designed for locating $p$ facilities in a manner that minimizes
the average cost of serving the customers.

5. A p-center formulation is designed for locating $p$ emergency facilities (police, fire,
and ambulance services), in which the maximum time to respond to an emergency is to
be made as small as possible.

6. The p-median and p-center problems are only interesting if $p < n$; otherwise, it is
possible to locate at least one facility at each demand point, thereby reducing $F(X_p)$
or $H(X_p)$ to 0.

Examples: 

1. Suppose that a single warehouse is to be located in a way to service $n$ retail outlets
at minimum cost. Here $w(v_i)$ is the number of shipments made per week to the outlet
at location $v_i$. This can be modeled as a 1-median problem, since the objective is to
locate the single warehouse to minimize the total distance traveled by delivery vehicles.

2. A new police station is to be located within a portion of a city to serve residents of 
that area. Neighborhoods in that area can be taken as the demand points, and locating 
the police station can be formulated as a 1-center problem. Here, the maximum distance 
from the source of an emergency is critical so the police station should be located to 
minimize the maximum distance from a neighborhood. The weights at each demand 
point might be taken to be equal, or in some situations differing weights could signify 
conversion factors that translate distance into some other measure such as the value of 
residents’ time. 

3. Statistics: Suppose that $n$ given data values $x_1, x_2, \ldots, x_n$ are viewed as points
placed along the real line. If the distance between points $x_i$ and $x_j$ is their absolute
difference $|x_i - x_j|$, then a 1-median of this set of points (unweighted customer locations)
is a point (facility) $x$ that minimizes $\sum_{i=1}^{n} |x_i - x|$. In fact, $x$ corresponds to a median
of the $n$ data values. If the distance between points $x_i$ and $x_j$ is their squared difference
$(x_i - x_j)^2$, then a 1-median is a point $x$ minimizing $\sum_{i=1}^{n} (x_i - x)^2$, which is precisely
the mean of the $n$ data values. Alternatively, for either distance measure the 1-center of
this set of points turns out to correspond to the midrange of the data set: namely, the
1-center is the point located halfway between the largest and the smallest data values.
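
The correspondence between location objectives and summary statistics in Example 3 can be checked numerically. The sketch below assumes a hypothetical data set and searches a grid of candidate facility locations with step 0.1; it is a sanity check on one instance, not a proof.

```python
from statistics import mean, median

data = [1, 2, 4, 8, 10]   # hypothetical data values on the real line


def total_abs(x):
    """1-median objective under absolute-difference distance."""
    return sum(abs(xi - x) for xi in data)


def total_sq(x):
    """1-median objective under squared-difference distance."""
    return sum((xi - x) ** 2 for xi in data)


def worst_abs(x):
    """1-center objective: distance to the farthest data value."""
    return max(abs(xi - x) for xi in data)


grid = [i / 10 for i in range(0, 121)]   # candidate locations 0.0, 0.1, ..., 12.0
best_abs = min(grid, key=total_abs)
best_sq = min(grid, key=total_sq)
best_max = min(grid, key=worst_abs)
midrange = (min(data) + max(data)) / 2

print(best_abs, median(data))
print(best_sq, mean(data))
print(best_max, midrange)
```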


15.2.2 p-MEDIANS AND p-CENTERS ON NETWORKS 
Definitions: 

A network is a weighted graph $G = (V, E)$ with vertex set $V = \{v_1, v_2, \ldots, v_n\}$ and
edge set $E$, where $m = |E|$; see §8.1.1.

The weight of vertex $v \in V$ represents the demand at $v$ and is denoted by $w(v) \ge 0$.

The length of edge $e \in E$ represents the cost of travel (or distance) across $e$ and is
denoted by $\ell(e) > 0$. Each edge is assumed to be a line segment joining its end vertices.

A point on a network $G$ is any point along any edge of $G$. The precise location of
the point $x$ on edge $e = (u, v)$ is indicated by the distance of $x$ from $u$ or $v$.

If x and y are any two points on G , the distance d(x, y) is the length of a shortest path 
between x and y in G, where the length of a path is the sum of the lengths of the edges 
(or partial edges) in the path. 


© 2000 by CRC Press LLC 



Facts: 

1. Network G with the above definition of distance constitutes a metric space (§15.2.1). 

2. p-median theorem: Given a positive integer $p$ and a network $G = (V, E)$, there
exists a set of $p$ vertices $V_p \subseteq V$ such that $F(V_p) \le F(X_p)$ for all possible sets $X_p$ of $p$
points on $G$. That is, a p-median can always be found that consists entirely of vertices.

3. A p-median of a network $G$ can be found by a finite search through all possible $\binom{n}{p}$
choices of $p$ vertices out of $n$. This is still a formidable task, but if $p$ is a small number
(say $p < 5$) it is certainly manageable.

4. The p-median theorem also holds if the cost of satisfying the demand at $v_i \in V$
is $f_i(d(v_i, X_p))$, instead of $w(v_i) \cdot d(v_i, X_p)$, provided that $f_i : \mathcal{R}_+ \to \mathcal{R}_+$ is a concave
nondecreasing function for all $i = 1, 2, \ldots, n$. ($\mathcal{R}_+$ denotes the set of nonnegative
real numbers.) In this case, the objective function for the p-median problem becomes
$F(X_p) = \sum_{i=1}^{n} f_i(d(v_i, X_p))$.

5. Each point $x$ of a p-center $X_p^c$ of network $G = (V, E)$ is a point on some edge $e$ such
that for some pair of distinct vertices $u, v \in V$, $w(u)d(u, x) = w(v)d(v, x)$; i.e., the
point $x$ is the “center” of a shortest path from $u$ to $v$ in $G$ that passes through edge $e$.

6. There are at most $n^2$ predetermined choices of points on each edge of $G$ that could
be potential points in $X_p^c$; thus there are $n^2 m$ predetermined choices for points in $X_p^c$.

7. A p-center $X_p^c$ of network $G$ can be found by examining all possible $\binom{n^2 m}{p}$ choices
of $p$ points out of $n^2 m$. Even for small values of $p$, this is a formidable task.
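
The finite search of Fact 3 is short to write down once all vertex-to-vertex distances are known. The following is a minimal sketch under stated assumptions: the function names and the four-vertex sample network are hypothetical, distances are computed with the Floyd-Warshall method (a choice made here, not prescribed by the text), and the search enumerates every set of $p$ vertices, so it is practical only for small $p$.

```python
from itertools import combinations


def all_pairs_shortest_paths(n, edges):
    """Floyd-Warshall on an undirected network; edges are (u, v, length) triples."""
    inf = float("inf")
    d = [[0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, length in edges:
        d[u][v] = d[v][u] = min(d[u][v], length)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def vertex_p_median(n, edges, weights, p):
    """Return (best vertex set, F value) over all choices of p vertices."""
    d = all_pairs_shortest_paths(n, edges)

    def cost(subset):
        # F(X_p): each demand is served by its closest facility.
        return sum(w * min(d[i][x] for x in subset) for i, w in enumerate(weights))

    best = min(combinations(range(n), p), key=cost)
    return best, cost(best)


# Hypothetical network: a triangle on vertices 0, 1, 2 with a pendant vertex 3.
edges = [(0, 1, 2), (1, 2, 2), (0, 2, 3), (2, 3, 4)]
weights = [3, 1, 2, 4]
one_median = vertex_p_median(4, edges, weights, 1)
two_median = vertex_p_median(4, edges, weights, 2)
print(one_median, two_median)
```

By the p-median theorem (Fact 2), restricting the search to vertices loses nothing for the p-median objective; the same shortcut is not valid for p-centers.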

Examples: 

1. In the following network the levels of demand are given at the vertices and the
lengths of the edges are shown on the edges. The 1-median of this network is at the
vertex labeled $x_1$. The total transportation cost is $F(\{x_1\}) = 35$.



2 . A tree network T is shown in the following figure, with the vertex demands and edge 
lengths displayed. 



A 2-median of $T$ is the set of vertices $X_2 = \{x_1, x_2\}$, with total transportation cost
$F(X_2) = 25$. If $x_1$ is kept fixed and $t$ is an arbitrary point along edge $e$, then $\{x_1, t\}$
also constitutes a 2-median of $T$. This is consistent with Fact 2, which only states that
there is a p-median that is a subset of $V$, not that every p-median occurs in this way.

3. A 1-center $X_1^c$ is found for the network in part (a) of the following figure. For
illustration, suppose that the 1-center is along the edge $(u_1, u_2)$, thereby limiting the
search to candidate points on this edge. Let $X(x)$ be an arbitrary point along edge
$(u_1, u_2)$, parametrized by the scalar $x = d(u_1, X(x))$. Note that $x \in [0, 3]$ with $X(0) = u_1$
and $X(3) = u_2$.

Part (b) of the figure shows plots of $w(u_i)d(u_i, X(x))$ as a function of $x$ for $i =
1, 2, \ldots, 4$. The plot of $D(x) = \max_{1 \le i \le 4} w(u_i)d(u_i, X(x))$ is indicated in bold and $D(x)$
assumes its minimum value when x = The 1-center of the network in part (a) is 
then located along edge ( 111 , 112 ) a distance of | from u\. Note that for this value 
of x, w(u 4 )d(u 4 , X(x)) = w(u 3 )d(u 3 ,X(x)), consistent with Fact 5. In general, X f of a 
network is not necessarily a unique point; however, here X f is unique and H(X f) = '1? . 



(b) Plots of $w(u_i)d(u_i, X(x))$ for $1 \le i \le 4$.

4 . Transportation planners are trying to decide where to locate a single school bus stop 
along a major highway. Situated along a one-mile stretch of the highway are 8 commu- 
nities. The table below gives the number of school-age students in each community who 
ride the bus on a daily basis. The distance of each community from the westernmost 
edge of the one-mile stretch of highway is also shown.

community        1    2    3    4    5    6    7    8
# students       9    4    8   11    5    3    5   11
distance (mi)   0.0  0.2  0.3  0.4  0.6  0.7  0.9  1.0

The data of this problem can be represented by an undirected path (§8.1.3) with each
vertex $v$ corresponding to a community and $w(v)$ being the number of students from that
community riding the bus. Edges join adjacent communities and have a length given
by the difference in distance entries from the table. To minimize the total (weighted)
distance traveled by the students, a 1-median is sought. By Fact 2, only vertex locations
need to be considered. For example, situating the bus stop at vertex 3 incurs a cost of
F({3}) = 9(0.3) + 4(0.1) + 8(0) + 11(0.1) + 5(0.3) + 3(0.4) + 5(0.6) + 11(0.7) = 17.6. The
minimum cost is incurred for vertex 4, with F({4}) = 16.2, so that the bus stop should
be located at community 4.
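
The vertex-by-vertex comparison in this example is easy to reproduce. The sketch below uses the student counts and distances from the table; distances are stored in tenths of a mile so that the arithmetic stays in exact integers, and the variable names are hypothetical.

```python
students = [9, 4, 8, 11, 5, 3, 5, 11]
position_tenths = [0, 2, 3, 4, 6, 7, 9, 10]   # distance from the western edge, in tenths of a mile


def cost_tenths(k):
    """F({k}) in tenths of a student-mile for a bus stop at community k (0-based index)."""
    stop = position_tenths[k]
    return sum(w * abs(pos - stop) for w, pos in zip(students, position_tenths))


costs = [cost_tenths(k) for k in range(len(students))]
best = min(range(len(students)), key=costs.__getitem__)

for k, cost in enumerate(costs):
    print(f"community {k + 1}: F = {cost / 10}")
print("best community:", best + 1)
```

Community 3 reproduces the value 17.6 computed above, and the minimum is attained at community 4.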


15.2.3 ALGORITHMS FOR LOCATION ON NETWORKS 

Algorithms for finding p-medians and p-centers of a network G can be devised that are 
feasible for small values of p. For general p, however, there are no efficient methods 
known for arbitrary networks G. Specialized (and efficient) algorithms are available 
when G is a tree. 

Definitions: 

Let T be a tree (§9.1.1) with vertex weights w(v). 

If $T'$ is a subnetwork of $T$, define the total weight of $T'$ by $W(T') = \sum_{v \in V(T')} w(v)$.

For $v \in V(T)$, let $T_{v_1}, T_{v_2}, \ldots, T_{v_{d(v)}}$ be the components of $T - v$, where $d(v)$ is the
degree (§8.1.1) of vertex $v$. Define $M_v = \max_{1 \le i \le d(v)} \{ W(T_{v_i}) \}$.

A leaf vertex of T is a vertex of degree 1. 

Facts: 

1. The fastest known algorithms for the p-median and p-center problems on a network
$G = (V, E)$ with $n$ vertices and $m$ edges have worst-case complexities $O(n^{p+1})$ and
$O(m^p n^p (\log n)^2)$, respectively [Ta88].

2. If $p$ is an independent input variable (i.e., $p$ could grow with $n$), then both the
p-center and p-median problems are NP-hard [KaHa79a, KaHa79b]. Thus it is highly
unlikely that an algorithm will be found with running time polynomial in $n$, $m$, and $p$.

3. Considerable success has been reported in solving large p-median problems by heuristic
methods that do not necessarily guarantee optimal solutions. The best known such
procedure is a dual-based integer programming approach due to Erlenkotter [Er78].

4. Algorithms of complexities $O(n^2 p)$ and $O(n \log n)$ for the p-median and p-center
problems on tree networks have been reported by Tamir [Ta96] and by Frederickson
and Johnson [MiFr90, Chapter 7].

5. Vertex $u$ is a 1-median of a tree network $T$ if and only if $M_u \le \frac{1}{2} W(T)$.

6. 1-median of a tree: This algorithm (Algorithm 1) is based on Fact 5. The main
idea is to repeatedly remove a leaf vertex, confining the problem to a smaller tree $T'$.

7. Algorithm 1 can be implemented to run in $O(n)$ time.

8. Let $T$ be a tree network with $w(v) = c$ for all $v \in V(T)$. Then the 1-center of $T$ is
the unique middle point of a longest path in $T$.

9. Select any vertex $v_0$ in a tree network $T$. Let $v_1$ be a farthest vertex from $v_0$, and
let $v_2$ be a farthest vertex from $v_1$. Then the path from $v_1$ to $v_2$ is a longest path in $T$.

Algorithm 1: 1-median of a tree.

input: tree $T$
output: 1-median $v$

$T' := T$; $W_0 := \sum_{v \in V(T')} w(v)$; $W(v) := w(v)$ for each $v \in V(T')$
{Main loop}
if $T'$ consists of a single vertex $v$ then stop
else
    $v$ := a leaf vertex of $T'$
    if $W(v) \ge \frac{1}{2} W_0$ then stop
    else
        $u$ := the vertex adjacent to $v$ in $T'$
        $W(u) := W(u) + W(v)$
        $T' := T' - v$
{Continue with next iteration of main loop}


Algorithm 2: 1-center of an unweighted tree.

input: tree $T$ with $w(v) = c$ for all $v \in V(T)$
output: 1-center $x$

find a longest path $P$ in $T$ (using Fact 9)
let $u_1$ and $u_2$ be the end vertices of $P$
find the middle point of this path: i.e., the point $x$ such that $d(x, u_1) = d(x, u_2)$


10. 1-center of a tree: This algorithm (Algorithm 2) applies to “unweighted” trees, in
which there are identical weights at each vertex. It is based on Facts 8 and 9.

11. Algorithm 2 can be implemented to run in O(n) time. 
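
Algorithm 1 translates almost line by line into code. The sketch below is one possible implementation, assuming the tree is given as a hypothetical adjacency dictionary (vertex to set of neighbors) and a weight dictionary; keeping a work list of current leaves gives the linear running time mentioned in Fact 7. Edge lengths are not needed, since the test of Fact 5 involves only weights.

```python
def tree_one_median(adj, weight):
    """Return a 1-median of a tree by repeated leaf removal (Algorithm 1)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy, T' := T
    W = dict(weight)                                  # W(v) := w(v)
    total = sum(W.values())                           # W_0
    leaves = [v for v in adj if len(adj[v]) <= 1]
    while True:
        v = leaves.pop()
        # Stop if T' is a single vertex or the leaf carries at least half the weight.
        if not adj[v] or 2 * W[v] >= total:
            return v
        (u,) = adj[v]                                 # the vertex adjacent to v in T'
        W[u] += W[v]
        adj[u].discard(v)                             # T' := T' - v
        del adj[v]
        if len(adj[u]) == 1:                          # u has just become a leaf
            leaves.append(u)


# Hypothetical test tree: a path on 8 vertices with the given weights.
path_adj = {i: {j for j in (i - 1, i + 1) if 1 <= j <= 8} for i in range(1, 9)}
path_weight = dict(zip(range(1, 9), [9, 4, 8, 11, 5, 3, 5, 11]))
print(tree_one_median(path_adj, path_weight))
```

On the weighted path used here the procedure stops at vertex 4, whose two remaining components have weights 21 and 24, both at most half of the total weight 56, as Fact 5 requires.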

Examples: 

1. Suppose that the vertices of the tree $T$ in Figure 2 (§15.2.2) are labeled $v_1, v_2, \ldots, v_8$
in order from top to bottom and left to right at each height. Algorithm 1 can be applied
to find the 1-median of $T$. First, the leaf vertex $v_1$ is selected and since $W(v_1) = 1$ is
less than $\frac{1}{2} W_0 = \frac{15}{2}$, its weight is added to vertex $v_3$. The following table shows the
progress of the algorithm, which eventually identifies vertex $v_4$ as the 1-median of $T$.
As guaranteed by Fact 5, $M_{v_4} = \max\{6, 1, 3, 4\} \le \frac{15}{2}$.

                      W(v_i) for i =
iteration   v      1    2    3    4    5    6    7    8
    0       -      1    1    1    1    3    4    2    2
    1       v_1    -    1    2    1    3    4    2    2
    2       v_2    -    -    2    2    3    4    2    2
    3       v_5    -    -    2    5    -    4    2    2
    4       v_6    -    -    6    5    -    -    2    2
    5       v_3    -    -    -   11    -    -    2    2

2. Let the vertices of the tree $T$ in the figure of §15.2.2 Example 2 be labeled as in
Example 1 of this section. Algorithm 2 can be applied to find the 1-center of $T$, with
all vertex weights being 1. First, select $v_1$ and find a farthest vertex from it, namely $v_5$.
A farthest vertex from $v_5$ is then $v_8$, giving a longest path $P = [v_5, v_4, v_7, v_8]$ in $T$. The
midpoint x of P, located \ unit from along edge (i> 4 ,i> 5 ), is then the 1-center of T. 
If instead the longest path Q = [i^, vg, Up U 5 ] in T had been identified, then the same 
midpoint x would be found. 
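
The two farthest-vertex searches of Fact 9 followed by the midpoint computation of Algorithm 2 can be sketched as follows. The representation is an assumption made for illustration: the tree is a hypothetical adjacency dictionary mapping each vertex to a list of (neighbor, edge length) pairs with positive lengths, and the result is reported as a triple (u, v, t) meaning the point on edge (u, v) at distance t from u.

```python
def farthest_from(adj, start):
    """Tree search from start; returns (a farthest vertex, distances, parents)."""
    dist = {start: 0}
    parent = {start: None}
    stack = [start]
    while stack:
        v = stack.pop()
        for u, length in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + length
                parent[u] = v
                stack.append(u)
    return max(dist, key=dist.get), dist, parent


def tree_one_center(adj):
    """1-center of an unweighted tree: midpoint of a longest path (Algorithm 2)."""
    v0 = next(iter(adj))
    v1, _, _ = farthest_from(adj, v0)          # farthest from an arbitrary vertex
    v2, dist, parent = farthest_from(adj, v1)  # v1 .. v2 is a longest path (Fact 9)
    half = dist[v2] / 2
    if half == 0:                              # single-vertex tree
        return (v1, v1, 0.0)
    v = v2
    while dist[parent[v]] > half:              # walk back toward v1 past the midpoint
        v = parent[v]
    u = parent[v]
    return (u, v, half - dist[u])              # point on edge (u, v), measured from u


# Hypothetical path a - b - c - d with edge lengths 1, 2, 3.
path = {
    "a": [("b", 1)],
    "b": [("a", 1), ("c", 2)],
    "c": [("b", 2), ("d", 3)],
    "d": [("c", 3)],
}
center = tree_one_center(path)
print(center)
```

For the sample path the longest path has length 6, so the midpoint lies 3 units from either end, which is exactly vertex c; the function reports it as offset 0 along edge (c, b).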


15.2.4 CAPACITATED LOCATION PROBLEMS 
Definitions: 

Let $X_p = \{x_1, \ldots, x_p\}$ be a set of locations for $p$ facilities in the metric space $S$ with $n$
demand points $V = \{v_1, \ldots, v_n\} \subseteq S$ where $w(v_i) \ge 0$ is the demand at $v_i \in V$.

For each $v_i \in V$ and $x_j \in X_p$, let $w(v_i, x_j)$ be the portion of the demand at $v_i$ satisfied
by the facility at $x_j$.

Let $W(x_j)$ be the sum of the demands satisfied by (or allocated to) the facility at $x_j$. In
a capacitated location problem, upper (and/or lower) bounds are placed on $W(x_j)$.

Given the positive integer $p$ and positive constant $a$, two versions of the capacitated
p-median (CPM) problem in network $G = (V, E)$ are defined:

(a) Find a set of locations $X_p$ such that $F(X_p) = \sum_{v_i \in V} w(v_i) d(v_i, X_p)$ is minimized
subject to $W(x_j) \le a$ for all $x_j \in X_p$. Here it is assumed that the
demands are satisfied by their closest facility, and in the case of ties, a demand,
say at $v$, may be allocated in an arbitrary way among the closest facilities to $v$.

(b) Find $X_p$ and $\{ w(v_i, x_j) \mid v_i \in V \text{ and } x_j \in X_p \}$ to minimize

$\sum_{j=1}^{p} \sum_{i=1}^{n} w(v_i, x_j) d(v_i, x_j)$

subject to

$\sum_{i=1}^{n} w(v_i, x_j) \le a$, $j = 1, 2, \ldots, p$
$\sum_{j=1}^{p} w(v_i, x_j) = w(v_i)$, $i = 1, 2, \ldots, n$.


Facts: 

1. Capacitated facility location problems occur in several applied settings, including: 

• the location of manufacturing plants (with limited output) to serve customers; 

• the location of landfills (with limited capacity), which receive solid waste from 

the members of a community; 

• the location of concentrators in a telecommunication network, where each con- 

centrator bundles messages received from individual users and can handle only 
a certain amount of total message traffic. 

2. Version (a) of CPM may not have a solution if a is too small. 

3. If a is sufficiently large in version (a) and CPM has a solution, there may not exist 
a solution consisting entirely of vertices of G. See Example 1. 

4. Version (b) of the CPM has a solution consisting entirely of vertices of $G$. This was
shown by J. Levy; see [HaMi79].

Examples:

1. Suppose $p = 2$ and $a = 3$ in the following network $G$. A solution to version (a) of
the CPM problem consists of the points $X_2 = \{x_1, x_2\}$ with $W(x_1) = W(x_2) = 3$ and
$F(X_2) = 9$. It is easy to see that the choice of any two vertices for $X_2$ would violate
the allocation constraint to one facility.



2. If $p = 2$ and $a = 3$ for the network $G$ of the figure of Example 1, then version (b)
of the CPM problem has a solution containing only vertices of $G$. Suppose that the
top two vertices in the figure are $v_1$ and $v_2$ (from left to right) and the bottom two
vertices are $v_3$ and $v_4$. Then $X_2 = \{v_1, v_2\}$ is an optimal solution, where all the demand
$w(v_3) = 1$ is allocated to $v_1$ whereas the demand $w(v_4) = 2$ is equally split between $v_1$
and $v_2$. (Here, not all demand from $v_4$ is sent to its closest facility $v_2$.) In this solution,
$W(v_1) = W(v_2) = 3$ and $F(X_2) = 7$.


15.2.5 FACILITIES IN THE PLANE 

The p-median and p-center problems can be defined in the plane $\mathcal{R}^2$. Several measures
of distance are commonly considered for these location problems in the plane.

Definitions:

Let $x = (x_1, x_2)$ and $y = (y_1, y_2)$ be points of $S = \mathcal{R}^2$.

The Euclidean ($\ell_2$) distance between $x$ and $y$ is $d(x, y) = [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2}$.

The rectilinear ($\ell_1$) distance between $x$ and $y$ is $d(x, y) = |x_1 - y_1| + |x_2 - y_2|$.

The (generalized) Weber problem is the p-median problem in $\mathcal{R}^2$ with $\ell_2$ as the
measure of distance.

The unweighted Euclidean 1-center problem is the 1-center problem in $\mathcal{R}^2$ with
the $\ell_2$ measure of distance and with $w(v_i) = c$ for all $v_i \in V$.


Facts: 

1. No polynomial-time algorithm for the Weber problem, even when p = 1, has been 
discovered. 

2. In practice, an iterative method due to Weiszfeld [FrMcWh92] has been shown to
be highly successful for the Weber problem with p = 1.

3. The p-median and p-center problems in $\mathcal{R}^2$ with either $\ell_1$ or $\ell_2$ as the measure of
distance have been proven to be NP-hard if p is an input variable [MeSu84].

4. The unweighted Euclidean 1-center problem is equivalent to finding the center of
the smallest (radius) circle that encloses all points in V.

5. The following table provides a summary of time complexity results of the best known
algorithms for location problems in the plane [Me83, MeSu84].



             p arbitrary                            p = 1

p-median     NP-hard if p is an input variable      unknown complexity even when p = 1
             under $\ell_1$ or $\ell_2$                   under $\ell_1$ or $\ell_2$
             unknown complexity if p is fixed

p-center     NP-hard if p is an input variable      $O(n \log^2 n)$ under $\ell_2$
             under $\ell_1$ or $\ell_2$                   $O(n)$ under $\ell_2$ in unweighted case
             unknown complexity for fixed p > 1     $O(n)$ under $\ell_1$ for both the weighted
                                                    and unweighted cases


Examples: 

1. The floor plan of a factory contains existing machines A, B, C at the coordinate
locations $a = (0,4)$, $b = (2,0)$, $c = (5,2)$. A new central storeroom, to house materials
needed by the machines, is to be placed at some location $x = (x_1, x_2)$ on the factory floor.
Because the aisles of the factory floor run north-south and east-west, transportation
between the storeroom and the machines must take place along these perpendicular
directions. For example, the distance between the storeroom and the machine C is
$|x_1 - 5| + |x_2 - 2|$. Management wants to locate the storeroom so that the weighted sum
of distances between the new storeroom and each machine is minimized, taking into
account that the demand for material by machine A is twice the demand by machine B,
and demand for material by machine C is three times that by machine B. This is a
weighted 1-median problem in the plane with the $\ell_1$ measure of distance. The point (3, 2)
is an optimal location for the storeroom. In fact, for any $2 \le u \le 5$ the point $(u, 2)$ is
also an optimal location.
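
The rectilinear objective of Example 1 separates into one weighted-median problem per coordinate, which is one standard way to check the stated optimum. The following is a minimal sketch under that assumption; the function names are illustrative and do not come from the handbook, and positive weights are assumed.

```python
from typing import Sequence, Tuple


def weighted_median_interval(values: Sequence[float],
                             weights: Sequence[float]) -> Tuple[float, float]:
    """Return the interval [lo, hi] of minimizers of sum_i w_i * |t - values[i]|."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    cum = 0.0
    for idx, (v, w) in enumerate(pairs):
        cum += w
        if cum >= half:
            # If exactly half the weight lies at or below v, every point up to
            # the next distinct value is also a minimizer.
            if cum == half and idx + 1 < len(pairs):
                return v, pairs[idx + 1][0]
            return v, v
    raise ValueError("weights must be positive")


def rectilinear_cost(x: Tuple[float, float],
                     points: Sequence[Tuple[float, float]],
                     weights: Sequence[float]) -> float:
    """Weighted sum of l1 distances from x to the given points."""
    return sum(w * (abs(x[0] - px) + abs(x[1] - py))
               for (px, py), w in zip(points, weights))


# Data of Example 1: machines A, B, C with demands in the ratio 2 : 1 : 3.
machines = [(0, 4), (2, 0), (5, 2)]
demand = [2, 1, 3]
x1_range = weighted_median_interval([p[0] for p in machines], demand)
x2_range = weighted_median_interval([p[1] for p in machines], demand)
```

For these data the sketch returns the interval [2, 5] for the first coordinate and the single value 2 for the second, in agreement with the optimal points (u, 2) of Example 1.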

2. Suppose that the $\ell_2$ distance measure is used instead in Example 1. Then a weighted
1-median is a location $x = (x_1, x_2)$ that minimizes

$2\sqrt{x_1^2 + (x_2 - 4)^2} + \sqrt{(x_1 - 2)^2 + x_2^2} + 3\sqrt{(x_1 - 5)^2 + (x_2 - 2)^2}$.

The minimizing point in this case is (5,2), which is the unique optimal location for the
storeroom. On the other hand, if the demands for material are the same for all three
machines, then the unweighted 1-median occurs at the unique location (2.427, 1.403).
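
The iterative method of Weiszfeld mentioned in Fact 2 can be sketched as a fixed-point iteration in which each new iterate is an average of the data points weighted by w_i divided by the current distance. The code below is a minimal sketch only: the starting point (the weighted centroid), the iteration count, and the early return when an iterate lands on a data point are simplifying assumptions, not part of the handbook.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def weber_cost(x: Point, points: Sequence[Point], weights: Sequence[float]) -> float:
    """Weighted sum of Euclidean distances from x to the given points."""
    return sum(w * math.hypot(x[0] - px, x[1] - py)
               for (px, py), w in zip(points, weights))


def weiszfeld(points: Sequence[Point], weights: Sequence[float],
              iterations: int = 1000, eps: float = 1e-12) -> Point:
    """Approximate a weighted Euclidean 1-median by Weiszfeld iterations."""
    total = sum(weights)
    # Assumed starting point: the weighted centroid of the data.
    x = sum(w * p[0] for p, w in zip(points, weights)) / total
    y = sum(w * p[1] for p, w in zip(points, weights)) / total
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            d = math.hypot(x - px, y - py)
            if d < eps:
                # Simplification: stop if the iterate coincides with a data point.
                return (px, py)
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        x, y = num_x / denom, num_y / denom
    return (x, y)


machines = [(0.0, 4.0), (2.0, 0.0), (5.0, 2.0)]
unweighted = weiszfeld(machines, [1.0, 1.0, 1.0])
```

With equal demands the iterates settle near the location (2.427, 1.403) reported in Example 2. When the optimum coincides with a data point, as in the weighted case of Example 2, practical implementations usually add an explicit test for optimality at the data points rather than rely on the plain iteration.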


15.2.6 OBNOXIOUS FACILITIES 

In the preceding subsections, it has been assumed that the consumers at $v_i$ wish to be
as close as possible to a facility. That is, the facilities are desirable. In contrast, this
subsection discusses location problems where the facilities are undesirable or obnoxious.

Definitions: 

For $v_i \in V$, $w(v_i)d(v_i, X_p)$ represents the utility (in contrast to cost) associated with
having an obnoxious facility located at distance $d(v_i, X_p)$ from $v_i$.

The following two obnoxious facility location problems are defined:

(a) find $X_p \subseteq S$ to maximize $F(X_p) = \sum_{i=1}^{n} w(v_i)d(v_i, X_p)$;

(b) find $X_p \subseteq S$ to maximize $G(X_p) = \min_{1 \le i \le n} w(v_i)d(v_i, X_p)$.

If the space $S$ is a network $G = (V, E)$, for each edge $e = (u,v) \in E$, let $x(e) = x$ be the
point on $e$ such that $w(u)d(u,x) = w(v)d(v,x) = w(e)$.

Facts: 

1. If S is a network, problem (a) may not have a solution that is a subset of vertices 
(see Example 2). 

2. Suppose $S$ is a network $G$ with $w(v_i) > 0$ for all $v_i \in V$; further assume that at most
one point of $X_p$ can be on any particular edge. Renumber the $m$ edges of $G$ so that
$w(e_1) \ge w(e_2) \ge \cdots \ge w(e_m)$. Then $x(e_1), x(e_2), \ldots, x(e_p)$ is a solution to problem (b).

3. Additional results on this subject can be found in [BrCh89]. 
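
The recipe of Fact 2 can be sketched in code. The sketch assumes that each edge is itself a shortest path between its endpoints, so that distances from an interior point to the endpoints are measured along the edge; under that assumption a point at distance t from u on an edge of length l satisfies w(u)t = w(v)(l - t), giving t = w(v)l/(w(u) + w(v)) and w(e) = w(u)w(v)l/(w(u) + w(v)). The edge-length argument and all function names are hypothetical, and integer (or Fraction) data are assumed.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

Edge = Tuple[str, str, int]  # (u, v, length); the length field is an assumed input


def balance_point(w_u: int, w_v: int, length: int) -> Tuple[Fraction, Fraction]:
    """Return (d(u, x), w(e)) for the point x on the edge with w(u)d(u,x) = w(v)d(v,x)."""
    d_u = Fraction(w_v * length, w_u + w_v)
    return d_u, w_u * d_u


def obnoxious_sites(weights: Dict[str, int], edges: List[Edge],
                    p: int) -> List[Tuple[Fraction, str, str, Fraction]]:
    """Return the p edges with largest w(e), each as (w(e), u, v, d(u, x(e)))."""
    scored = []
    for u, v, length in edges:
        d_u, w_e = balance_point(weights[u], weights[v], length)
        scored.append((w_e, u, v, d_u))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:p]


# Hypothetical path network a - b - c, used only for illustration.
demo_weights = {"a": 1, "b": 2, "c": 1}
demo_edges = [("a", "b", 3), ("b", "c", 6)]
demo_sites = obnoxious_sites(demo_weights, demo_edges, 1)
```

In the illustrative network the selected point lies on edge (b, c) at distance 2 from b, where both weighted distances equal 4.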

Examples: 

1. In the location of obnoxious facilities, the distance to a closest facility is to be made 
as large as possible. This type of problem arises in siting nuclear power plants, sewage 
treatment facilities, and landfills, for example. 

2. In the following network, a solution $X_1$ to problem (a) when p = 1 is the midpoint
of any edge, and $F(X_1) = 5$. If the facility is located at any vertex $v$ then $F(\{v\}) = 4$.



15.2.7 EQUITABLE LOCATIONS 

The p-median problem is a widely used model for locating public or private facilities.
However, it may leave some demand points (communities) too far from their closest 
facility and thus be perceived as inequitable. To remedy this situation, the p-median 
problem can be modified in several ways. 

Definitions: 

Suppose $S$ is a metric space, $V \subseteq S$, $w(v_i) > 0$ for all $v_i \in V$, and $p$ is a positive integer.
Let $w'(v_i) = w(v_i)/\sum_{j=1}^{n} w(v_j)$ and define $F'(X_p) = \sum_{i=1}^{n} w'(v_i)d(v_i, X_p)$,
$Z(X_p) = \sum_{i=1}^{n} w'(v_i)\{d(v_i, X_p) - F'(X_p)\}^2$.

The following three equitable facility location problems are defined:

(a) given a constant $\beta$, find a set of $p$ points $X_p \subseteq S$ to minimize

$F(X_p) = \sum_{i=1}^{n} w(v_i)d(v_i, X_p)$

subject to

$d(v_i, X_p) \le \beta$, for all $v_i \in V$;

(b) given a constant $\alpha$, $0 \le \alpha \le 1$, find a set of $p$ points $X_p \subseteq S$ to minimize
$\alpha F(X_p) + (1 - \alpha)H(X_p)$;

(c) find $X_p$ to minimize $Z(X_p)$.


Facts: 

1. Since the objective function in (b) is a linear combination of the objective functions 
for the p-median and p-center problems, the solution X* is called a centdian. 

2. $F'(X_p) = F(X_p)/\sum_{j=1}^{n} w(v_j)$ is the mean distance to the consumers given that the
facilities are located at $X_p$.

3. Z(X p ) is the variance of the distance to the consumers given that the facilities are 
located at X p . 

4. Additional results are discussed in [HaMi79, Ma86]. 
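
The quantities in Facts 2 and 3 are a weighted mean and a weighted variance of the distances d(v_i, X_p). A minimal sketch, assuming the distances to the closest facility have already been computed; the function name and the sample data are illustrative only.

```python
from fractions import Fraction
from typing import Sequence, Tuple


def mean_and_variance(weights: Sequence[int],
                      distances: Sequence[int]) -> Tuple[Fraction, Fraction]:
    """Return (F'(X_p), Z(X_p)) from weights w(v_i) and distances d(v_i, X_p)."""
    total = sum(weights)
    shares = [Fraction(w, total) for w in weights]  # normalized weights w'(v_i)
    mean = sum(s * d for s, d in zip(shares, distances))
    variance = sum(s * (d - mean) ** 2 for s, d in zip(shares, distances))
    return mean, variance


# Hypothetical data: three demand points with weights 1, 1, 2 at distances 0, 4, 2.
demo_mean, demo_variance = mean_and_variance([1, 1, 2], [0, 4, 2])
```

For the illustrative data both the mean distance and the variance equal 2.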

Example: 

1. The following figure shows a tree network T on 7 vertices, with edge lengths dis- 
played. Suppose that all vertex weights are 1. Then the 1-median of T is located 
at vertex c, while the 1-center of T is located at the point x one unit from vertex d 
along (d,g). These locations can be calculated using Algorithms 1 and 2 from §15.2.3. 
It can be verified that the centdian of T is at point x for 0 < a < g, at vertex d for 
| < a < mrd at vertex c for | < a < 1. 



15.3 PACKING AND COVERING

Many practical problems can be formulated as either packing or covering problems. In
packing problems, a set of activities is given, each of which requires several resources for
its completion. The problem is to select a most valuable set of activities to undertake
such that no two selected activities use a common resource. Such a problem arises, for
example, in scheduling as many computational activities as possible on a set of machines
(resources) that cannot be used simultaneously for more than one activity.
In covering problems, a specified set of tasks must be performed, and the objective
is to minimize the resources required to perform the tasks. For example, a number of
delivery trucks (resources) operating on overlapping geographical routes need to be dispatched
to pick up items at customer locations (tasks). The fewest possible trucks are
to be sent so that each customer location is “covered” by at least one of the dispatched
trucks.

Both exact and heuristic solution algorithms for packing and covering problems are 
discussed in this section. 


15.3.1 KNAPSACKS 

The knapsack problem arises when a single critical resource is to be optimally allocated 
among a variety of options. Specifically, there are available a number of items, each of 
which consumes a known amount of the resource and contributes a known benefit. Items 
are to be selected to maximize the total benefit without exceeding the given amount of 
the resource. Knapsack problems arise in many practical situations involving cutting 
stock, cargo loading, and capital budgeting. 

Definitions: 

Let $N = \{1, 2, \ldots, n\}$ be a given set of $n$ items. Utilizing item $j$ consumes (requires)
$a_j > 0$ units of the given resource and confers the benefit $c_j > 0$.

The knapsack problem (KP) is the following 0-1 integer linear programming problem:

maximize:    $\sum_{j \in N} c_j x_j$

subject to:  $\sum_{j \in N} a_j x_j \le b$        (1)

             $x_j \in \{0,1\}$

It is assumed that $a_j \le b$ for all $j \in N$. Let $z(b)$ denote the optimal objective value
in (1) for a given integer $b$.

The LP relaxation (§15.1.8) of (1) is the linear programming problem:

maximize:    $\sum_{j \in N} c_j x_j$

subject to:  $\sum_{j \in N} a_j x_j \le b$        (2)

             $0 \le x_j \le 1$

Let $z_{LP}$ denote the optimal objective value to the LP relaxation (2).

Let $N^1$ and $N^0$ be the set of variables taking values 1 and 0, respectively, in the optimal
solution to (2). Let $\lambda^*$ be the dual variable (§15.1.5) associated with the knapsack
inequality in the optimal solution to (2).

A cover is a set $S \subseteq N$ such that $\sum_{j \in S} a_j > b$. The cover $S$ is minimal if no proper
subset of $S$ is a cover.

A branch and bound tree for KP is a tree T whose nodes correspond to subproblems 
obtained by fixing certain variables of (1) to either 0 or 1. The root of T corresponds 
to the original problem (1). 
Algorithm 1: Greedy heuristic for the KP.

input: KP with $c_1/a_1 \ge c_2/a_2 \ge \cdots \ge c_n/a_n$
output: feasible solution $x = (x_1, x_2, \ldots, x_n)$

for k := 1 to n

    if $\sum_{j=1}^{k-1} a_j x_j + a_k \le b$ then $x_k := 1$
    else $x_k := 0$


A node $t$ of $T$ is specified by its level $k = 0, 1, \ldots, n$ and the index set $N_t \subseteq \{1, \ldots, k\}$
of variables currently fixed to 1.

Associated with the set $N_t$ is the current benefit $z_t = \sum_{j \in N_t} c_j$ and the available amount
of resource $b_t = b - \sum_{j \in N_t} a_j$.


Facts: 

1. Formulation (1) is a 0-1 integer linear programming problem with a single (knapsack) 
constraint. It expresses the optimization problem in which a subset of the n items is 
to be selected to maximize the total benefit without exceeding the available amount of 
the given resource (the capacity of the knapsack). 

2. KP is an NP-hard optimization problem (§16.5.2).

3. KP can be solved in polynomial time for fixed b.

4. Given a rational $\epsilon > 0$, a {0,1}-vector $x^*$ can be found satisfying $\sum_{j \in N} a_j x_j^* \le b$
and $\sum_{j \in N} c_j x_j^* \ge (1 - \epsilon)z(b)$ in time polynomially bounded by $1/\epsilon$ and by the sizes of
$a = (a_1, \ldots, a_n)$, $c = (c_1, \ldots, c_n)$, and $b$.

5. If the coefficients $a_j$ can be ordered such that each coefficient is an integer multiple
of the previous one, then KP can be solved in polynomial time.

6. If $a_{j-1} > a_j + \cdots + a_n$ holds for $j = 2, \ldots, n$ then KP can be solved in polynomial
time.

7. Greedy heuristic: This heuristic (Algorithm 1) for the KP processes the variables $x_j$
in nonincreasing order of $c_j/a_j$, making each variable equal to 1 if possible.

8. Suppose Zh is the objective value for the solution x produced by Algorithm 1. Then 
z(b) > z H > \ z{b ). 

9. Algorithm 1 is most effective if the coefficients aj are small relative to b. 

10. The LP relaxation (2) can be solved explicitly by filling the knapsack in turn with
items $j$ in order of nonincreasing $c_j/a_j$, ignoring the integer restriction. The solution $x$
obtained has at most one fractional component.

11. Core heuristic: This heuristic (Algorithm 2) for the KP first solves the LP relaxation (2)
as in Fact 10, in which at most one variable $x_k$ can be fractional. A
smaller knapsack problem is then solved, by setting to 0 any variable $x_j$ with index $j$
sufficiently greater than $k$ and by setting to 1 any variable $x_j$ with index $j$ sufficiently
smaller than $k$.
Algorithm 2: Core heuristic for the KP.

input: KP with $c_1/a_1 \ge c_2/a_2 \ge \cdots \ge c_n/a_n$

output: feasible solution $x = (x_1, x_2, \ldots, x_n)$

{solve LP relaxation}

find the smallest value $k$ such that $\sum_{j=1}^{k} a_j > b$
{solve restricted KP}
select any $r > 0$
$x_j := 1$ for $j \le k - r$
$x_j := 0$ for $j \ge k + r$

solve to optimality the smaller knapsack problem:
    maximize $\sum_{j=k-r+1}^{k+r-1} c_j x_j$

    subject to $\sum_{j=k-r+1}^{k+r-1} a_j x_j \le b - \sum_{j=1}^{k-r} a_j$

    $x_j \in \{0, 1\}$


12. Algorithm 2 is effective if the number of variables n is large since values of r 
between 10 and 25 give very good approximations in most cases. For further details see 
[BaZe80] . 

13. $z(b) \le z_{LP}$.

14. If $z^l$ is the objective value of a feasible solution to (1), then $z^l \le z(b)$.

15. Node $t$ of the branch and bound tree $T$ corresponds to a subproblem having a
nonempty set of feasible solutions if and only if $b_t \ge 0$. When this holds, $z_t$ is a lower
bound for $z(b)$.

16. An upper bound on the objective value over the subproblem corresponding to node
$t$ is $z_t^u = \lfloor z_t^* \rfloor$, where $z_t^* = z_t + \max\{\sum_{j=k+1}^{n} c_j x_j \mid \sum_{j=k+1}^{n} a_j x_j \le b_t,\ 0 \le x_j \le 1\}$.

17. Implicit enumeration: This is an exact technique based on the branch and bound
method (§15.1.8). It is implemented using a branch and bound tree $T$, with the following
specifications:

• The initial tree $T$ consists of the root $t$, with lower bound $z^l$ on $z(b)$ obtained
using the greedy heuristic. An upper bound for node $t$ is $z_t^u = z_{LP}$.

• If $z_t^u \le z^l$, then node $t$ is discarded since it cannot provide a better solution.

• If $z_t^u > z^l$, there are three cases (where node $t$ is at level $k$ of $T$):

    o $a_{k+1} < b_t$: If $k + 1 < n$, create a new node with $x_{k+1} = 1$. If $k + 1 = n$, an
      optimal solution for node $t$ has $x_n = 1$. Since this solution is feasible for
      KP, set $z^l = z_t^u$ and discard node $t$.

    o $a_{k+1} = b_t$: An optimal solution for node $t$ (and a feasible solution for KP)
      is obtained by setting $x_{k+1} = 1$ and $x_j = 0$ for $j > k + 1$. Set $z^l = z_t^u$ and
      discard node $t$.

    o $a_{k+1} > b_t$: Discard the (infeasible) node with $x_{k+1} = 1$ and create a new
      node with $x_{k+1} = 0$.

• To backtrack from node $t$ let $N_t = \{j_1, \ldots, j_r\} \subseteq \{1, \ldots, k\}$ with $j_1 < \cdots < j_r$.
If $k \ne j_r$ retreat to level $j_r$ and set $x_{j_r} = 0$. If $k = j_r$ retreat to level $j_{r-1}$ and
set $x_{j_{r-1}} = 0$.
18. Variable fixing: Given $z_{LP}$, $z^l$, $N^1$, $N^0$, and $\lambda^*$, the following tests can be used
to fix variables and reduce the size of the knapsack problem:

• If $k \in N^1$ and $z_{LP} - (c_k - \lambda^* a_k) < z^l$, then fix $x_k = 1$.

• If $k \in N^0$ and $z_{LP} + (c_k - \lambda^* a_k) < z^l$, then fix $x_k = 0$.

• Given $k \in N^1$ define

$z_{LP}^k = \sum_{j \in N^1 - \{k\}} c_j + \max\{\sum_{j \in N - N^1} c_j x_j \mid \sum_{j \in N - N^1} a_j x_j \le b,\ 0 \le x_j \le 1\}$.

If $z_{LP}^k < z^l$ then $x_k$ can be fixed to 1.

• Given $k \in N^0$ define

$z_{LP}^k = c_k + \max\{\sum_{j \in N - N^0} c_j x_j \mid \sum_{j \in N - N^0} a_j x_j \le b - a_k,\ 0 \le x_j \le 1\}$.

If $z_{LP}^k < z^l$ then $x_k$ can be fixed to 0.


19. Minimal cover inequality: If $S$ is a minimal cover then each feasible solution $x$
to KP satisfies $\sum_{j \in S} x_j \le |S| - 1$.

20. Lifted minimal cover inequality: The minimal cover inequality can be further
strengthened. Without loss of generality, assume that $a_1 \ge a_2 \ge \cdots \ge a_n$ and
$S = \{j_1 < j_2 < \cdots < j_r\}$. Let $\mu_h = \sum_{k=1}^{h} a_{j_k}$ for $h = 1, \ldots, r$ and define $\lambda = \mu_r - b \ge 1$.
Then each feasible solution $x$ to KP satisfies $\sum_{j \in N - S} \alpha_j x_j + \sum_{j \in S} x_j \le |S| - 1$, where:

• if $\mu_h \le a_j \le \mu_{h+1} - \lambda$ then $\alpha_j = h$;

• if $\mu_{h+1} - \lambda + 1 \le a_j \le \mu_{h+1} - 1$ then: (a) $\alpha_j \in \{h, h + 1\}$, and (b) there is at
least one lifted minimal cover inequality with $\alpha_j = h + 1$.

21. Algorithms and computer codes to solve knapsack problems are given in [MaTo90]. 

22 . Fortran code for solving knapsack problems can be found at the site: 

• http://www.netlib.org/toms/632 

23 . Further details on the material in this section are available in [NeWo88], [Sc86]. 
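
One standard way to see Fact 3 is dynamic programming over the capacities 0, 1, ..., b, which takes on the order of nb steps for integer data. The following is a minimal sketch of that textbook recursion; it is not taken from the cited codes, and the function name is illustrative.

```python
from typing import List, Sequence, Tuple


def knapsack_dp(c: Sequence[int], a: Sequence[int], b: int) -> Tuple[int, List[int]]:
    """Return (z(b), chosen items as 1-based indices) for the 0-1 knapsack problem."""
    n = len(c)
    # best[j][cap] = optimal value using only items 1..j with capacity cap
    best = [[0] * (b + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        for cap in range(b + 1):
            best[j][cap] = best[j - 1][cap]
            if a[j - 1] <= cap:
                candidate = best[j - 1][cap - a[j - 1]] + c[j - 1]
                if candidate > best[j][cap]:
                    best[j][cap] = candidate
    # Trace back which items were taken.
    chosen, cap = [], b
    for j in range(n, 0, -1):
        if best[j][cap] != best[j - 1][cap]:
            chosen.append(j)
            cap -= a[j - 1]
    chosen.reverse()
    return best[n][b], chosen


# Investment problem of Example 1, with denominations in units of $10,000.
value, items = knapsack_dp([2000, 2800, 5400, 900, 2600], [1, 2, 3, 1, 2], 5)
```

Applied to the investment data of Example 1, the sketch selects investments 1, 3, and 4 with total interest 8,300.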


Examples: 

1. Investment problem: An investor has $50,000 to place in any combination of five 
available investments (1,2, 3, 4, 5). All investments have the same maturity but are 
issued in different denominations and have different (one-year) yields, as shown here: 


investment          1        2        3        4        5
denomination ($)    10,000   20,000   30,000   10,000   20,000
yield (%)           20       14       18       9        13

Let variable $x_j = 1$ if Investment $j$ is selected and $x_j = 0$ if it is not. The interest
earned for Investment 1 is (0.20)10,000 = 2,000; the values of the other investments are
found similarly. Then the investor’s problem is the knapsack problem

maximize: $2{,}000x_1 + 2{,}800x_2 + 5{,}400x_3 + 900x_4 + 2{,}600x_5$

subject to: $10{,}000x_1 + 20{,}000x_2 + 30{,}000x_3 + 10{,}000x_4 + 20{,}000x_5 \le 50{,}000$

which has the optimal solution $x_1 = x_3 = x_4 = 1$, $x_2 = x_5 = 0$ with maximum interest
of \$8,300.
2. Consider the knapsack problem in 0-1 variables $x$

maximize: $30x_1 + 8x_2 + 16x_3 + 20x_4 + 12x_5 + 9x_6 + 5x_7 + 3x_8$

subject to: $10x_1 + 3x_2 + 7x_3 + 9x_4 + 6x_5 + 5x_6 + 3x_7 + 2x_8 \le 27$.

Here the variables $x_j$ are indexed in nonincreasing order of $c_j/a_j$. The optimal solution to
the LP relaxation (2) is $x_1 = x_2 = x_3 = 1$, $x_4 = 7/9$, $x_j = 0$ otherwise, with $z_{LP} = 69\frac{5}{9}$.
The greedy heuristic (Algorithm 1) gives the feasible solution $x_1 = x_2 = x_3 = x_5 = 1$,
$x_j = 0$ otherwise, with $z_H = 66$. Using $r = 3$, the core heuristic gives $x_1 = x_2 = x_4 =
x_6 = 1$, $x_j = 0$ otherwise. This solution is optimal, with objective value 67.
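
The three computations of Example 2 can be reproduced with a short sketch of Algorithm 1, of the LP solution described in Fact 10, and of Algorithm 2. Items are assumed to be already sorted by nonincreasing c_j/a_j; solving the core problem by brute-force enumeration is an assumption made for brevity, and the function names are illustrative.

```python
from fractions import Fraction
from itertools import product
from typing import List, Sequence, Tuple


def greedy(c: Sequence[int], a: Sequence[int], b: int) -> List[int]:
    """Algorithm 1: take each item in turn if it still fits."""
    x, used = [], 0
    for a_j in a:
        if used + a_j <= b:
            x.append(1)
            used += a_j
        else:
            x.append(0)
    return x


def lp_relaxation(c: Sequence[int], a: Sequence[int], b: int) -> Tuple[List[Fraction], Fraction]:
    """Fill the knapsack in ratio order, allowing one fractional item (Fact 10)."""
    x, remaining = [], Fraction(b)
    for a_j in a:
        take = min(Fraction(1), remaining / a_j)
        x.append(take)
        remaining -= take * a_j
    return x, sum(c_j * x_j for c_j, x_j in zip(c, x))


def core_heuristic(c: Sequence[int], a: Sequence[int], b: int, r: int) -> List[int]:
    """Algorithm 2, with the core problem solved by enumeration (an assumption)."""
    n, prefix, k = len(a), 0, None
    for idx, a_j in enumerate(a, start=1):
        prefix += a_j
        if prefix > b:
            k = idx  # smallest k with a_1 + ... + a_k > b
            break
    if k is None:
        return [1] * n  # everything fits
    lo = max(k - r, 0)                    # items 1..lo are fixed to 1
    core = list(range(lo + 1, min(k + r, n + 1)))  # free items; the rest are fixed to 0
    cap = b - sum(a[:lo])
    best_value, best_bits = -1, ()
    for bits in product((0, 1), repeat=len(core)):
        if sum(a[j - 1] for j, bit in zip(core, bits) if bit) <= cap:
            v = sum(c[j - 1] for j, bit in zip(core, bits) if bit)
            if v > best_value:
                best_value, best_bits = v, bits
    x = [1] * lo + [0] * (n - lo)
    for j, bit in zip(core, best_bits):
        x[j - 1] = bit
    return x


c2 = [30, 8, 16, 20, 12, 9, 5, 3]
a2 = [10, 3, 7, 9, 6, 5, 3, 2]
x_greedy = greedy(c2, a2, 27)
x_lp, z_lp = lp_relaxation(c2, a2, 27)
x_core = core_heuristic(c2, a2, 27, 3)
```

On the data of Example 2 the sketch gives the greedy value 66, the LP value 626/9, and the core solution with value 67.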

3. Consider the knapsack problem in 0-1 variables $x$

maximize: $x_1 + x_2 + x_3 + x_4 + x_5 + x_6$

subject to: $10x_1 + 8x_2 + 4x_3 + 3x_4 + 3x_5 + 2x_6 \le 11$.

The set $S = \{3, 4, 5, 6\}$ is a minimal cover which gives the lifted minimal cover inequality
$3x_1 + 2x_2 + x_3 + x_4 + x_5 + x_6 \le 3$. Adding this inequality and solving the resulting linear
program gives $x_4 = x_5 = x_6 = 1$, $x_j = 0$ otherwise. This solution is optimal.
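
The cover and the lifted inequality of Example 3 can be checked by enumeration. In the sketch below, the lifting coefficients are computed as the largest h with $\mu_h \le a_j$, taking $\mu_0 = 0$; this is the conservative choice h in both cases of Fact 20 and is an assumption of the sketch, as are the function names. Validity is then confirmed by brute force over all 0-1 vectors rather than taken for granted.

```python
from itertools import product
from typing import Dict, Sequence, Set


def is_cover(a: Sequence[int], b: int, S: Set[int]) -> bool:
    """S (1-based indices) is a cover if its items together exceed b."""
    return sum(a[j - 1] for j in S) > b


def is_minimal_cover(a: Sequence[int], b: int, S: Set[int]) -> bool:
    """A cover is minimal if removing any single item destroys the cover property."""
    return is_cover(a, b, S) and all(not is_cover(a, b, S - {j}) for j in S)


def lifting_coefficients(a: Sequence[int], S: Set[int]) -> Dict[int, int]:
    """Conservative coefficients: alpha_j = max{h : mu_h <= a_j}, with mu_0 = 0."""
    mu = [0]
    for size in sorted((a[j - 1] for j in S), reverse=True):
        mu.append(mu[-1] + size)
    return {j: max(h for h in range(len(mu)) if mu[h] <= a[j - 1])
            for j in range(1, len(a) + 1) if j not in S}


def inequality_valid(a: Sequence[int], b: int, coeffs: Sequence[int], rhs: int) -> bool:
    """Check coeffs . x <= rhs for every feasible 0-1 vector x by enumeration."""
    for x in product((0, 1), repeat=len(a)):
        feasible = sum(a_j * x_j for a_j, x_j in zip(a, x)) <= b
        if feasible and sum(q * x_j for q, x_j in zip(coeffs, x)) > rhs:
            return False
    return True


a3, b3, S3 = [10, 8, 4, 3, 3, 2], 11, {3, 4, 5, 6}
alpha = lifting_coefficients(a3, S3)
lifted = [alpha.get(j, 1) for j in range(1, len(a3) + 1)]
```

For Example 3 the sketch returns the coefficients 3 and 2 for $x_1$ and $x_2$, matching the lifted inequality stated there.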

4. General knapsack problem: The general (or unbounded) knapsack problem allows 
the decision variables Xj to be any nonnegative integers, not just 0 and 1. The following 
site provides an interactive algorithm for solving such knapsack problems (having up 
to 10 integer variables): 

• http://www.maths.mu.oz.au/~moshe/recor/knapsack/knapsack.html


15.3.2 BIN PACKING 

Minimizing the number of copies of a resource required to perform a specified set of 
tasks can be formulated as a bin packing problem. It is assumed that no such task can 
be split between two different units of the resource. 

For example, this type of problem arises in allocating a set of customer loads to 
(identical) trucks, with no load being split between two trucks. Also, the scheduling 
of heterogeneous tasks on identical machines can be viewed as a bin packing problem. 
Namely, find the fewest number of machines of capacity C such that each task is executed 
on one of the machines and the total capacity of jobs assigned to any machine does not 
exceed C. 

Definitions: 

The positive integer C denotes the bin capacity. 

Let $L = (p_1, p_2, \ldots, p_n)$ be a list of $n$ items, where item $p_i$ has an integer size $s(p_i) \le C$.

A subset $P \subseteq L$ is a packing if $\sum_{p_i \in P} s(p_i) \le C$.

The gap of a packing $P$ is given by the quantity $C - \sum_{p_i \in P} s(p_i)$.

The bin packing problem is the problem of finding the minimum number of bins
(each of capacity $C$) needed to pack all items so that the gap in each bin is nonnegative.
The minimum number of bins needed for the list $L$ is denoted $b^*(L)$.
Algorithm 3: MFFD algorithm for bin packing.

input: list L, bin capacity C 
output: a packing of L 

partition L into the three sublists La = {Pi \ s(pi) G (\C,C]}, 

L d = {Pi | s(pi) G (^C, \C] }, L x = {pi | s{pi) G (0, ijC'] } 
pack the sublist $L_A$ using the FFD algorithm

{pack as much of $L_D$ into A-bins as possible}

1. let bin $B_j$ be the A-bin with the currently largest gap; if the two smallest
   unpacked items in $L_D$ will not fit together in $B_j$, go to 4

2. place the smallest unpacked item $p_i$ from $L_D$ in $B_j$

3. let $p_k$ be the largest unpacked item in $L_D$ that will now fit in $B_j$; place $p_k$
   in $B_j$ and go to 1

4. combine the unpacked portion of $L_D$ with $L_X$ and add these items to the
   packing using FFD


Facts: 


1. The bin packing problem is an NP-hard optimization problem (§16.5.2). 


2. First fit (FF) method: In this heuristic algorithm, item $p_i$ ($i = 1, 2, \ldots, n$) is placed
in the first bin into which it fits. A new bin is started only when $p_i$ will not fit into any
nonempty bin.

3. Let $b^{FF}(L)$ denote the number of bins produced by the FF algorithm for a list $L$.
Then $b^{FF}(L) \le \min\{\lceil \frac{17}{10} b^*(L) \rceil,\ 1.75\,b^*(L)\}$.

4. First fit decreasing (FFD) method: In this heuristic algorithm, the items are first
ordered by decreasing size so that $s(p_1) \ge s(p_2) \ge \cdots \ge s(p_n)$. Then the FF algorithm
is applied to the reordered list.

5. Let $b^{FFD}(L)$ denote the number of bins produced by the FFD algorithm for a list $L$.
Then $b^{FFD}(L) \le \min\{\frac{11}{9} b^*(L) + 3,\ 1.5\,b^*(L)\}$.

6. If all item sizes are of the form $C(1/k)^j$, $j \ge 0$, for some fixed positive integer $k$, then
$b^{FFD}(L) = b^*(L)$.


7. If the item sizes are uniformly distributed on [0, a] with 0 < a < 


ically 


b FFD (L) i 
b* (L) 


f, then asymptot- 


8. Modified first fit decreasing (MFFD) method: This heuristic method (Algorithm 3)
produces a packing using relatively few bins. After the initial phase of packing the
largest size items $L_A$, let an “A-bin” denote one containing only a single item from $L_A$.

9. Let $b^{MFFD}(L)$ denote the number of bins produced by the MFFD algorithm for a
list $L$. Then asymptotically, as $b^*(L)$ gets large, $b^{MFFD}(L) \le 1.183\,b^*(L)$.

10. Best fit (BF) method: In this heuristic algorithm, item $p_i$ is placed in the bin into
which it will fit with the smallest gap left over. Ties are broken in favor of the lowest
indexed bin.

11. Best fit decreasing (BFD) method: In this heuristic algorithm, the items are first
ordered so that $s(p_1) \ge s(p_2) \ge \cdots \ge s(p_n)$. Then the BF algorithm is applied to the
reordered list.

12. Asymptotic worst-case bounds for BF [BFD] are the same as those for FF [FFD].
In practice the BF version performs somewhat better.

13 . Further details on the material in this section are provided in [CoGaJo84]. 

Examples: 

1. Television commercials are to be assigned to station breaks. This is a bin pack- 
ing problem where the duration of each station break is C and the duration of each 
commercial is s(pi). 

2. Material such as cable, lumber, or pipe is supplied in a standard length C. Demands 
for pieces of the material are for arbitrary lengths s(pi) not exceeding C . The objective 
is to use the minimum number of standard lengths to supply a given list of required 
pieces. This is also a bin packing problem. 

3 . A set of independent tasks with known execution times s(pi) are to be executed on 
a collection of identical processors. Determining the minimum number of processors 
needed to complete all tasks by the deadline C is a bin packing problem. 

4. Consider the list L = (4, . . . , 4, 7, . . . , 7, 8, . . . , 8, 13, . . . , 13), in which there are
twelve 4s and six each of 7s, 8s, and 13s. Each bin has capacity C = 24.
Applying either FF or BF to L results in a packing with twelve bins: two bins are
packed as (4, 4, 4, 4, 4, 4), two as (7, 7, 7), two as (8, 8, 8), and six as (13).

5. If FFD (or BFD) is applied to the list in Example 4, a packing with ten bins results: 
six bins are packed as (13, 8), two as (7, 7, 7), and two as (4, 4, 4, 4, 4, 4). 

6. If MFFD is applied to the list in Example 4, then $L_A$ contains the six 13s and $L_D$
contains the remaining items. Packing $L_A$ using FFD results in six A-bins, each containing
a single 13 and having gap 11. Steps 1-3 of Algorithm 3 result in six bins
packed as (13, 7, 4), and Step 4 yields two bins packed as (8, 8, 8) and one bin packed as
(4, 4, 4, 4, 4, 4). This is an optimal solution since all nine bins are completely packed.
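
The FF, BF, FFD, and BFD packings of Examples 4 and 5 can be reproduced with a minimal sketch of the four heuristics as described in the Facts. The function names are illustrative, and MFFD is not included.

```python
from typing import Callable, List, Sequence

Packing = List[List[int]]


def first_fit(sizes: Sequence[int], capacity: int) -> Packing:
    """Place each item in the first (lowest indexed) bin into which it fits."""
    bins: Packing = []
    for s in sizes:
        for contents in bins:
            if sum(contents) + s <= capacity:
                contents.append(s)
                break
        else:
            bins.append([s])  # no existing bin can take the item
    return bins


def best_fit(sizes: Sequence[int], capacity: int) -> Packing:
    """Place each item where the remaining gap is smallest; ties go to the lowest index."""
    bins: Packing = []
    for s in sizes:
        best_gap, best_idx = None, None
        for idx, contents in enumerate(bins):
            gap = capacity - sum(contents) - s
            if gap >= 0 and (best_gap is None or gap < best_gap):
                best_gap, best_idx = gap, idx
        if best_idx is None:
            bins.append([s])
        else:
            bins[best_idx].append(s)
    return bins


def decreasing(heuristic: Callable[[Sequence[int], int], Packing],
               sizes: Sequence[int], capacity: int) -> Packing:
    """Apply a heuristic to the list sorted by nonincreasing size (FFD or BFD)."""
    return heuristic(sorted(sizes, reverse=True), capacity)


example_list = [4] * 12 + [7] * 6 + [8] * 6 + [13] * 6
ff_bins = first_fit(example_list, 24)
ffd_bins = decreasing(first_fit, example_list, 24)
```

On the list of Example 4 the sketch uses twelve bins for FF and BF and ten bins for FFD and BFD. Since the item sizes total 216 = 9 · 24, no packing can use fewer than nine bins.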


15.3.3 SET COVERING AND PARTITIONING 

Set covering or set partitioning problems arise when a specified set of tasks must be 
performed while minimizing the cost of resources used. Such problems arise in scheduling 
fleets of vehicles or aircraft, locating fire stations in an urban area, political redistricting, 
and fault testing of electronic circuits. 

Definitions: 

Let $e$ denote the column vector of all 1s.

Let $A = (a_{ij})$ be a 0-1 incidence matrix and let $c = (c_j)$ be a row vector of costs.
The set $A_j = \{i \mid a_{ij} = 1\}$ contains all rows covered by column $j$.

The set covering (SC) problem is the 0-1 integer linear programming problem:

minimize:    $cx$

subject to:  $Ax \ge e$

             $x_j \in \{0,1\}$.

Let $v^*$ be the optimal objective value to this problem.

The set partitioning (SP) problem has the same form as the set covering problem
except the constraints are $Ax = e$.

The LP relaxation of SC or SP is obtained by replacing the constraints $x_j \in \{0, 1\}$
by $0 \le x_j \le 1$. Let $v_{LP}$ be the optimal objective value to the LP relaxation.

The matrix $A$ is totally unimodular if the determinant of every square submatrix
of $A$ is 0, +1, or -1.

The matrix $A$ is balanced if $A$ has no square submatrix of odd order containing exactly
two 1s in each row and column.

The matrix $A$ is in canonical block form if, by reordering, its columns can be partitioned
into $t$ nonempty subsets $B_1, \ldots, B_t$ such that for each block $B_j$ there is some
row $i$ of $A$ with $a_{ik} = 1$ for all $k \in B_j$ and $a_{ik} = 0$ for $k \in \bigcup_{l=j+1}^{t} B_l$. The rows of $A$ are
then ordered so that the row defining $B_j$ becomes the $j$th row for $j = 1, \ldots, t$.

Facts: 

1. Formulation SC expresses the problem of selecting a set of columns (sets) that 
together cover all rows (elements) at minimum cost. In Formulation SP, the covering 
sets are required to be disjoint. 

2. Both SC and SP are NP-hard optimization problems (§16.5.2). 

3 . Checking whether a set partitioning problem is feasible is NP-hard. 

4 . In many instances (including bin packing, graph partitioning, and vehicle routing) 
the LP relaxation of the set covering (partitioning) formulation of the problem is known 
to give solutions very close to optimality. 

5. For the bin packing and vehicle routing problems (see Examples 2, 3) v* < | \vlp\- 

6. If $A$ is totally unimodular or balanced, then the polyhedra $\{x \mid Ax \ge e,\ 0 \le x_j \le 1\}$
and $\{x \mid Ax = e,\ 0 \le x_j \le 1\}$ have only integer extreme points (vertices). In this
case, SC and SP can be solved in polynomial time using linear programming.

7. Checking whether a given matrix A is totally unimodular or balanced can be done 
in polynomial time. 

8. Every 0-1 matrix that is totally unimodular is also balanced. The converse however 
is not true (see Example 4). 

9. The matrix A is totally unimodular if and only if each collection of columns of A 
can be split into two parts so that the sum of the columns in one part minus the sum 
of the columns in the other part is a vector with entries 0, +1, —1. 

10. Greedy heuristic: This heuristic (Algorithm 4) for the set covering problem suc- 
cessively chooses columns that have smallest cost per covered row. 

11. Randomized greedy heuristic: This heuristic for the set covering problem is similar
to Algorithm 4 except that at iteration $k$ the column $j_k \in N^k$ is selected at random from
among those columns $j$ satisfying $c_j/|A_j \cap M^k| \le (1 + \alpha) \min\{c_r/|A_r \cap M^k| \mid r \in N^k\}$, where
$\alpha > 0$.

12. Whereas Algorithm 4 is run only once, the randomized greedy heuristic is repeated
several times and the best solution is selected.

13. Implicit enumeration: This exact approach (Algorithm 5) for SP works well for
dense matrices. In this algorithm, $S$ is the index set of the variables fixed at 1, $z$ is the
associated objective value, and $R$ is the set of rows satisfied by $S$. Also $z^*$ denotes the
objective value of the best feasible solution found so far.

14 . Other implicit enumeration approaches to set partitioning and set covering are 
discussed in [BaPa76]. 
Algorithm 4: Greedy heuristic for the set covering problem.

input: 0-1 $m \times n$ matrix $A$, costs $c$
output: feasible set cover $x$

$M^1 := \{1, 2, \ldots, m\}$; $N^1 := \{1, 2, \ldots, n\}$; $k := 1$
{Main loop}

select $j_k \in N^k$ to minimize $c_j/|A_j \cap M^k|$
$N^{k+1} := N^k - \{j_k\}$

obtain $M^{k+1}$ from $M^k$ by deleting all rows containing a 1 in column $j_k$
if $M^{k+1} = \emptyset$ then $x_j := 1$ for $j \notin N^{k+1}$ and $x_j := 0$ otherwise
else $k := k + 1$

{Continue with next iteration of main loop}


Algorithm 5: Implicit enumeration method for SP. 

input: 0-1 matrix A, costs c 

output: optimal set of columns S (if any) 

place $A$ in canonical block form with blocks $B_j$

order the columns within $B_j$ by nondecreasing $c_t/\sum_i a_{it}$

$S := \emptyset$; $R := \emptyset$; $z := 0$; $z^* := \infty$

1. $r := \min\{i \mid i \notin R\}$; set a marker in the first column of $B_r$

2. examine all columns of $B_r$ in order starting from the marked column

   if column $j$ is found with $a_{ij} = 0$ for all $i \in R$ and $z + c_j < z^*$ then go to 3
   if $B_r$ is exhausted then go to 4

3. $S := S \cup \{j\}$; $R := R \cup \{i \mid a_{ij} = 1\}$; $z := z + c_j$

   if all rows are included in $R$ then $z^* := z$ and go to 4 else go to 1

4. if $S = \emptyset$ then terminate with the best solution found (if any)
   else let $k$ := the last index included in $S$

   $S := S - \{k\}$; update $z$ and $R$

   $B_r$ := the block to which column $k$ belongs

   move the marker in $B_r$ forward by one column and go to 2


15. Cutting plane methods: Cutting plane methods (§15.1.8) have been used success- 
fully to solve large set partitioning and set covering problems. For details regarding an 
implementation used to solve crew scheduling problems see [HoPa93]. 

16. Further details on the material in this section are in [GaNe72], [NeWo88], [Sc86]. 

Examples: 

1. Crew scheduling problem: An airline must cover a given set of flight segments with 
crews. There are specified work rules that restrict the assignment of crews to flights. 
The objective is to cover all flights at minimum total cost. The rows of the matrix A 
correspond to the flights that an airline has to cover. The columns of A are the incidence 
vectors of flight “rotations” : sequences of flight segments for each flight that begin and 
end at individual base locations and that conform to all applicable work rules. The 
objective is to minimize crew costs. This problem can be formulated as either a set 
covering or set partitioning problem. 
2. Bin packing: The bin packing problem (§15.3.2) can be formulated as a set partitioning
problem. The rows of the matrix A correspond to the items and the columns
are incidence vectors of any feasible packing of items to a bin. The cost of each variable 
is 1 if the number of bins is to be minimized. In general, a weighted version can also be 
formulated where different bins have different costs. 

3. Vehicle routing: Given are a set of customers and the quantity that is to be supplied 
to each from a warehouse. A fleet of trucks of a specified capacity are available. The 
objective is to service all the customers at minimum cost. The rows of the matrix A 
correspond to the customers and the columns are incidence vectors of feasible assign- 
ments of customers to trucks (a bin packing problem). The cost of each variable is the 
cost of the corresponding assignment of customers to the truck. This problem can be 
formulated as either a set covering or set partitioning problem. 

4. The following matrix $A$ is not totally unimodular, since $\det(A) = -2$. This can also
be seen using Fact 9. If $A$ has columns $C_j$ then $(C_1 + C_2) - (C_3 + C_4) = (0, 2, 0, 0)^T$
has an entry greater than one in absolute value. However, $A$ is a balanced matrix.

$$A = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix}$$
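
For a matrix as small as the one in Example 4, total unimodularity can be tested directly from the definition by enumerating every square submatrix. The sketch below does this with an exact integer determinant; it is exponential in the matrix size and is intended only as an illustration, not as the polynomial-time test mentioned in Fact 7. The function names are illustrative.

```python
from itertools import combinations
from typing import List

Matrix = List[List[int]]


def det(M: Matrix) -> int:
    """Exact integer determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for col, entry in enumerate(M[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in M[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def is_totally_unimodular(A: Matrix) -> bool:
    """Check that every square submatrix has determinant 0, +1, or -1."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True


A4 = [[1, 1, 1, 1],
      [1, 1, 0, 0],
      [1, 0, 1, 0],
      [1, 0, 0, 1]]
# Column combination used in Example 4: (C1 + C2) - (C3 + C4).
combo = [row[0] + row[1] - row[2] - row[3] for row in A4]
```

For the matrix of Example 4 the determinant is -2, so the test fails, and the column combination evaluates to (0, 2, 0, 0).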

5. There are four requests $R_1, R_2, R_3, R_4$ for information stored in a database, which
comprises five large files {1, 2, 3, 4, 5}. Request $R_1$ can be fulfilled by retrieving
files 1, 3, or 4; request $R_2$ by retrieving files 2 or 3; request $R_3$ by retrieving files 1
or 5; and request $R_4$ by retrieving files 4 or 5. The lengths of the files are 7, 3, 12, 7, 6
(gigabytes) respectively, and the time to retrieve each file is proportional to its length.
Filling all requests in the minimum amount of time is then a set covering problem, with
costs c = (7, 3, 12, 7, 6) and incidence matrix

\[
A = \begin{pmatrix}
1 & 0 & 1 & 1 & 0 \\
0 & 1 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1
\end{pmatrix}
\]

Applying the greedy heuristic (Algorithm 4) produces $j_1 = 2$, $j_2 = 5$, and $j_3 = 1$, giving
x = (1, 1, 0, 0, 1) with total cost 16. This is an optimal solution to the SC problem.
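
The greedy computation in Example 5 can be traced with a short Python sketch. It assumes that the greedy rule repeatedly picks the column with the smallest ratio of cost to number of newly covered rows, breaking ties in favor of the lowest column index; Algorithm 4 itself is not reproduced in this passage, so this rule and the names `greedy_set_cover`, `costs`, and `rows` are illustrative assumptions.

```python
def greedy_set_cover(costs, rows):
    """Greedy heuristic sketch for set covering.

    costs[j] is the cost of column j (0-based);
    rows[i] is the set of columns that cover row i.
    Returns the chosen columns in the order they were picked.
    """
    uncovered = set(range(len(rows)))
    chosen = []
    while uncovered:
        best_j, best_ratio = None, None
        for j in range(len(costs)):
            newly_covered = sum(1 for i in uncovered if j in rows[i])
            if newly_covered == 0:
                continue
            ratio = costs[j] / newly_covered
            # Strict comparison keeps the lowest index on ties (an assumption).
            if best_ratio is None or ratio < best_ratio:
                best_j, best_ratio = j, ratio
        if best_j is None:
            raise ValueError("some row cannot be covered by any column")
        chosen.append(best_j)
        uncovered = {i for i in uncovered if best_j not in rows[i]}
    return chosen


# Data of Example 5: files are columns, requests are rows (0-based indices).
costs = [7, 3, 12, 7, 6]
rows = [{0, 2, 3}, {1, 2}, {0, 4}, {3, 4}]

chosen = greedy_set_cover(costs, rows)
print([j + 1 for j in chosen])           # files in the order selected
print(sum(costs[j] for j in chosen))     # total retrieval cost
```

With this tie-breaking rule the sketch selects files 2, 5, and 1, in that order; a different tie-breaking rule could select a different cover.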


15.4 ACTIVITY NETS 

Activity nets are important tools in the planning, scheduling, and control of projects. 
In particular, the CPM (Critical Path Method) and PERT (Program Evaluation and 
Review Technique) models are widely used in the management of large projects, such 
as those occurring in construction, shipbuilding, aerospace, computer system design, 
urban planning, marketing, and accounting. 


15.4.1 DETERMINISTIC ACTIVITY NETS 

The scheduling of large complex projects can be aided by modeling as a directed net- 
work of activities having known durations and resource requirements, with the network 
structure defining the activity precedences. The commonly used critical path method 
is described as well as extensions that address constrained resources, financial consider- 
ations, and project compression. 
Definitions:


A project is defined by a set of activities that are related by precedence relations. 
An activity consumes time and resources to accomplish, whereas a dummy activity 
consumes neither. 

Activity u (strictly) precedes activity v, written $u \prec v$, if activity u must be completed
before activity v can be initiated. 

A project can be represented using a directed acyclic network G (§8.3.4). 

In the activity-on-node (AoN) representation of a project, the network G contains
a node for each activity and the arcs of G represent the precedence relations between 
nodes (activities). 

In the activity-on-arc (AoA) representation of a project, the network G contains an
arc for each activity and the nodes of G represent certain events. Precedence relations
are described by the network arcs, possibly requiring the use of dummy arcs (dummy
activities). In the AoA representation, the network is assumed to have no multiple arcs
joining the same pair of nodes, so an activity can be unambiguously referred to by (i, j)
for some nodes i and j, with the corresponding activity duration being $a_{ij}$.

Network G is a deterministic activity net if the precedence relations and the param- 
eters associated with the activities are known deterministically. Such a network is also 
referred to as a Critical Path Method (CPM) model.

An initial node of G has no entering directed arcs; a terminal node has no exiting 
directed arcs. 

Generalized precedence relations (GPRs) relax the necessity of a strict precedence
between activities. They can be specified in the form of certain leads or lags between a
pair of activities, commonly by start-to-start, finish-to-finish, start-to-finish, and
finish-to-start relations.

The optimal project compression problem is that of achieving a target project 
completion time with least cost, or alternatively minimizing the duration of the project 
subject to a specified budget constraint. 

The complex interaction between the required resources and the duration of an activity 
is assumed to be given by the functional relationship $c_a = \phi(y_a)$, where $y_a$ is the duration
of activity a, $\ell_a \le y_a \le u_a$, and $c_a$ is its cost. The upper limit $u_a$ is the normal duration
and the lower limit $\ell_a$ is the crash duration of activity a.
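
As an illustration of the AoN representation, the following Python sketch stores a project as a map from each activity to its immediate predecessors and computes the project completion time as the length of the longest path. The data are those of the house construction project in Example 1 of this section; the names `durations`, `predecessors`, and `earliest_finish_times` are illustrative assumptions, not notation from the text.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Activity durations (days) and immediate predecessors, from Example 1.
durations = {
    "foundation/frame": 12, "wiring/plumbing": 4, "sheetrock": 7,
    "interior paint": 2, "carpet": 3, "roof": 3, "siding": 7,
    "windows": 2, "exterior paint": 2,
}
predecessors = {
    "foundation/frame": [],
    "wiring/plumbing": ["foundation/frame"],
    "sheetrock": ["wiring/plumbing"],
    "interior paint": ["sheetrock", "windows"],
    "carpet": ["interior paint"],
    "roof": ["foundation/frame"],
    "siding": ["roof"],
    "windows": ["siding"],
    "exterior paint": ["windows"],
}


def earliest_finish_times(durations, predecessors):
    """Earliest finish time of every activity in an AoN network."""
    finish = {}
    # static_order() yields each activity after all of its predecessors.
    for activity in TopologicalSorter(predecessors).static_order():
        start = max((finish[p] for p in predecessors[activity]), default=0)
        finish[activity] = start + durations[activity]
    return finish


finish = earliest_finish_times(durations, predecessors)
project_duration = max(finish.values())
print(project_duration)  # length of the longest path, in days
```

No dummy activities are needed in this form, because precedence is carried by the arcs between activity nodes.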


Facts: 

1. The CPM model arose out of the need to solve industrial scheduling problems; the 
original work was jointly sponsored by Dupont and Sperry-Rand in the late 1950s. 

2. In the AoA representation, the network can be assumed to have a single initial node
and a single terminal node. These conditions can in general be guaranteed, possibly 
through the introduction of dummy arcs. 

3. Suppose that the AoA representation of a network has n nodes, with initial node 1
and terminal node n. Then the nodes can always be renumbered (topologically sorted) 
such that each arc leads from a smaller numbered node to a larger numbered one. (See 
§8.3.4.) 

4. In the AoA representation, the earliest time of realization of node j, written $t_j(E)$, is
determined recursively from $t_j(E) = \max_{i \in B(j)} \{ t_i(E) + a_{ij} \}$ and $t_1(E) = 0$, where B(j)
is the set of nodes immediately preceding node j.

5. Suppose the time of realization of node n is specified as $t_n(L) \ge t_n(E)$. The latest
time of realization of node i, written $t_i(L)$, is determined recursively from
$t_i(L) = \min_{j \in A(i)} \{ t_j(L) - a_{ij} \}$, where A(i) is the set of nodes immediately succeeding node i.

6. $t_j(L) \ge t_j(E)$ holds for any node j. The difference $t_j(L) - t_j(E) \ge 0$ is called the
node slack for j.

7. For each activity (i,j) there are four activity floats corresponding to the differences
$t_j(X) - t_i(Y) - a_{ij}$, where $X, Y \in \{E, L\}$:

• total float: $TF(i,j) = t_j(L) - t_i(E) - a_{ij}$

• safety float: $SF(i,j) = t_j(L) - t_i(L) - a_{ij}$

• free float: $FF(i,j) = t_j(E) - t_i(E) - a_{ij}$

• interference float: $IF(i,j) = t_j(E) - t_i(L) - a_{ij}$.

8. TF(i,j), SF(i,j), and FF(i,j) are always nonnegative, whereas IF(i,j) can be 
negative, indicating infeasibility of realization under the specified conditions (all activi- 
ties succeeding node j are accomplished as early as possible and all activities preceding 
node i are accomplished as late as possible). 

9. A critical activity (i,j) has total float TF(i,j) = 0. If $t_n(L) = t_n(E)$, then the set
of critical activities contains at least one path from node 1 to node n, which represents 
a longest path in the network from node 1 to node n. Such a path is called a critical 
path (CP). 

10. Floats play an important role in both resource allocation and activity schedul- 
ing, since floats give a measure of the flexibility in scheduling activities during project 
execution without delaying the project completion time. 

11. The problems of optimal resource allocation and activity scheduling subject to the 
known precedence constraints are NP-hard optimization problems (§16.5.2). 

12. Practical solutions to optimal resource allocation and activity scheduling problems 
are based on heuristics. Virtually all heuristics used in practice rely on ranking the 
activities according to their float (TF, SF, FF, IF). 

13. The measure TF is the only float in the AoA mode that is representation-invariant; 
this measure is the same in both modes of representation and in all AoA models of the 
same project. 

14. The SF, FF, IF measures are representation-dependent: they do indeed depend
on the structure of the AoA and they may also vary from their AoN values. 

15. A simple redefinition of $t_i(E)$ for nodes i with all outgoing arcs dummy and $t_j(L)$
for nodes j with all incoming arcs dummy reestablishes the invariance of the activity
floats to the mode of representation [ElKa90].

16. A plethora of “off-the-shelf” project planning and control software packages for 
PCs are currently available [Ho85], [DeHe90]. The review [DeHe90] also outlines criteria
against which a software package should be judged. To a varying degree of sophistica- 
tion, all these software packages satisfy the basic requirements of analysis and reporting. 
However, these packages are typically incapable of correctly carrying out optimization 
procedures. 

17. A listing of commercial and noncommercial software for project planning can be 
found at the site: 

• http://www.wior.uni-karlsruhe.de/Bibliothek/Title_Page1.html
18. GPRs afford the flexibility of modeling relations that are present among activities
in many practical situations, gained at a price in computational effort and interpretation
of results. The concepts of criticality and float of an activity take on a new meaning,
since activities may be “compressed” (speeded up) or “expanded” (slowed down) from
their “normal” durations [ElKa92].

19. Considerations of resource availabilities are important in project planning and its
dynamic control. The most common planning criteria are:

• minimization of the project duration

• smoothing of resource usage

• minimization of the maximum resource utilization

• minimization of the cost of resource usage

• maximization of the present value of the project.

20. In the presence of limited resources, the “critical path” may no longer be a “path”,
in the sense of a connected chain of activities. What emerges is the concept of a critical
sequence of activities [El77], which need not form a connected chain in the network.
(See Example 4.)

21. The scheduling of activities related by arbitrary precedence relations subject to
resource availabilities is an NP-hard problem. Consequently, such problems are typically
approached by integer programming techniques (§15.1.8) or heuristics (e.g., simulated
annealing, tabu search, genetic algorithms, neural nets).

22. The book [SlWe89] discusses project scheduling under constrained resources; in
particular, Chapter 5 of Part I evaluates various heuristics that have been proposed.
Also [HeDeDe98] gives a review of recent contributions to this area.

23. Typically, resources are available in one or several units or may be acquired at a
cost. Mathematical models (large-scale integer linear programs) abound for the
minimization of the project duration [El77]. Various branch and bound approaches have
been proposed for these models [DeHe92, Sp94].

24. Heuristic procedures, let alone optimization algorithms, for activity scheduling
under the other criteria mentioned in Fact 19 are not generally available.

25. In the CPM model of activity networks there is little problem in defining the cost
of an activity, and subsequently the cost of the project.

26. Generally, there are two streams of cash flow (from the contractor’s point of view):
an in-stream representing payments to the contractor by the owner, and an out-stream
representing payments by the contractor in the execution of the activities.

27. From the owner’s point of view there is only one stream of cash flows: namely,
payments to the contractor for work accomplished. Given a particular schedule for the
activities, the two streams of cash flow can be easily obtained. The problem is then
scheduling activities to maximize the net present value (NPV) of the project.

28. Issues concerning the NPV of a project are equally important to those interested
in bidding on a proposed project and those who are committed to carry out an already
agreed-upon project. Succinctly stated, the problem is to determine the dates of the
deliverables in order to maximize the NPV.

29. Suppose that the function $\phi(y_a)$ is nonincreasing over the interval $[\ell_a, u_a]$.
Reference [El77] gives a treatment of linear, convex, concave, and discrete functions, while
[ElKa92] discusses the case in which $\phi(y_a)$ is piecewise linear and convex over the interval
$[\ell_a, u_a]$.
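
The event-time recursions of Facts 4 and 5 and the four floats of Fact 7 can be sketched in Python as follows. The four-node network is meant to mirror Examples 2 and 3 of this section, but the arc durations are an assumption: the figure is not reproduced here, so the values below were chosen to agree with the float table of Example 3. The function names `event_times` and `activity_floats` are illustrative.

```python
def event_times(n, arcs, t_n_latest=None):
    """Earliest and latest event times for an AoA network.

    arcs maps (i, j) to the duration a_ij; nodes are numbered 1..n in
    topological order, with a single initial node 1 and terminal node n.
    """
    earliest = {1: 0}
    for j in range(2, n + 1):  # forward pass (Fact 4)
        earliest[j] = max(earliest[i] + d for (i, k), d in arcs.items() if k == j)
    latest = {n: earliest[n] if t_n_latest is None else t_n_latest}
    for i in range(n - 1, 0, -1):  # backward pass (Fact 5)
        latest[i] = min(latest[j] - d for (k, j), d in arcs.items() if k == i)
    return earliest, latest


def activity_floats(arcs, earliest, latest):
    """The four activity floats of Fact 7 for every arc."""
    return {
        (i, j): {
            "TF": latest[j] - earliest[i] - d,
            "SF": latest[j] - latest[i] - d,
            "FF": earliest[j] - earliest[i] - d,
            "IF": earliest[j] - latest[i] - d,
        }
        for (i, j), d in arcs.items()
    }


# Assumed durations; arc (2, 3) is the dummy activity with duration 0.
arcs = {(1, 2): 8, (1, 3): 5, (2, 3): 0, (2, 4): 3, (3, 4): 4}
earliest, latest = event_times(4, arcs, t_n_latest=14)
floats = activity_floats(arcs, earliest, latest)
for arc in [(1, 2), (1, 3), (2, 4), (3, 4)]:
    print(arc, floats[arc])
```

Calling `event_times(4, arcs)` without `t_n_latest` sets $t_n(L) = t_n(E)$, in which case the arcs with TF = 0 trace a critical path.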
Examples:

1. Construction planning: The construction of a house involves carrying out the nine 
activities listed in the following table. Their durations and (immediate) predecessor 
activities are also indicated. 


activity            duration (days)   predecessors
foundation/frame          12          none
wiring/plumbing            4          foundation/frame
sheetrock                  7          wiring/plumbing
interior paint             2          sheetrock, windows
carpet                     3          interior paint
roof                       3          foundation/frame
siding                     7          roof
windows                    2          siding
exterior paint             2          windows


An AoA representation of this project is shown in the following figure. It is necessary 
to use a dummy activity to ensure that the given precedences are faithfully depicted. 
The nodes have been numbered in topological order, with node 1 the initial node and 
node 9 the terminal node. The longest path from node 1 to node 9 is [1, 2, 4, 5, 6, 7, 8, 9] 
with length 29, corresponding to a project completion time of 29 days. 


[Figure: AoA network for the house construction project.]



2. A project is composed of the four activities a, b, c, d with precedence relations $a \prec c$,
$a \prec d$, and $b \prec d$. The AoN representation of this project is shown in part (a) of the
following figure. The AoA representation is shown in part (b) of the figure, where the 
nodes have been numbered in topological order (Fact 3). The dummy activity joining 
nodes 2 and 3 is needed to maintain the integrity of the precedence relations. Activity 
durations are indicated on the arcs of part (b) of the figure. 



[Figure: (a) AoN representation; (b) AoA representation, with activity durations on the arcs.]
3. The earliest and latest event times $t_j(E)$, $t_j(L)$ are shown next to each node j in
part (b) of the figure of Example 2, where $t_4(L) = 14$, which is 2 units more than
$t_4(E) = 12$. Here all nodes have the same slack of 2, which provides no information
on the various activity floats, given in the following table. Since TF(i,j) > 0 for all
activities (i,j), there are no critical activities and no critical path: a small delay in
any activity will not delay the completion time of the project. The critical path can be
determined if instead $t_4(L) = t_4(E) = 12$. Then the critical path is given by [1, 2, 3, 4].


activity    TF    SF    FF    IF
(1,2)        2     0     0    -2
(1,3)        5     3     3     1
(2,4)        3     1     1    -1
(3,4)        2     0     0    -2


4. The following figure gives a project with six activities in AoN representation. There
is a single resource, with availability of 6 units. The duration of each activity and the
required quantity of the resource are indicated next to each activity (node). The CP
(based solely on durations) is [1, 3, 5, 6] of duration 5. If the integrity of the CP is
maintained as long as possible, then activity 4 must be inserted before activity 6 (thus 
breaking the continuity of the CP), which is then followed by activity 2, as shown in 
part (b) of the figure. The total duration of the project under this schedule is 11 time 
units. 

Now consider the schedule shown in part (c) of the figure, in which the CP is split 
after activity 1; the total duration of the project is thereby reduced to only 8 time units. 



[Figure: (a) AoN network, each node labeled (duration, resource); (b), (c) alternative schedules.]
5. The project of the following figure is shown in AoA mode, with the duration of
each activity written beside each arc. The payment shown next to a node is the income
accrued (if positive) or expense incurred (if negative) at the time of realization of that
event (node). The CP is [1, 3, 4] with duration 11. Ignoring the time value of money (i.e.,
assuming a discount factor $\beta = 1$) gives 1000 as the estimate of project profit. Assuming
a discount factor $\beta = 0.99$ and that activities are done as early as possible to maintain
the CP, the estimate of project profitability shrinks to
$-5000(0.99)^2 + 3000(0.99)^8 + 3000(0.99)^{11} = 553.75$.

Now suppose that the schedule of activities is modified as follows: delay activity (1, 2)
to complete at time $t_2 = 4$ (instead of 2); do activity (1, 3) as early as possible
to complete at time $t_3 = 8$; and do activity (2, 4) as early as possible (after the
realization of node 2) to complete at time $t_4 = 12$. Then the project profitability increases
to $-5000(0.99)^4 + 3000(0.99)^8 + 3000(0.99)^{12} = 624.41$. Note that the increase in project
profitability comes as a consequence of ignoring the CP, and in fact delaying the project
beyond its normal duration.


[Figure: AoA network with activity durations on the arcs and payments at the nodes.]



6. A project involving five activities is shown in the following figure in AoA mode. 


[Figure: AoA network with five activities.]

Each activity (arc) a is labeled with $(u_a, \ell_a, k_a)$, where $k_a$ is the marginal cost of
reducing duration from the normal time $u_a$. Next to each node j is its earliest time of
realization $t_j(E)$ under normal activity durations. The following table summarizes the
breakpoints of the resulting piecewise linear cost function.


breakpoint   duration   marginal cost   cumulative cost
    1           11            1                 0
    2           10            2                 1
    3            9            3                 3
    4            8            4                 6
    5            4            5                22
    6            3            ∞                27
The function itself is shown in the following figure. With the complete cost function
in hand it is easy to answer various questions. For example, the least additional cost 
required to reduce the project duration from its normal value 11 to 7 is seen to be 10. 
Alternatively, if 6 additional units of money are available, then the maximum reduction 
achievable in the project duration is 3 units of time (from 11 to 8). 
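
The two questions answered at the end of Example 6 can be reproduced from the breakpoint table alone by linear interpolation. The following Python sketch is one way to do so; the names `crash_cost` and `shortest_duration` are illustrative assumptions, and the breakpoints are the (duration, cumulative cost) pairs of the table.

```python
# (duration, cumulative cost) breakpoints from the table of Example 6.
BREAKPOINTS = [(11, 0), (10, 1), (9, 3), (8, 6), (4, 22), (3, 27)]


def crash_cost(duration, breakpoints=BREAKPOINTS):
    """Least additional cost to achieve the given project duration."""
    pts = sorted(breakpoints, reverse=True)  # decreasing duration
    if not pts[-1][0] <= duration <= pts[0][0]:
        raise ValueError("duration outside the achievable range")
    for (d_hi, c_lo), (d_lo, c_hi) in zip(pts, pts[1:]):
        if d_lo <= duration <= d_hi:
            slope = (c_hi - c_lo) / (d_hi - d_lo)  # marginal cost on this segment
            return c_lo + slope * (d_hi - duration)


def shortest_duration(budget, breakpoints=BREAKPOINTS):
    """Shortest project duration achievable with the given extra budget."""
    if budget < 0:
        raise ValueError("budget must be nonnegative")
    pts = sorted(breakpoints, reverse=True)
    for (d_hi, c_lo), (d_lo, c_hi) in zip(pts, pts[1:]):
        if budget < c_hi:
            # Budget runs out inside this segment.
            slope = (c_hi - c_lo) / (d_hi - d_lo)
            return d_hi - (budget - c_lo) / slope
    return pts[-1][0]  # budget suffices for full crashing


print(crash_cost(7))         # cost of reducing the duration from 11 to 7
print(shortest_duration(6))  # duration reachable with 6 units of money
```

Because the cost function is convex (the marginal costs increase as the duration shrinks), interpolating between adjacent breakpoints gives the least cost for any intermediate duration.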



15.4.2 PROBABILISTIC ACTIVITY NETS 

The CPM model can be extended to incorporate uncertainty or randomness. If the 
durations of activities are random variables, then the network is a PERT (Program 
Evaluation and Review Technique) model. Alternatively, the very undertaking of an 
activity may be determined by chance and this consideration has led to the development 
of GAN (Generalized Activity Network) models. 

Definitions: 

A probabilistic activity net is a directed network in which some or all of the param- 
eters, including the realization of the activities, are probabilistically known. 

In a PERT model, activity durations are random variables. The duration of activity
a has expected value $\mu_a$ and variance $\sigma_a^2$.

Let $P(\tau)$ be the probability that the project is completed by time $\tau$.

The criticality index of a path Q in the network is the probability that Q is a critical 
path in any realization of the project. 

The criticality index of an activity a is the probability that a lies on a critical path 
in any realization of the project. 

A GAN model is a probabilistic activity net with conditional progress and probabilistic 
realization of activities. 

If X is a standard normal deviate (§7.3.1), then its (cumulative) distribution function
is denoted by $\Phi(x) = \Pr\{X \le x\}$.
Facts:

1. The original PERT model evolved in the late 1950s from the U.S. Navy’s efforts to
plan and accelerate the Polaris submarine missile project.

2. A detailed account of the original PERT model, its analysis, and the criticisms levied
against it is found in [El77, Chapter 3].

3. Estimation of the exact probability distribution function (pdf) of the project
duration is an extremely difficult problem due to the nonindependence of the paths leading
from the initial node to the terminal node.

4. The original PERT model suggested substituting $\mu_a$ for each activity duration and
then proceeding with the standard CPM calculations to determine a critical path $Q^*$ in
the resulting deterministic network.

5. The pdf of the duration of the project can then be approximated using a normal
distribution having mean $\hat\mu_{Q^*} = \sum_{a \in Q^*} \mu_a$ and variance
$\hat\sigma^2_{Q^*} = \sum_{a \in Q^*} \sigma_a^2$. The normal
approximation increases in validity as the number of activities in the path $Q^*$ increases.

6. The probability $P(\tau)$ of project completion by time $\tau$ can be approximated using
$P(\tau) = \Phi((\tau - \hat\mu_{Q^*}) / \hat\sigma_{Q^*})$.

7. The value $\hat\mu_{Q^*}$ always underestimates the exact mean project duration (often,
seriously). No equivalent statement can be made about the variance estimate $\hat\sigma^2_{Q^*}$ except
that it is often a gross approximation of the exact variance.

8. PERT analysis goes one step further and uses an approximation to the expected
value and the variance of each activity, based on the assumption that each activity
duration follows a beta distribution (§7.3.1). In particular, the variance is approximated
by $\frac{1}{36}(\text{range})^2$. These additional assumptions render the procedure even more suspect.

9. An immediate consequence of randomness in the activity durations is that (virtually)
any path can be the CP in some realization of the project. Thus, the criticality index of
a path and the criticality index of an activity are more meaningful concepts. See [Wi92]
for a critique of the latter.

10. In general, it is extremely difficult to determine the exact values of the criticality
indices analytically. Monte Carlo sampling is typically used to estimate these values.

11. Since the early days of PERT, significant strides have been made in estimating
the various parameters in the PERT model. The approaches can be classified into the
categories of exact, approximating, and bounding procedures. See [El89, Ka92].

12. The concept of a uniformly directed cutset has been used to evaluate some common
network performance criteria under the assumption of exponentially distributed activity
durations [KuAd86]. Attempts to extend the concept to applications in optimal resource
allocation have had limited success thus far.

13. The restriction of GANs to “exclusive-or” type nodes renders the network a
graphical representation of a semi-Markov process. The resulting GERT (Graphical
Evaluation and Review Technique) model has been expanded into SLAM II, an extremely
powerful discrete event simulation language.

14. The analysis of stochastic activity nets with exclusive-or type nodes (STEOR-nets)
is thoroughly discussed in [Ne90].
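
The normal approximation of Facts 5 and 6 amounts to summing means and variances along the deterministic critical path and evaluating the standard normal distribution function. A minimal Python sketch follows; the function names and the three (mean, variance) pairs are hypothetical, chosen only so that the path has mean 10 and variance 1, the PERT estimates quoted in Example 1 of this section.

```python
import math


def standard_normal_cdf(x):
    """Distribution function of a standard normal deviate."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def pert_completion_probability(path, tau):
    """PERT estimate of the probability of completion by time tau.

    path is a list of (mean, variance) pairs, one per activity on the
    deterministic critical path.
    """
    mean = sum(m for m, _ in path)
    variance = sum(v for _, v in path)
    return standard_normal_cdf((tau - mean) / math.sqrt(variance))


# Hypothetical activity parameters: total mean 10, total variance 1.
critical_path = [(4.0, 0.25), (3.0, 0.5), (3.0, 0.25)]
print(round(pert_completion_probability(critical_path, 12), 4))
```

Only the path totals enter the estimate, so any activity parameters with the same sums give the same value of $P(\tau)$.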

Examples: 

1. The following figure shows a project with six activities whose durations are random
variables that assume discrete values with equal probabilities. For example, activity
(1, 2) has duration 1, 2, or 5 with probability 1/3 each. The exact distribution of project
completion time (secured by complete enumeration of the 324 realizations) is shown
in the following table, from which it is seen that the true mean project duration is
$\mu = 12.315$ and the true standard deviation is $\sigma = 2.5735$. The probability that the
project duration is no more than 12 time units is P(12) = 0.4815. The PERT estimates
of these same parameters, based on the deterministic critical path [1, 2, 3, 4], are
$\hat\mu_{Q^*} = 10$, $\hat\sigma_{Q^*} = 1$, and P(12) = 0.9772.


[Figure: project network with six activities and discrete random durations.]

duration   frequency   relative frequency
   17          36            0.1111
   15          12            0.0370
   14          48            0.1481
   13          72            0.2222
   12          42            0.1296
   11          30            0.0926
   10          36            0.1111
    9          28            0.0864
    8          10            0.0309
    7           6            0.0185
    6           2            0.0062
    5           2            0.0062
 total        324            1.0000



2. The paths from initial node 1 to terminal node 4 for the project in the figure of
Example 1 are: $Q_1 = [1, 2, 4]$, $Q_2 = [1, 2, 3, 4]$, $Q_3 = [1, 4]$, $Q_4 = [1, 3, 4]$. The following
table lists the frequency and relative frequency that each path $Q_i$, or combination of
paths, is a critical path.

The path criticality indices are then easily determined from this table. For example,
the criticality index of $Q_1$ is 0.3951 (= 0.3457 + 0.0309 + 0.0185), and of $Q_4$ is
0.2592 (= 0.1975 + 0.0185 + 0.0432). The criticality index of each activity can be easily
determined from the criticality indices of the paths. For instance, the criticality index
of activity (1, 2), which lies on paths $Q_1$ and $Q_2$, is 0.7284 (= 1 − 0.0741 − 0.1975).
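
The summary statistics quoted in Example 1 can be recomputed directly from its frequency table. The following Python sketch does so; the variable names are illustrative, and the standard deviation is taken over the full enumeration (a population rather than a sample standard deviation), which is an assumption about how the quoted value was obtained.

```python
import math

# duration -> frequency, from the table of Example 1 (324 realizations).
distribution = {17: 36, 15: 12, 14: 48, 13: 72, 12: 42, 11: 30,
                10: 36, 9: 28, 8: 10, 7: 6, 6: 2, 5: 2}

total = sum(distribution.values())
mean = sum(d * f for d, f in distribution.items()) / total
variance = sum(f * (d - mean) ** 2 for d, f in distribution.items()) / total
std_dev = math.sqrt(variance)
# Probability that the project finishes within 12 time units.
p_12 = sum(f for d, f in distribution.items() if d <= 12) / total

print(total, round(mean, 3), round(std_dev, 4), round(p_12, 4))
```

Comparing these values with the PERT estimates in Example 1 shows how far the normal approximation along a single path can be from the exact distribution.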


15.4.3 COMPLEXITY ISSUES 
Facts: 

1. The AoN representation of a project is essentially unique. 

2. The AoA representation is not unique because of the necessity to introduce dummy 
activities (e.g., to maintain the integrity of the precedence relations). 

3. Construction of the AoA representation can be carried out with different objectives
in mind: to minimize the number of nodes, to minimize the number of dummy activities, 
or to minimize the complexity index of the resulting AoA network [MiKaSt93]. 

4. Analytical solutions to optimization problems for project networks often proceed by
conditioning upon certain activities, and then removing the conditioning through either 
enumeration or multiple integration. Minimizing the computing effort then involves 
minimizing the number of activities on which such conditioning takes place. 

5. If the network is series-parallel then no conditioning is required and its analysis is
straightforward, though it may be computationally demanding. 

6. If the network is not series-parallel, then the minimum number of activities for 
conditioning can be secured by the optimal node reduction procedure of [BeKaSt92], 
which has polynomial complexity. 

7. Patterson [Pa83] collected a set of 110 standard test problems, useful for comparing
alternative solution procedures. These problems have been supplanted by a more recent 
set of test problems [KaSpDr92]. 

8. Several measures of the complexity of a project network were proposed in the 1960s, 
with questionable validity. The significance of the complexity index [BeKaSt92] in 
accounting for the difficulty in analysis is discussed in [DeHe96].


15.5 GAME THEORY 

Games, mathematical models of conflict or bargaining, can be classified in three ways: 
by mood of play (noncooperative or cooperative), by field of application (e.g., biology or 
economics), and by mathematical structure (e.g., discrete, continuous, or differential). 
Correspondingly, game theory is a vast and diverse subject with different traditions in 
each of many specialties. 

This section discusses discrete games, in which finitely many strategies are avail- 
able to finitely many players. Combinatorial and other games form largely separate 
disciplines to which appropriate references appear in §15.5.4. 
15.5.1 NONCOOPERATIVE GAMES


This section discusses noncooperative games involving a finite number of players. Col- 
lusion among the players is not allowed in these types of games. Such games can model 
a wide variety of situations, as indicated in §15.5.4. 

Definitions: 

An n-player game $\Gamma$ in extensive form consists of:

• a set {1, . . . , n} ∪ {0} of n decisionmakers (or players) augmented by a fictitious
player, called 0 (or chance), whose actions are random

• a tree, in which each nonterminal vertex represents a decision point for some
player, whose possible actions correspond to arcs emanating from the vertex

• a payoff function that assigns an n-vector to each terminal vertex

• a partition of the nonterminal vertices into n + 1 vertex sets, one for each player
and for chance

• a subpartition of each player’s vertex set into subsets (information sets), such
that no vertex follows another in the same subset and all vertices in a subset
are followed by the same number of arcs

• a probability distribution on arcs emanating from any chance vertex.

A subgame of $\Gamma$ is a game whose tree is a subtree of the tree for $\Gamma$. A subgame is
proper if the information set that contains its root contains no other vertices.

A game is finite if its tree is finite. 

A game has perfect information if all information sets contain a single vertex; otherwise, 
it has imperfect information. 

A game has complete information if all players know the entire extensive form including 
all terminal payoffs; otherwise it has incomplete information. 

A pure strategy is a function that maps each of a player’s information sets to an 
emanating arc. 

An n-person game in normal (or strategic) form consists of a set $N = \{1, 2, \ldots, n\}$
of players, a set $S_k$ of possible pure strategies for each $k \in N$, and a payoff function
$f = (f_1, f_2, \ldots, f_n)$ that assigns $f_k(w)$ to Player k for every pure strategy combination
$w = (w^1, w^2, \ldots, w^n)$, where $w^k \in S_k$. Payoffs are computed by taking expected values
over distributions associated with chance vertices in the corresponding extensive form.

Let $D \subseteq S_1 \times S_2 \times \cdots \times S_n$ be the set of all possible pure strategy combinations w.

Let $w \,\|\, \bar{w}^k$ denote the joint pure strategy combination that is identical to w except for
the strategy of Player k:

\[ w \,\|\, \bar{w}^k = (w^1, \ldots, w^{k-1}, \bar{w}^k, w^{k+1}, \ldots, w^n). \]

$w^* \in D$ is a Nash equilibrium pure strategy combination (or simply equilibrium)
if, for every $k \in N$, $f_k(w^*) \ge f_k(w^* \,\|\, \bar{w}^k)$ holds for all $\bar{w}^k \in S_k$. (J. F. Nash, born 1928)
Let E denote the set of all such equilibria.

For $k \in N$ define the function $m_k$ that minimizes $f_k(w)$ over components of w that k
does not control:

\[ m_k(\bar{w}^k) = \min_{\{w \,\mid\, w^k = \bar{w}^k\}} f_k(w). \]

If $\tilde{w}^k$ maximizes $m_k(\bar{w}^k)$, then $\tilde{w}^k$ is a max-min strategy for k and
$\tilde{f}_k = m_k(\tilde{w}^k)$ is the corresponding max-min payoff.

Let $D^* = \{ w \in D \mid f_k(w) \ge \tilde{f}_k \text{ for all } k \in N \}$.

The strategy combination w is individually rational for all players if $w \in D^*$.

The combination $w \in D$ is group rational (or Pareto-optimal) if no $\bar{w} \in D$ exists
such that $f_k(\bar{w}) \ge f_k(w)$ for all $k \in N$ and $f_i(\bar{w}) > f_i(w)$ for some $i \in N$.

Let P denote the set of all Pareto-optimal w. The set $P^* = P \cap D^*$ is the bargaining
set and each $w \in P^*$ is a cooperative strategy combination.

An equilibrium is subgame perfect if its restriction to any proper subgame is also an
equilibrium. Let $E_S$ denote the set of subgame perfect equilibria.
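
For a small game in normal form, the sets defined above can be found by direct enumeration. The following Python sketch computes the pure-strategy equilibria, the max-min payoffs, the individually rational combinations, the Pareto-optimal combinations, and the bargaining set. The two-player game used (a prisoner's dilemma with strategies "C" and "D") and all function and variable names are hypothetical illustrations, and D is assumed to be the full Cartesian product of the strategy sets.

```python
from itertools import product


def analyze(strategies, payoff):
    """Enumerate E, max-min payoffs, D*, P, and P* for a finite game.

    strategies[k] lists the pure strategies of player k (0-based);
    payoff maps each strategy combination (a tuple) to a payoff tuple.
    """
    players = range(len(strategies))
    D = list(product(*strategies))

    def deviate(w, k, s):
        # Combination identical to w except that player k plays s.
        return w[:k] + (s,) + w[k + 1:]

    E = [w for w in D
         if all(payoff[w][k] >= payoff[deviate(w, k, s)][k]
                for k in players for s in strategies[k])]
    maxmin = [max(min(payoff[w][k] for w in D if w[k] == s)
                  for s in strategies[k]) for k in players]
    D_star = [w for w in D
              if all(payoff[w][k] >= maxmin[k] for k in players)]

    def dominated(w):
        return any(all(payoff[v][k] >= payoff[w][k] for k in players)
                   and any(payoff[v][k] > payoff[w][k] for k in players)
                   for v in D)

    P = [w for w in D if not dominated(w)]
    P_star = [w for w in P if w in D_star]
    return E, maxmin, D_star, P, P_star


# Hypothetical payoffs: (payoff to player 1, payoff to player 2).
strategies = [["C", "D"], ["C", "D"]]
payoff = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

E, maxmin, D_star, P, P_star = analyze(strategies, payoff)
print(E, maxmin, D_star, P, P_star)
```

In this hypothetical game the unique equilibrium is not in the bargaining set, so reducing E to its intersection with $P^*$ would leave nothing to select.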

Facts: 

1. Information sets are constructed so that in making a decision a player knows the 
identity of the information set, but not the particular vertex of the set at which the 
decision is being made. 

2. At an equilibrium $w^* \in D$, no $k \in N$ has a unilateral incentive to depart from $(w^*)^k$
if each $j \in N$, $j \ne k$, holds fast to $(w^*)^j$.

3. $w \,\|\, w^k = w$.

4. Different equilibria can yield identical outcomes. 

5. The bargaining set can also be defined with “threat” strategies in lieu of max-min 
(or “security”) strategies as criteria of individual rationality. Context determines which 
definition is apt. 

6. If E is a singleton, or if all elements of E yield the same outcome (see Example 7),
then the game is usually regarded as solved. 

7. In general, however, E may either be empty or yield a multiplicity of outcomes (see 
Example 8). 

8. A sufficient condition for $E \ne \emptyset$ in a finite game is that information be perfect
(although E need not be computable by all players unless information is also complete). 
The above condition is not necessary; see Examples 7 and 8. 

9. If E yields a multiplicity of outcomes, then an equilibrium selection criterion is
necessary. One criterion is to reduce E to $E \cap P^*$, thus preferring cooperative equilibria (of
a noncooperative game) to noncooperative equilibria. Another criterion is to reduce E
to $E \cap E_S$.

10. Rationales for the above criteria are discussed in [Me93]. Other equilibrium selec- 
tion criteria are discussed in [Fr90] and [My91]. 

11. The equilibrium selection problem is one of the important unsolved problems of
game theory; see [BiKiTa93]. 

Examples: 

1. A university (Player 3) must offer a faculty position to either or both of two
individuals, a distinguished researcher (Player 1) and a younger colleague in the same area
(Player 2), each of whom can say either YES or NO to an offer but cannot communicate
with the other. The payoff to Player i = 1, 2 (in well-being) is $\sigma_i$ (> 0) for an offer,
$b_i$ (> $\sigma_i$) for an appointment, and $B_i$ (> $b_i$) if both are appointed. To the university,
hiring Player 1 alone is worth 4 (in prestige); but hiring both merits 3, hiring neither
is worth 2, and hiring Player 2 alone merits zero, because appointing Player 2 prevents
the appointment of another distinguished researcher. The university hides from each
candidate whether it has made an offer to the other. The extensive form of this game
is shown in the following figure. Each player has a single information set (denoted by
a rectangle). There are no chance vertices and no proper subgames. The payoffs to
Players 1, 2, and 3 are indicated by the 3-vector at each terminal vertex of the tree.




2. Suppose in Example 1 that the university now reveals to whom it has made an offer.
Also, the university need not offer the position to either candidate this year, in which
case a single individual is appointed next year and chance decides with equal probability
which current candidate the appointee matches in caliber, giving the university a payoff
of $0.5 \times 4 + 0.5 \times 0 = 2$. The extensive form of this game is shown in the following figure.
Player 1 has information sets $I$, $J$ whereas Player 2 has information sets $K$, $L$. There
is a single chance vertex. Information sets $I$, $J$, $K$ each contain the root of a proper
subgame.

3. The figures of Examples 1 and 2 are finite games of imperfect information since in
both cases Player 2 has an information set with more than one vertex. Each game has
incomplete information if players know only their own terminal payoffs.

4. In the figure of Example 2, Player 2 can say YES or NO at each of $K$ or $L$. Hence
Player 2 has four possible strategies: YKNL (yes if $K$, no if $L$), NKYL, an unconditional
YES, and an unconditional NO. Likewise, Player 1 has four strategies: YINJ, NIYJ,
YES, and NO.

5. The following table depicts the strategic form of the figure of Example 1 as a
3-dimensional array. The strategy sets are $S_1$ = {YES, NO}, $S_2$ = {YES, NO}, and
$S_3$ = {BOTH, 1 ONLY, 2 ONLY}. The payoff function is defined by $f_1$(YES, NO, 1 ONLY) = $b_1$,
$f_2$(NO, NO, 2 ONLY) = $c_2$, $f_3$(YES, YES, BOTH) = 3, etc. Player 1's
strategies correspond to rows, Player 2's strategies correspond to columns, and Player 3's
strategies correspond to arrays.

[Table: strategic form of Example 1, with rows YES, NO (Player 1), columns YES, NO (Player 2), and arrays BOTH, 1 ONLY, 2 ONLY (Player 3).]


6. The following table depicts the strategic form of the figure of Example 2 as a
3-dimensional array. Player 1's strategies correspond to rows, Player 2's strategies
correspond to columns, and Player 3's strategies correspond to arrays.

The strategy sets now are $S_1$ = {YES, YINJ, NIYJ, NO}, $S_2$ = {YES, YKNL,
NKYL, NO}, and $S_3$ = {BOTH, 1 ONLY, 2 ONLY, NEITHER}. The sets $S_1$, $S_2$ contain
more strategies than in Example 5 because Players 1 and 2 have better information:
the game is less imperfect. The payoff to Player 3 from NEITHER is an expectation
over arcs emanating from the game's single chance vertex.

7. {YES, YES, 1 ONLY} and {YES, NO, 1 ONLY} are the equilibria of Example 1; 
both yield the same outcome, namely, Player 1 is hired without an offer to Player 2. 
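
The equilibria listed in Example 7 can be checked by brute force. The sketch below is only an illustration: it encodes the payoff rules stated in Example 1 (0 for no offer, $c_i$ for a declined offer, $b_i$ for a sole appointment, $B_i$ if both are appointed), and the numeric values c = 1, b = 2, B = 3 and all function names are assumptions chosen for the example, not part of the original text.

```python
from itertools import product

# Hypothetical numeric payoffs with 0 < c < b < B (assumed for illustration,
# and taken to be the same for Players 1 and 2).
c, b, B = 1, 2, 3

S1 = ["YES", "NO"]
S2 = ["YES", "NO"]
S3 = ["BOTH", "1 ONLY", "2 ONLY"]


def payoffs(s1, s2, s3):
    """Payoff triple (Player 1, Player 2, Player 3) for a pure strategy combination."""
    offer1 = s3 in ("BOTH", "1 ONLY")
    offer2 = s3 in ("BOTH", "2 ONLY")
    hired1 = offer1 and s1 == "YES"
    hired2 = offer2 and s2 == "YES"

    def candidate(offered, hired, other_hired):
        if not offered:
            return 0          # no offer received
        if not hired:
            return c          # offer received but declined
        return B if other_hired else b

    if hired1 and hired2:
        university = 3
    elif hired1:
        university = 4
    elif hired2:
        university = 0
    else:
        university = 2
    return (candidate(offer1, hired1, hired2),
            candidate(offer2, hired2, hired1),
            university)


def is_equilibrium(profile):
    """True if no single player gains by a unilateral change of strategy."""
    strategy_sets = (S1, S2, S3)
    base = payoffs(*profile)
    for k in range(3):
        for alternative in strategy_sets[k]:
            deviation = list(profile)
            deviation[k] = alternative
            if payoffs(*deviation)[k] > base[k]:
                return False
    return True


equilibria = [p for p in product(S1, S2, S3) if is_equilibrium(p)]
print(equilibria)
```

Under these assumed values the search returns exactly the two strategy combinations named in Example 7; any other values with 0 < c < b < B should give the same result, since only the ordering of the payoffs matters.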

8. Example 2 has 14 equilibria: namely, (YINJ, YES, BOTH), (YINJ, NKYL, BOTH),
and all strategy combinations of the form (YES, • , 1 ONLY), (NIYJ, • , 1 ONLY), or 
(NO, •, NEITHER), where • denotes any of the four strategies of Player 2. Eight of 
these 14 equilibria correspond to the equilibrium outcome of Example 1, whereas the 
other six correspond to two different outcomes. 

9. For Example 1, $m_3$(BOTH) = 0 = $m_3$(2 ONLY) and $m_3$(1 ONLY) = 2, so Player 3's
max-min strategy is 1 ONLY, with security level 2. For $k = 1, 2$, $m_k$(YES) = 0 = $m_k$(NO),
so the security level of Player $k$ is 0. So
$D^* = D - \{$(YES, YES, 2 ONLY), (NO, YES, 2 ONLY), (NO, YES, BOTH)$\}$.

10. For Example 1, $P$ = {(YES, YES, BOTH), (YES, NO, BOTH)} = $P^*$.

11. In Example 2, the equilibria (YINJ, YES, BOTH) and (YINJ, NKYL, BOTH) are
not subgame perfect because in the subgame beginning at $J$ they would require Player 1
to say NO, which would be irrational. (Player 1's threat to say NO, unless Player 3
makes an offer to BOTH, is not credible because Player 3 has a first mover advantage.)

12. While reducing $E$ to $E \cap P^*$ eliminates equilibria of the form (NO, •, NEITHER)
in Example 2, it is also possible that $E$ and $P$ are disjoint (as in Example 1).


© 2000 by CRC Press LLC 




Strategic form of Example 2 (table for Example 6). Rows are Player 1's strategies and
columns are Player 2's strategies; each entry lists the payoffs to Players 1, 2, and 3.

BOTH
          YES              YKNL             NKYL             NO
YES       (B_1, B_2, 3)    (b_1, c_2, 4)    (B_1, B_2, 3)    (b_1, c_2, 4)
YINJ      (B_1, B_2, 3)    (b_1, c_2, 4)    (B_1, B_2, 3)    (b_1, c_2, 4)
NIYJ      (c_1, b_2, 0)    (c_1, c_2, 2)    (c_1, b_2, 0)    (c_1, c_2, 2)
NO        (c_1, b_2, 0)    (c_1, c_2, 2)    (c_1, b_2, 0)    (c_1, c_2, 2)

1 ONLY
          YES              YKNL             NKYL             NO
YES       (b_1, 0, 4)      (b_1, 0, 4)      (b_1, 0, 4)      (b_1, 0, 4)
YINJ      (c_1, 0, 2)      (c_1, 0, 2)      (c_1, 0, 2)      (c_1, 0, 2)
NIYJ      (b_1, 0, 4)      (b_1, 0, 4)      (b_1, 0, 4)      (b_1, 0, 4)
NO        (c_1, 0, 2)      (c_1, 0, 2)      (c_1, 0, 2)      (c_1, 0, 2)

2 ONLY
          YES              YKNL             NKYL             NO
YES       (0, b_2, 0)      (0, b_2, 0)      (0, c_2, 2)      (0, c_2, 2)
YINJ      (0, b_2, 0)      (0, b_2, 0)      (0, c_2, 2)      (0, c_2, 2)
NIYJ      (0, b_2, 0)      (0, b_2, 0)      (0, c_2, 2)      (0, c_2, 2)
NO        (0, b_2, 0)      (0, b_2, 0)      (0, c_2, 2)      (0, c_2, 2)

NEITHER
          YES              YKNL             NKYL             NO
YES       (0, 0, 2)        (0, 0, 2)        (0, 0, 2)        (0, 0, 2)
YINJ      (0, 0, 2)        (0, 0, 2)        (0, 0, 2)        (0, 0, 2)
NIYJ      (0, 0, 2)        (0, 0, 2)        (0, 0, 2)        (0, 0, 2)
NO        (0, 0, 2)        (0, 0, 2)        (0, 0, 2)        (0, 0, 2)


13. While reducing $E$ to $E \cap Eg$ eliminates equilibria of the form (YINJ, YES, BOTH)
and (YINJ, NKYL, BOTH) in Example 2, it is also possible that $E = Eg$ (as in
Example 1, where there are no proper subgames).


15.5.2 MATRIX AND BIMATRIX GAMES 

This subsection discusses two-player noncooperative games. Such games can be repre- 
sented in normal form by a pair of matrices. 

Definitions: 

Suppose $S_1 = \{1, \ldots, r\}$ and $S_2 = \{1, \ldots, s\}$.

The $r \times s$ payoff matrices $A = (a_{ij})$ and $B = (b_{ij})$, with $a_{ij} = f_1(i,j)$ and $b_{ij} = f_2(i,j)$,
define a bimatrix game.

The game is zero-sum if $a_{ij} + b_{ij} = 0$ for all $i \in S_1$, $j \in S_2$. The game is symmetric
if $r = s$ and $B = A^T$. In either case, the game is completely determined by $A$ and is
called a matrix game.

For Player 1, $i \in S_1$ is dominated by $i' \in S_1$ if $a_{i'j} \geq a_{ij}$ for all $j \in S_2$, with strict
inequality for at least one $j$. For Player 2, $j \in S_2$ is dominated by $j' \in S_2$ if $b_{ij'} \geq b_{ij}$
for all $i \in S_1$, with strict inequality for at least one $i$.

Let $1_k$ denote the $k$-dimensional vector in which every entry is 1, and let $X_k$ denote the
$(k-1)$-dimensional unit simplex: $X_k = \{ (x_1, \ldots, x_k) \mid x 1_k^T = 1, \ x \geq 0 \}$.

A mixed strategy for Player 1 is a vector $p = (p_1, \ldots, p_r) \in X_r$, where $p_i$ is the
probability that Player 1 selects $i \in S_1$. Similarly, a mixed strategy for Player 2 is
$q = (q_1, \ldots, q_s) \in X_s$, where $q_j$ is the probability that Player 2 selects $j \in S_2$.

In a mixed strategy combination $(p, q) \in X_r \times X_s$, the expected payoffs to Players
1 and 2, respectively, are given by $\phi_1(p, q) = pAq^T$ and $\phi_2(p, q) = pBq^T$.

The pair $(p^*, q^*) \in X_r \times X_s$ is a Nash equilibrium mixed strategy combination,
or simply an equilibrium in mixed strategies, if $\phi_1(p^*, q^*) \geq \phi_1(p, q^*)$ for all $p \in X_r$
and $\phi_2(p^*, q^*) \geq \phi_2(p^*, q)$ for all $q \in X_s$. If the game is zero-sum, then $p^*$ is called an
optimal strategy for Player 1 and $q^*$ is called an optimal strategy for Player 2.

Facts: 

1. Every bimatrix game has at least one equilibrium in mixed strategies. 

2. All equilibria in mixed strategies of a zero-sum game yield the same expected payoffs,
$v$ to Player 1 and $-v$ to Player 2; $v$ is known as the value of the game.

3. The value $v$ of a zero-sum game and a pair $(p^*, q^*)$ of optimal strategies can always
be computed by solving a dual pair of linear programming (LP) problems (§15.1). The
primal LP problem finds $p$ to maximize $v$ subject to $A^T p \geq v 1_s$, $p \in X_r$, whereas the
dual LP problem finds $q$ to minimize $v$ subject to $Aq \leq v 1_r$, $q \in X_s$.

4. Player 1 can achieve the value $v$ of a zero-sum game with a mixed strategy that
attaches zero probability to any dominated pure strategy. Likewise, Player 2 can
achieve $-v$ by playing dominated pure strategies with zero probability.

5. Graphical methods can be used to compute efficiently all equilibria of zero-sum 
games where r = 2 or s = 2, or of matrix games (of either type) where r = s = 3; see 
[Dr81], [Ow95], and [Me92]. There is no general method for computing all equilibria. 

6. The definition of mixed strategy and the existence of equilibria are readily extended 
to n-player games. This result was one of the fundamental contributions to game theory 
for which John Nash was awarded the 1994 Nobel Prize in Economic Science. 

Examples: 

1. Two advertising agencies are involved in a campaign to promote competing
beverages. The payoffs of various promotional strategies are shown in this table:

          j     1      2
     i         old    new
     1  old     0     -2
     2  new    -2     -1
     3  diet    3     -3

The promotional strategies for the first agency are to: stress the old formula, advertise
a new formula, or advertise a diet drink. The second agency has the possible strategies:
stress the old formula, or advertise a new formula. The payoffs in this case indicate the
net change in millions of sales gained (by Advertiser 1). For example, if the first agency
promotes a diet drink while the other agency promotes the old formula, three million
more drinks will be sold. On the other hand, if the other agency happens to promote
the new formula, then the first agency will end up losing three million unit sales to the
second agency.

2. An investor has just taken possession of jewels worth $45,000 and must store them
for the night in one of two locations (A, B). The safe in location A is relatively secure,
with a probability 1/15 of being opened by a thief. The safe at location B is not as secure
as the safe at location A, and has a probability 1/5 of being opened. A notorious thief
is aware of the jewels, but doesn't know where they will be stored. Nor is it possible
for the thief to visit both locations in one evening. This is a (symmetric) zero-sum
game between the investor (Player 1), who selects where to keep the jewels, and the
thief (Player 2), who decides which safe to try. If the investor puts the jewels in the
most secure location (A) and the jewel thief goes to this location, the expected loss in
this case is (1/15)(−45,000) + (14/15)(0) = −3,000. The other entries of the payoff matrix in
the following table are computed similarly, and are expressed in thousands of dollars
(to the investor).

          j     1      2
     i          A      B
     1   A     -3      0
     2   B      0     -9

No pure strategy combination is a Nash equilibrium, since it is always tempting for one
player to defect from the current strategy. However, there is a Nash equilibrium mixed
strategy combination: p* = (3/4, 1/4) = q*, with value v = −$2,250 to the investor. The
mixed strategy p* is found by solving the following linear program, in which Player 1
wants to find the largest value of v so that he is guaranteed of receiving at least v
(regardless of what Player 2 does). The associated optimal dual LP solution gives q*.

maximize:    v

subject to:  −3 p_1 + 0 p_2 ≥ v
              0 p_1 − 9 p_2 ≥ v
              p_1 + p_2 = 1
              p_1, p_2 ≥ 0
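
For a 2 × 2 zero-sum game without a saddle point, the linear program above reduces to making the opponent indifferent between its two pure strategies, which gives a closed form. The following sketch uses that shortcut rather than a general LP solver; the function name `solve_2x2_zero_sum` is hypothetical, and exact fractions are used so that the result can be compared with the values quoted in Example 2.

```python
from fractions import Fraction


def solve_2x2_zero_sum(A):
    """Optimal mixed strategies and value of a 2x2 zero-sum game with no saddle point.

    A is the payoff matrix to Player 1 (the row player).
    """
    (a, b), (c, d) = [[Fraction(x) for x in row] for row in A]
    maximin = max(min(a, b), min(c, d))   # best guaranteed pure payoff for Player 1
    minimax = min(max(a, c), max(b, d))   # best pure cap that Player 2 can enforce
    if maximin == minimax:
        raise ValueError("pure-strategy saddle point exists; use it directly")
    denom = a - b - c + d                 # nonzero whenever there is no saddle point
    p1 = (d - c) / denom                  # makes Player 2 indifferent between columns
    q1 = (d - b) / denom                  # makes Player 1 indifferent between rows
    value = (a * d - b * c) / denom
    return (p1, 1 - p1), (q1, 1 - q1), value


# Payoffs to the investor, in thousands of dollars (Example 2).
p_star, q_star, v = solve_2x2_zero_sum([[-3, 0], [0, -9]])
print(p_star, q_star, v)
```

For larger games this shortcut does not apply, and the primal and dual LPs of Fact 3 would be handed to an LP solver instead.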

3. The zero-sum game of chump is played between two camels, a dromedary (Player 1)
and a bactrian (Player 2). Player $k$ must simultaneously flash $F_k$ humps and guess
that its opponent will flash $G_k$. Possible strategies $(F_k, G_k)$ satisfy $0 \leq F_1, G_2 \leq 1$ and
$0 \leq F_2, G_1 \leq 2$. If both players are right or wrong, then the game is a draw; if one is
wrong and the other is right, then the first pays $F_1 + F_2$ piasters to the second. The
following table shows the strategy sets and corresponding payoffs $a_{ij}$ to Player 1.

          j      0      1      2      3      4      5
     i         (0,0)  (0,1)  (1,0)  (1,1)  (2,0)  (2,1)
     0  (0,0)    0      0     -1      0     -2      0
     1  (0,1)    0      0      0      1     -2      0
     2  (0,2)    0      0     -1      0      0      2
     3  (1,0)    1      0      0     -1      0     -3
     4  (1,1)    0     -1      2      0      0     -3
     5  (1,2)    0     -1      0     -2      3      0

The first row and column can be deleted from the full payoff matrix because (0,0) is
dominated by (0,1) for both players (Fact 4). Thus it suffices to analyze the reduced
payoff matrix in which $r = s = 5$. The value of the game is $-\frac{6}{35}$ for Player 1 ($\frac{6}{35}$
for Player 2). Optimal strategies $p^* = (\frac{3}{35}, \frac{18}{35}, \frac{8}{35}, \frac{6}{35}, 0)$ and $q^* = (\frac{4}{7}, \frac{2}{7}, 0, \frac{3}{35}, \frac{2}{35})$ are
found by linear programming (Fact 3). Note that strategies $i = 5$ and $j = 3$ have zero
probability at this equilibrium, despite being undominated.
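
A pair of strategies can be confirmed as optimal without rerunning the linear program: it suffices that no pure strategy of either player does better against the other's mixed strategy. The sketch below applies that test to the reduced 5 × 5 matrix, with entries copied from the table as printed; the variable names are illustrative only.

```python
from fractions import Fraction as Fr

# Reduced payoff matrix to Player 1 (rows i = 1..5, columns j = 1..5),
# copied from the table as printed after deleting row 0 and column 0.
A = [
    [0, 0, 1, -2, 0],
    [0, -1, 0, 0, 2],
    [0, 0, -1, 0, -3],
    [-1, 2, 0, 0, -3],
    [-1, 0, -2, 3, 0],
]

p = [Fr(3, 35), Fr(18, 35), Fr(8, 35), Fr(6, 35), Fr(0)]
q = [Fr(4, 7), Fr(2, 7), Fr(0), Fr(3, 35), Fr(2, 35)]

# Expected payoff to Player 1 of each pure row against q, and of p against each pure column.
row_payoffs = [sum(A[i][j] * q[j] for j in range(5)) for i in range(5)]
col_payoffs = [sum(p[i] * A[i][j] for i in range(5)) for j in range(5)]
value = sum(p[i] * row_payoffs[i] for i in range(5))

# Optimality: Player 1 cannot exceed the value against q,
# and Player 2 cannot push Player 1 below the value against p.
is_optimal = max(row_payoffs) == value == min(col_payoffs)
print(value, is_optimal)
```

The same check also shows why row 5 and column 3 receive zero probability: each gives its player a strictly worse payoff than the value against the opponent's optimal strategy.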

4. The symmetric game of four ways [Me92] is played by two left-turning motorists
who arrive simultaneously from opposite directions at a 4-way junction. Each has three
pure strategies: the first is to go, the second to wait, and the third a conditional strategy
of going only if the other appears to be waiting. It takes 2 seconds for one motorist
to cross the junction while the other waits. If initially both either go or wait, then
both motorists incur an extra "posturing" delay of either 3 or 2 seconds, respectively.
Also, the one who ultimately waits is equally likely to be either player. For example,
$a_{11} = 0.5 \times (-3 - 2) + 0.5 \times (-3) = -4$ and $a_{22} = 0.5 \times (-2 - 2) + 0.5 \times (-2) = -3$.
This game has the payoff matrix

          ( -4    0    0 )
     A =  ( -2   -3   -2 ) .
          ( -2    0   -4 )

There are infinitely many equilibria in mixed strategies; these are described in the
following table, where $0 \leq a \leq 1$ and $\frac{1}{2} \leq b \leq 1$.

     p*                    q*
     (1, 0, 0)             (0, a, 1 − a)
     (0, a, 1 − a)         (1, 0, 0)
     (0, 1, 0)             (b, 0, 1 − b)
     (b, 0, 1 − b)         (0, 1, 0)
     (1/11)(6, 2, 3)       (1/11)(6, 2, 3)
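
The families of equilibria in the table can be spot-checked numerically. The sketch below is an illustration under the definition of a symmetric game given earlier ($B = A^T$); the helper name `is_equilibrium` is hypothetical. A pair (p, q) is accepted when neither player has a pure strategy that beats the expected payoff at (p, q).

```python
from fractions import Fraction as Fr

# Payoff matrix of four ways (strategies: go, wait, conditional).
A = [[-4, 0, 0],
     [-2, -3, -2],
     [-2, 0, -4]]


def is_equilibrium(p, q):
    """Check the Nash condition for the symmetric game with B = A^T."""
    # Player 1's payoff for pure strategy i against q.
    u1 = [sum(A[i][j] * q[j] for j in range(3)) for i in range(3)]
    # Player 2's payoff for pure strategy j against p is sum_i p_i * b_ij = sum_i p_i * a_ji.
    u2 = [sum(A[j][i] * p[i] for i in range(3)) for j in range(3)]
    v1 = sum(p[i] * u1[i] for i in range(3))
    v2 = sum(q[j] * u2[j] for j in range(3))
    return v1 == max(u1) and v2 == max(u2)


go = (Fr(1), Fr(0), Fr(0))
wait = (Fr(0), Fr(1), Fr(0))
interior = (Fr(6, 11), Fr(2, 11), Fr(3, 11))

for a in (Fr(0), Fr(1, 3), Fr(1)):
    print("a =", a, is_equilibrium(go, (Fr(0), a, 1 - a)))
for b in (Fr(1, 4), Fr(1, 2), Fr(3, 4), Fr(1)):
    print("b =", b, is_equilibrium(wait, (b, Fr(0), 1 - b)))
print("interior:", is_equilibrium(interior, interior))
```

The value b = 1/4 is included deliberately: it lies outside the stated range for b, and the check rejects it.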


15.5.3 CHARACTERISTIC-FUNCTION GAMES 

When there exists a binding agreement among all players to cooperate, attention shifts 
from strategies to the bargaining strengths of coalitions. These strengths are assumed 
to be measured in terms of a freely transferable benefit (e.g., money or time) and players 
are assumed to seek a fair distribution of the total benefit available. Also, without loss 
of generality, the benefit of cooperation will be taken as the savings in costs. 

Definitions: 

A coalition is a subset $S$ of $N = \{1, \ldots, n\}$; equivalently, $S \in 2^N$.

The cost associated with coalition $S$ is denoted $c(S)$.

Let $\mathcal{R}_+$ denote the set of nonnegative reals. The characteristic function $\bar{\nu}: 2^N \to \mathcal{R}_+$
assigns to each $S$ its cooperative benefit, using $\bar{\nu}(S) = \max\{0, \sum_{i \in S} c(\{i\}) - c(S)\}$.

A characteristic-function game, or c-game, is the pair $\Gamma = (N, \bar{\nu})$.

The game $\Gamma$ is inessential if $\bar{\nu}(N) = 0$. If $\bar{\nu}(N) > 0$ then the game is essential, with
normalized characteristic function $\nu: 2^N \to [0,1]$ defined by $\nu(S) = \bar{\nu}(S) / \bar{\nu}(N)$.

The game $\Gamma$ is convex if $\nu(S \cup T) \geq \nu(S) + \nu(T) - \nu(S \cap T)$ for all $S, T \in 2^N$.

Let $X = X_n$ be the $(n-1)$-dimensional unit simplex (§15.5.2). Any $x \in X$ is called an
imputation; it allocates $x_i$ of the total normalized benefit $\nu(N) = 1$ to Player $i$. An
imputation is unreasonable if it allocates more to some $i \in N$ than the maximum that
$i$ could contribute to any coalition $T - \{i\}$ by joining it.

The reasonable set is $X_{RS} = \{ x \in X \mid x_i \leq \max_{T \supseteq \{i\}} [\nu(T) - \nu(T - \{i\})] \text{ for all } i \in N \}$.

For any $x \in X$ and $S \in 2^N$, the excess of coalition $S$ at $x$ is $e(S, x) = \nu(S) - \sum_{i \in S} x_i$.

The core of $\Gamma$ is $C = \{ x \in X \mid e(S, x) \leq 0 \text{ for all } S \in 2^N \}$.

The marginal worth of Player $i$ to the coalition $T - \{i\}$ is $\nu(T) - \nu(T - \{i\})$.

The Shapley value of a c-game is the imputation $x^S = (x_1^S, x_2^S, \ldots, x_n^S)$ defined by
$x_i^S = \frac{1}{n!} \sum_{T \in \Pi^i} (|T| - 1)! \, (n - |T|)! \, (\nu(T) - \nu(T - \{i\}))$, where $\Pi^i = \{ T \in 2^N \mid T \supseteq \{i\} \}$.

Facts: 

1. An imputation is both individually rational and group rational (see §15.5.1). 

2. Convexity is a sufficient (but not necessary) condition for the core to exist. 

3. If $C \neq \emptyset$ then $C \subseteq X_{RS}$.

4. If C contains a single imputation, then the c-game is usually regarded as solved. 

5. In general, C may either be empty (see Example 1) or contain infinitely many 
imputations (see Example 2). 

6. If C contains infinitely many imputations, then there are several ways to single one 
out as the solution to the c-game. One approach is to define a “center” of C, which 
leads to the important concept of the nucleolus [Me92] . 

7. Every c-game solution concept assumes that players have agreed to enter coalition N. 
If its order of formation were known, players could be allocated their marginal worths; 
in general, however, this order of formation (and hence marginal worth) is a random 
variable. 

8. If all orders of formation of $N$ are equally likely, then the probability that Player $i$
enters $N$ by joining the coalition $T - \{i\}$ is $\frac{(|T| - 1)! \, (n - |T|)!}{n!}$.

9. The Shapley value distinguishes a single imputation as the solution of a c-game 
by allocating to players the expected values of their marginal worths, based on the 
assumption that all orders of formation of N are equally likely. 

10. $x^S \in X_{RS}$.

11. $x^S \in C$ if $\Gamma$ is convex.

Examples: 

1. In the c-game log-hauling [Me92], three lone drivers of pickup trucks discover a pile
of 150 logs too heavy for any one to lift. Players 1, 2, and 3 can haul up to 45, 60,
and 75 logs, respectively. Thus $\bar{\nu}(\{1,2\}) = 105$, $\bar{\nu}(\{1,3\}) = 120$, $\bar{\nu}(\{2,3\}) = 135$, and
$\bar{\nu}(\{1,2,3\}) = 150$, so that $\nu(\{1,2\}) = \frac{7}{10}$, $\nu(\{1,3\}) = \frac{4}{5}$, and $\nu(\{2,3\}) = \frac{9}{10}$. This
c-game is not convex; for example, if $S = \{1,2\}$ and $T = \{2,3\}$, then $1 = \nu(S \cup T) <
\nu(S) + \nu(T) - \nu(S \cap T) = \frac{8}{5}$. Also, $C = \emptyset$.

2. The c-game car pool [Me92] is played by three co-workers whose office is $d$ miles
from their residential neighborhood, shown in the following figure.

[Figure: the residential neighborhood of the three co-workers and the $d$-mile route to the office.]

Driving to work costs $k$ dollars per mile, and the shortest route is always used. The benefit
of cooperation is car pool savings, leading to the characteristic function in this table.

     S          c(S)        $\bar{\nu}(S)$     $\nu(S)$              e(S, x) for d = 1
     ∅          0           0            0                  0
     {1}        (4 + d)k    0            0                  -x_1
     {2}        (3 + d)k    0            0                  -x_2
     {3}        (3 + d)k    0            0                  x_1 + x_2 - 1
     {1,2}      (4 + d)k    (3 + d)k     (3 + d)/(3 + 2d)   4/5 - x_1 - x_2
     {1,3}      (6 + d)k    (1 + d)k     (1 + d)/(3 + 2d)   x_2 - 3/5
     {2,3}      (6 + d)k    dk           d/(3 + 2d)         x_1 - 4/5
     {1,2,3}    (7 + d)k    (3 + 2d)k    1                  0

Because $x_3 = 1 - x_1 - x_2$ ($\geq 0$), a set of imputations is determined by its projection
onto $x_3 = 0$. In these terms, for $d = 1$, $X$ is the largest triangle in the following figure,
$X_{RS}$ is the shaded hexagon, and $C$ is the shaded quadrilateral. Here $C \subset X_{RS} \subset X$
because the c-game is convex (for all $d \geq 0$).

3. For the c-game in Example 2, it is easy enough to locate a center for $C$; see the
figure for Example 2, where the nucleolus is marked by a dot.

4. In Example 1, the six possible orders of formation of $N$ are 123, 132, 213, 231, 312,
321. Thus, the Shapley value is the imputation $x^S = (\frac{17}{60}, \frac{1}{3}, \frac{23}{60})$; see the following table.

     i    T ∈ Π^i     ν(T) − ν(T − {i})    probability i enters N     x_i^S
                                           by joining T − {i}
     1    {1}         0                    1/3
          {1,2}       7/10                 1/6                        17/60
          {1,3}       4/5                  1/6
          {1,2,3}     1/10                 1/3

     2    {2}         0                    1/3
          {1,2}       7/10                 1/6                        1/3
          {2,3}       9/10                 1/6
          {1,2,3}     1/5                  1/3

     3    {3}         0                    1/3
          {1,3}       4/5                  1/6                        23/60
          {2,3}       9/10                 1/6
          {1,2,3}     3/10                 1/3
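
Fact 9 describes the Shapley value as an expected marginal worth over equally likely orders of formation, which translates directly into a short computation. The sketch below averages marginal worths over all orders for the log-hauling game of Example 1; the function name `shapley` and the dictionary layout are assumptions made for the illustration.

```python
from fractions import Fraction as Fr
from itertools import permutations


def shapley(players, nu):
    """Average each player's marginal worth over all orders of formation of N."""
    orders = list(permutations(players))
    total = {i: Fr(0) for i in players}
    for order in orders:
        coalition = frozenset()
        for i in order:
            joined = coalition | {i}
            total[i] += nu[joined] - nu[coalition]   # marginal worth of i on entering
            coalition = joined
    return {i: total[i] / len(orders) for i in players}


# Normalized characteristic function of log-hauling (Example 1).
nu = {
    frozenset(): Fr(0),
    frozenset({1}): Fr(0), frozenset({2}): Fr(0), frozenset({3}): Fr(0),
    frozenset({1, 2}): Fr(7, 10),
    frozenset({1, 3}): Fr(4, 5),
    frozenset({2, 3}): Fr(9, 10),
    frozenset({1, 2, 3}): Fr(1),
}

x = shapley((1, 2, 3), nu)
print(x)
```

Enumerating all n! orders is only practical for small n; the closed-form sum over coalitions in the definition gives the same result with fewer terms.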



5. By a calculation very similar to that laid out in the table of Example 4, the Shapley
value for Example 2 with $d = 1$ is the imputation $x^S = (\frac{7}{15}, \frac{11}{30}, \frac{1}{6})$. Because the c-game is convex,
$x^S \in C$. This is illustrated in the previous figure, where $x^S$ is marked by a cross.
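
The chain from costs to characteristic function to Shapley value to core membership can be traced in a few lines. The sketch below does this for the car pool game with d = 1; the choice k = 1 is an arbitrary scale (it cancels on normalization), and all names are illustrative rather than taken from the text. It uses the closed-form sum from the definition of the Shapley value and then checks that every excess is nonpositive.

```python
from fractions import Fraction as Fr
from itertools import combinations
from math import factorial

d, k = 1, 1          # d = 1 as in the figure; k = 1 is an assumed cost scale
N = (1, 2, 3)
n = len(N)

# Costs c(S) from the table of Example 2.
cost = {
    frozenset({1}): (4 + d) * k,
    frozenset({2}): (3 + d) * k,
    frozenset({3}): (3 + d) * k,
    frozenset({1, 2}): (4 + d) * k,
    frozenset({1, 3}): (6 + d) * k,
    frozenset({2, 3}): (6 + d) * k,
    frozenset({1, 2, 3}): (7 + d) * k,
}
coalitions = [frozenset(c) for r in range(1, n + 1) for c in combinations(N, r)]


def benefit(S):
    """Unnormalized cooperative benefit: savings over driving separately."""
    return max(0, sum(cost[frozenset({i})] for i in S) - cost[S])


# Normalized characteristic function.
nu = {S: Fr(benefit(S), benefit(frozenset(N))) for S in coalitions}
nu[frozenset()] = Fr(0)


def shapley(i):
    """Closed-form Shapley value: weighted sum of marginal worths over coalitions containing i."""
    total = Fr(0)
    for T in coalitions:
        if i in T:
            weight = Fr(factorial(len(T) - 1) * factorial(n - len(T)), factorial(n))
            total += weight * (nu[T] - nu[T - {i}])
    return total


x = {i: shapley(i) for i in N}
excess = {S: nu[S] - sum(x[i] for i in S) for S in coalitions}
in_core = all(e <= 0 for e in excess.values())
print(x, in_core)
```

Changing `d` reruns the whole calculation for another commuting distance.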


15.5.4 APPLICATIONS 

Discrete (noncooperative or characteristic-function) games have numerous applications 
and merge with other categories of games not examined here. The references in the 
following table provide sources for the definitions, concepts, and applications of such 
games. This table also lists some representative areas of application of game theory. 


category and references | selected applications | remarks
characteristic function games [Me92,93], [Ow95], [Wa88] | airport landing fees, voting, water resources | utility is usually assumed to be transferable: in essence, players value benefits identically
classical game theory [LuRa57], [voMo53] | microeconomics, parlor games | economic (as opposed to evolutionary) game theory
combinatorial games [Gu91] | chess, go, nim, other parlor games | two players; complete, perfect information; no chance moves; zero-sum
continuous games [Dr81], [Fr90] | duels, military combat, oligopoly theory | a discrete game with mixed strategies is a special case of a continuous game
cooperative games in strategic form (as opposed to c-games) [Fr90], [Me92] | wage bargaining, motoring behavior | agreements among players are binding
differential games [BaHa94], [Me93] | fishery and forest management | extension of optimal control theory
economic game theory [Fr90], [My91] | microeconomics | equilibria are the result of rational thought processes
evolutionary game theory [Cr92], [Ma82], [Me92] | animal behavior | equilibria are the result of natural selection or equivalent populational processes
iterated games [Fr90], [Me92] | rationality of cooperation | often infinitely many iterations
resource games [Me93] | fisheries, forestry, water resources | discrete, continuous, and differential games all used
symmetric matrix games [Cr92], [Ma82], [Me92] | evolutionary game theory | dynamical systems theory provides a rationale for strategic equilibrium
zero-sum matrix games [Dr81] | military science |


15.6 SPERNER'S LEMMA AND FIXED POINTS

A fixed point of a function from a set $X$ to itself is a point of $X$ that is mapped into
itself. Brouwer (1912) proved that every continuous mapping $f$ on the unit ball has a
fixed point. Sperner (1928) gave an elegant proof of Brouwer's fixed-point theorem using
a combinatorial lemma known today as Sperner's lemma. This lemma has a number of
applications to economics, nonlinear programming, and game theory.

15.6.1 SPERNER’S LEMMA 

Sperner's lemma is a combinatorial result applicable to certain triangulations of a
$p$-dimensional convex set, in which the vertices of the triangulation are given labels from
the set $\{1, 2, \ldots, p+1\}$.

Definitions: 

The $p+1$ points $x_1, x_2, \ldots, x_{p+1} \in \mathcal{R}^n$ are said to be in general position if the vectors
$x_2 - x_1, x_3 - x_1, \ldots, x_{p+1} - x_1$ are linearly independent (§6.1.3).

The set $C \subseteq \mathcal{R}^n$ is convex if for all $x, y \in C$ and $0 \leq \lambda \leq 1$, $\lambda x + (1 - \lambda) y \in C$.

The convex hull of a finite set of points $v_1, \ldots, v_{p+1} \in \mathcal{R}^n$ is the set
$\langle v_1, \ldots, v_{p+1} \rangle = \{ \sum_{i=1}^{p+1} \lambda_i v_i \mid \sum_{i=1}^{p+1} \lambda_i = 1, \ \lambda_i \geq 0 \}$.

A $p$-simplex $\sigma$ is the convex hull of $p + 1$ points $x_1, \ldots, x_{p+1} \in \mathcal{R}^n$ in general position.

The vertices of the $p$-simplex $\sigma = \langle x_1, \ldots, x_{p+1} \rangle$ are the points $x_1, \ldots, x_{p+1}$. The
face $\tau = \langle x_{j_1}, \ldots, x_{j_k} \rangle$ of $\sigma$ is the simplex spanned by the subset $\{x_{j_1}, \ldots, x_{j_k}\}$ of
$\{x_1, \ldots, x_{p+1}\}$. Write $\tau \prec \sigma$ when $\tau$ is a face of $\sigma$.

A simplicial complex $K$ is a collection of simplices satisfying:

• if $\sigma \in K$ and $\tau \prec \sigma$ then $\tau \in K$;

• if $\sigma, \tau \in K$ intersect, their intersection is a face of each.

The $p$-skeleton of a simplicial complex $K$ is the set of all simplices of dimension $p$ or
less. The 0-skeleton is the vertex set, denoted $V(K)$.

A simplicial subdivision $T$ of a simplex $\sigma$ is a collection of simplices $\{ \tau_j \mid 1 \leq j \leq m \}$
satisfying:

• $\bigcup_{j=1}^{m} \tau_j = \sigma$;

• the intersection of any two $\tau_j$ is either empty or a face of each.

A simplicial subdivision $T'$ of a simplicial complex $K$ is a refinement of the simplicial
subdivision $T$ of $K$ if every simplex of $T$ is a union of simplices of $T'$.

Given a simplicial subdivision $T$ of the $p$-simplex $\sigma = \langle x_1, \ldots, x_{p+1} \rangle$, a proper labeling
of $T$ is a mapping $\ell: V(T) \to \{1, 2, \ldots, p + 1\}$ satisfying:

• $\ell(x_m) = m$ for $m = 1, \ldots, p + 1$;

• if vertex $v$ lies on a face $\langle x_{k_1}, \ldots, x_{k_q} \rangle$ of $\sigma$, then $\ell(v) \in \{k_1, \ldots, k_q\}$.

Here $\{1, 2, \ldots, p + 1\}$ is the label set, and if $\ell(v) = k$ then $v$ receives the label $k$.

A distinguished simplex is a $p$-simplex that receives all $p + 1$ labels 1 through $p + 1$.

Facts: 

1. The convex hull $\langle v_1, \ldots, v_{p+1} \rangle$ is the intersection of all convex sets containing the
points $v_1, \ldots, v_{p+1}$.

2. The dimension of any $p$-simplex is $p$.

3. A $p$-simplex contains $2^{p+1} - 1$ simplices of dimension $p$ or less.

4. Sperner's lemma (1928): Every properly labeled subdivision of a simplex $\sigma$ has an
odd number of distinguished simplices. (E. Sperner, 1906-1980)

5. Algorithm 1 gives a method for finding a distinguished triangle in a properly labeled 
subdivision of a triangle T . Each iteration of the outer loop starts at a distinguished 
1-simplex and traces out a path, terminating either at a distinguished 2-simplex or at 
an outer edge of T. 

6. Since there are an odd number of distinguished 1-simplices along the bottom of T 
(Fact 4) and since each “failed” outer loop iteration produces a path joining two such 
distinguished 1-simplices, Algorithm 1 must eventually produce a path terminating at 
a distinguished 2-simplex. 





Algorithm 1: Distinguished simplex of a 2-simplex.

input: properly labeled subdivision of triangle T
output: a distinguished triangle of T

{Outer loop}
find a distinguished 1-simplex τ along the bottom of T
  {Inner loop}
  repeat
    if the unique triangle containing τ is distinguished then stop
    else proceed to a neighboring triangle whose common edge is distinguished
  until either a distinguished triangle is found or the search leads to the bottom
    edge of T
continue outer loop with a new distinguished 1-simplex τ


Examples: 

1. A 0-simplex is a point, a 1-simplex is a line segment, and a 2-simplex is a triangle 
(interior included). A 3-simplex includes the vertices, edges, faces, and interior of a 
tetrahedron. See the following figure. 



2. The 0-skeleton of a simplex $\sigma$ is its vertex set; the 1-skeleton of $\sigma$ is the edge set
of $\sigma$ including their endpoints; if $\sigma$ is a 3-simplex, the 2-skeleton is the union of the
faces of $\sigma$.


3. Part (a) of the following figure shows a simplicial subdivision of a 2-simplex. The
subdivision in part (b) of the figure is not simplicial because $\tau_1 \cap \tau_3$ is not a face of the
simplex $\tau_3$.




4. The following figure shows a proper labeling of a simplicial subdivision of a 1-simplex. 
A distinguished 1-simplex is a subinterval that receives both the labels 1 and 2. In this 
example, there are five such 1-simplices, an odd number (as guaranteed by Fact 4). 


[Figure: a subdivided 1-simplex whose vertices are labeled, from left to right, 1 2 1 1 2 2 1 2.]
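
In one dimension, Sperner's lemma reduces to counting label changes along a line, which is easy to check directly. The sketch below counts the distinguished 1-simplices for the labeling of Example 4; the function name `distinguished_intervals` is hypothetical.

```python
def distinguished_intervals(labels):
    """Return index pairs (k, k+1) of subintervals whose endpoints carry both labels 1 and 2."""
    return [(k, k + 1)
            for k in range(len(labels) - 1)
            if {labels[k], labels[k + 1]} == {1, 2}]


# Vertex labels of Example 4, left to right; the endpoints carry labels 1 and 2,
# as a proper labeling of a 1-simplex requires.
labels = [1, 2, 1, 1, 2, 2, 1, 2]
found = distinguished_intervals(labels)
print(len(found), found)
```

Any labeling that starts with 1 and ends with 2 must switch labels an odd number of times, which is the one-dimensional case of Fact 4.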

5. The following figure shows a proper labeling of a simplicial subdivision of a 2-simplex.
There is one distinguished 2-simplex, receiving all three labels, which is shown shaded 
in the figure. If the vertex in the interior of the triangle is instead labeled 3, then there 
will be three distinguished 2-simplices, still an odd number. 



6. Several possible paths from executing Algorithm 1 are displayed in the following 
figure. The rightmost path terminates in a bottom edge, while the leftmost path leads 
to a distinguished triangle. Note that there are three distinguished triangles in this 
example, an odd number as required by Sperner’s lemma. 



15.6.2 FIXED-POINT THEOREMS 

Fixed-point theorems have applicability to a number of problems in economics, as well 
as to game theory and optimization. 

Definitions: 

The point $x \in B$ is a fixed point of the mapping $f: B \to B$ if $f(x) = x$.

The mapping $f$ defined on a subset $X$ of a normed space $B$ is a contraction if there
is some $0 \leq \beta < 1$ such that $\|f(x) - f(y)\| \leq \beta \|x - y\|$ for all $x, y \in X$.

The function $F$ is a set mapping on $X$ if $F(x)$ is a nonempty subset of $X$ for all
$x \in X$.

The set mapping $F$ is convex if $F(x)$ is a convex subset of $X$ for all $x \in X$.

Facts: 

1. Fixed-point theorems can be used to demonstrate the existence of economic equi- 
libria, solutions to a system of nonlinear equations, and Nash equilibria in two-person 
nonzero-sum games. 

Algorithm 2: Fixed point of a p-simplex.

input: function $f$ defined on a $p$-simplex $\sigma$
output: fixed point $x^* \in \sigma$

construct a sequence of subdivisions $\{ T_n \mid n \geq 1 \}$ such that $T_{n+1}$ refines $T_n$
label the vertex set of $T_n$ as in Fact 5
for each subdivision $T_n$ find a distinguished simplex $\tau_n$
$\bigcap_n \tau_n$ contains the desired fixed point $x^*$


2. Brouwer fixed-point theorem I: Every continuous mapping $f: \sigma \to \sigma$ where $\sigma$ is a
$p$-simplex has a fixed point. (L. E. J. Brouwer, 1881-1966)

3. For a simplex $\sigma = \langle x_1, x_2, \ldots, x_{p+1} \rangle$ and a continuous mapping $f: \sigma \to \sigma$, let
$f(\sum_{k=1}^{p+1} \lambda_k x_k) = \sum_{k=1}^{p+1} \mu_k x_k$. Then $\sum_{k=1}^{p+1} \mu_k = \sum_{k=1}^{p+1} \lambda_k = 1$.

4. Relative to the mapping $f: \sigma \to \sigma$, define $T_j = \{ \sum_{k=1}^{p+1} \lambda_k x_k \mid \mu_j \leq \lambda_j \}$. Then a
fixed point of $f$ is any point belonging to $\bigcap_{j=1}^{p+1} T_j$.

5. Suppose an interior vertex $v$ of a subdivision $T$ of $\sigma$ is labeled with $j$ provided that
$v \in T_j$, and suppose a vertex $v$ belonging to a face $\langle x_{k_1}, x_{k_2}, \ldots, x_{k_t} \rangle$ is labeled with any
one of the labels $k_1, \ldots, k_t$. Then a fixed point of $f$ occurs in a distinguished simplex
of $\sigma$.

6. Algorithm 2, based on Sperner's lemma (§15.6.1), produces a sequence of points
converging to a fixed point of a $p$-simplex $\sigma$.

7. Brouwer fixed-point theorem II: Every continuous mapping from a convex compact
set $B \subset \mathcal{R}^n$ into itself has a fixed point.

8. Contraction mapping theorem: Every contraction $f: X \to X$ has a fixed point.
The fixed point is the limit of the sequence $\{ f(x_n) \mid n \geq 0 \}$, where $x_0$ is an arbitrary
element of $X$ and $x_{n+1} = f(x_n)$.

9. Kakutani fixed-point theorem: Let $X \subset \mathcal{R}^n$ be a convex and compact set and
suppose that $F$ is a convex mapping on $X$. If the graph $\{ (x, y) \mid y \in F(x) \} \subset \mathcal{R}^{2n}$ is
closed, then there exists a point $x^* \in X$ such that $x^* \in F(x^*)$.

10. Schauder fixed-point theorem: Every continuous mapping $f$ on a convex compact
subset $X$ in a normed space $B$ has a fixed point.

11. Reference [Bo85] gives applications of fixed-point theorems to determining market 
equilibria, maximal elements of binary relations, solutions to complementarity problems, 
as well as solutions to various types of games (cooperative and noncooperative). 

Examples: 

1. The real-valued function $f(x) = 1 - x$ is a mapping from the 1-simplex $\sigma = [0,1]$
to itself. It is not a contraction since $|f(x) - f(y)| = |(1 - x) - (1 - y)| = 1 \cdot |x - y|$
holds for all $x, y \in \sigma$, so $\beta \geq 1$. The function $f$ has a fixed point at $x = \frac{1}{2}$. However,
the iterative procedure in Fact 8 will not generally locate this fixed point. For example,
any starting value $x_0 \neq \frac{1}{2}$ produces the sequence $x_1 = f(x_0) = 1 - x_0$, $x_2 = f(x_1) = x_0$,
$x_3 = f(x_2) = 1 - x_0$, and so forth, with no limiting value.

2. The following figure shows a real-valued function $f: \sigma \to \sigma$ defined over the 1-simplex
$\sigma = [0, 1]$. This function has three fixed points, identified by the intersection of the graph
of $f$ with the dashed line $y = x$. The sets $T_j$ of Fact 4 relative to $x_1 = 0$, $x_2 = 1$ are
also indicated in the figure, and it is verified that $T_1 \cap T_2$ contains the three fixed points
of $f$. A subdivision of $\sigma$ into five subintervals is shown in the figure. Using Fact 5,
the associated vertices (at $x$ = 0.0, 0.2, 0.4, 0.6, 0.8, 1.0) receive the labels 1, 2, 1, 1, 2, 2
respectively, and so there are three distinguished simplices (each containing a fixed
point).



3. The real-valued function $f(x) = \frac{1}{x^2 + 4}$ is a mapping from $\mathcal{R}$ to itself. It can also be
shown to be a contraction mapping with $\beta < 1$. If the iterative procedure in Fact 8
is applied using $x_0 = 1$, then $x_1 = 0.2$, $x_2 = 0.24752$, $x_3 = 0.24623$, $x_4 = 0.24627$, and
$x_5 = 0.24627$, yielding the (approximate) fixed point $x^* = 0.24627$.
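
The iteration of Fact 8 can be sketched in a few lines of Python. The helper name `iterate` and the choice $f(x) = 1/(x^2 + 4)$ are assumptions made for illustration; that choice reproduces the iterates listed in Example 3. The function $f(x) = 1 - x$ of Example 1 is included to show the oscillation that can occur when the mapping is not a contraction.

```python
def iterate(f, x0, steps):
    """Return [x_0, x_1, ..., x_steps], where x_{n+1} = f(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

# Assumed contraction (illustrative only); it matches the values in Example 3.
def contraction(x):
    return 1.0 / (x * x + 4.0)

print([round(x, 5) for x in iterate(contraction, 1.0, 5)])

# Example 1: f(x) = 1 - x is not a contraction, and the iterates oscillate.
print(iterate(lambda x: 1.0 - x, 0.25, 4))
```

The starting value 0.25 in the second call is an arbitrary choice; any starting value other than 0.5 alternates between two values in the same way.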

4. Perron’s theorem: This theorem (§6.5.5), which assures that every positive matrix
has a positive eigenvalue-eigenvector pair, can be proved using the fixed-point theorem
in Fact 2. Let $A = (a_{ij})$ be an $n \times n$ matrix, with all $a_{ij} > 0$. The set
$\sigma = \{ x \in \mathcal{R}^n \mid \sum_{k=1}^{n} x_k = 1,\ x_k \ge 0 \}$ is an $(n-1)$-simplex, and the continuous function defined by
$f(x) = Ax / \|Ax\|_1$ maps $\sigma$ into itself. (Here $\|w\|_1$ is the $l_1$ norm of vector $w$; see §6.4.5.) By
Fact 2, $f$ has a fixed point $x$, so that $Ax = \|Ax\|_1 x$. Since at least one component of $x$
is positive and $A$ is positive, the vector $Ax$ has positive components. It then follows
that the eigenvalue $\|Ax\|_1$ is positive and that the corresponding eigenvector $x$ has all
positive components.
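
The mapping $f(x) = Ax / \|Ax\|_1$ used in this argument can be explored numerically. The sketch below is an illustration only: the fixed-point theorem guarantees existence, whereas the code simply applies $f$ repeatedly (in effect the power method, which is not part of the proof) to a small positive matrix chosen for the example. The function name `normalized_image` is hypothetical.

```python
def normalized_image(A, x):
    """Return (f(x), ||Ax||_1), where f(x) = Ax / ||Ax||_1."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    norm1 = sum(abs(v) for v in Ax)
    return [v / norm1 for v in Ax], norm1

A = [[2.0, 1.0], [1.0, 2.0]]   # a positive matrix chosen for illustration
x = [1.0, 0.0]                 # a vertex of the 1-simplex
for _ in range(100):
    x, eigenvalue = normalized_image(A, x)

print(x, eigenvalue)           # approximately [0.5, 0.5] and 3.0
```

For this matrix the iterates approach the positive eigenvector $(\frac{1}{2}, \frac{1}{2})$ with eigenvalue 3, and every iterate stays in the simplex.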
INTRODUCTION


Theoretical computer science is concerned with modeling computational problems and 
solving them algorithmically. It strives to distinguish what can be computed from what 
cannot. If a problem can be solved by an algorithm, it is important to know the amount 
of space and time needed. 


GLOSSARY 

abelian square: a word having the pattern $xx^p$, where $x^p$ is any permutation of the
word $x$.

acceptance probability (of an input word by a probabilistic TM): the sum of the 
probabilities over all acceptance paths of computation. 

Ackermann function : a very rapidly growing function that is recursive, but not 
primitive recursive. 

algorithm: a finite list of instructions that is supposed to accomplish a specified com- 
putation or other task. 

alphabet : a finite nonempty set whose elements are called symbols. 

ambiguous context-free grammar: a grammar whose language has a string having 
two different leftmost derivations. 

analysis of an algorithm: an estimation of its cost of execution, especially of its 
running time. 

antecedent (of a production $\alpha \to \beta$): the string $\alpha$ that precedes the arrow.

average-case running time: the expected running time of an algorithm, usually 
expressed asymptotically in terms of the input size. 

Backus-Naur (or Backus normal) form (BNF): a metalanguage for specifying
computer language syntax.

busy beaver function: the function $BB(n)$ whose value is the maximum number
of 1s that an n-state Turing machine can print and still halt.

busy beaver machine, n-state: an n-state Turing machine on the alphabet
$\Sigma = \{\#, 1\}$ that accepts an input tape filled with blanks (#s) and halts after placing a
maximum number of 1s on the tape.

cellular automaton (n-dimensional): an interconnection network in which there
is a processor at each integer lattice point of n-dimensional Euclidean space, and 
each processor communicates with its immediate neighbors. 

characteristic function (of a language): the function on strings in the alphabet for 
that language that has value yes for elements in the language, and no otherwise. 

characteristic function (of a set): the function whose value is 1 for elements of the 
set, and 0 otherwise. 

Chomsky hierarchy : four classes of grammars, with gradually increasing restrictions. 

Chomsky normal form (for a production rule): the form $A \to BC$ where $B$ and $C$
are nonterminals or the form $A \to a$ where $a$ is a terminal.
Church’s thesis (or the Church-Turing thesis): the premise that the intuitive
notion of what is computable or partially computable should be formally defined as
computable by a Turing machine.

code (for an alphabet $V$): a nonempty language $C \subseteq V^+$, such that whenever a word $w$
in $V^*$ can be written as a catenation of words in $C$, the write-up is always unique.
That is, if $w = x_1 \cdots x_m = y_1 \cdots y_n$, where $m, n \ge 1$, and $x_i, y_j \in C$, then $m = n$
and $x_i = y_i$ for $i = 1, \dots, m$.
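
The uniqueness condition can be probed on individual words by counting factorizations. The sketch below is an illustration under stated assumptions: the function name `count_factorizations` and the sample languages are hypothetical. A word with two or more factorizations shows that a language is not a code; checking finitely many words cannot by itself prove that a language is a code (a complete test, such as the Sardinas-Patterson algorithm, is needed for that).

```python
def count_factorizations(word, C):
    """Count the ways to write `word` as a catenation of words from C."""
    ways = [0] * (len(word) + 1)
    ways[0] = 1                      # the empty prefix has one factorization
    for end in range(1, len(word) + 1):
        for c in C:
            # c must be nonempty and must be a suffix of word[:end]
            if c and word.endswith(c, 0, end):
                ways[end] += ways[end - len(c)]
    return ways[len(word)]

# {a, ab, ba} is not a code: aba = a.ba = ab.a
print(count_factorizations("aba", {"a", "ab", "ba"}))      # 2
# In the prefix-free language {0, 10, 11}, this word factors in one way only.
print(count_factorizations("01011", {"0", "10", "11"}))    # 1
```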

code indicator (of a language): the sum of the code indicators of all words in the 
language. 

code indicator (of a word $w \in V^*$): the number $ci(w) = |V|^{-|w|}$.

collapse (of the polynomial hierarchy to the $i$th rank): the circumstance that $PH = \Sigma_i^P$,
for some $i \ge 0$.

common PRAM (or CRCW com ): a CRCW PRAM model in which concurrent 
writes to the same location are permitted if all processors are trying to write the 
same data. 

comparison sort: a sorting algorithm that uses only comparisons between record keys 
to determine the sorted order. 

complement (of a language $L$ over an alphabet $V$): the language $\overline{L}$, where complementation
is taken with respect to $V^*$.

C-complete language (where $C$ is a class of languages): a language $A$ such that $A$ is
C-hard and $A \in C$.

complexity (of an algorithm): an asymptotic measure of the number of operations or 
the running time needed for a complete execution; sometimes, a measure of the total 
amount of computational space needed. 

complexity (of a function) : usually, the minimum complexity of any algorithm repre- 
senting the function; sometimes, the length or complicatedness of the list of instruc- 
tions. 

complexity (of a function), Kolmogorov-Chaitin type: a measure of the minimum 
complicatedness of any algorithm representing the function, usually according to 
number of instructions in the algorithm (and not related to its running time). 

complexity class coNP: the class $\Pi_1^P$, which contains every language $A$ such that
$\overline{A} \in \Sigma_1^P$.

complexity class NP: the minimal class that contains every language that is nonde- 
terministically TM-decidable in polynomial time. 

complexity class P: the class comprising every language that is deterministically 
TM-decidable in polynomial time. 

complexity class PSPACE: the minimal class that contains every language that is 
TM-decidable in polynomial space. 

concatenation (of two languages $L_1$ and $L_2$): the set $\{ xy \mid x \in L_1, y \in L_2 \}$,
denoted $L_1 L_2$.

concatenation (of two strings): the result of appending the second string to the right 
end of the first. 

consequent (of a production $\alpha \to \beta$): the string $\beta$ that follows the arrow.

context-free (or type 2) grammar: a grammar in which the antecedent $\alpha$ of each
production $\alpha \to \beta$ is a nonterminal.

context-sensitive (or type 1) grammar $G = (N, T, S, P)$: a grammar such that
every production $\alpha \to \beta$ (except possibly $S \to \lambda$) has the form $\alpha = uAv$ and
$\beta = uxv$, for $u, v \in (N \cup T)^*$, $A \in N$, $x \in (N \cup T)^+$.

CRCW concurrent read concurrent write: a PRAM model in which concurrent 
reads from and concurrent writes to the same location are both allowed. 

CREW concurrent read exclusive write: a PRAM model in which concurrent 
reads are allowed, but not concurrent writes to the same location. 

cube (over an alphabet): a word having the pattern xxx. 

derivation (of the string $y$ from the string $x$): a sequence of substitutions, according
to the production rules, that transforms string $x$ into string $y$. The notation $x \Rightarrow^* y$
means that such a derivation exists.

emptiness problem for grammars: deciding whether the language generated by a 
grammar is empty. 

empty string: the string of length zero, that is, the string with no symbols; often
written $\lambda$.

equivalence problem for grammars: deciding whether two grammars are equiva- 
lent. 

equivalent automata: two automata that accept the same language. 

equivalent grammars : grammars that generate the same language. 

EREW exclusive read exclusive write: a PRAM model in which concurrent reads 
from and concurrent writes to the same location are not allowed. 

existential lower bound (for an algorithm): a lower bound for its number of execu- 
tion steps that holds for at least one input. 

existential lower bound (for a problem): a lower bound for every algorithm that 
could solve that problem. 

finite automaton: either a finite state recognizer or a nondeterministic finite state 
recognizer. 

finite-state recognizer ( FSR ): a model of a computer for deciding membership in a 
set. 

finite-state machine: a finite automaton or a finite transducer. 

finite-state machine with output: another name for a finite transducer.

finiteness problem (for a grammar): deciding whether the language generated by 
that grammar is finite. 

finite transducer: a model of a computer for calculating a function, like an FSR, 
except that it also produces an output string each time it reads an input symbol. 

free monoid (generated by an alphabet): the set of all strings composable from sym- 
bols in the alphabet, with the semigroup operation of string concatenation. 

frequency (of a symbol in a string): the number of occurrences of the symbol in the 
string. 

Game of Life: a 2-dimensional cellular automaton designed by John H. Conway. 

Gödel numbering (of a set): a method for encoding Turing machines as products of
prime powers; more generally, a similar one-to-one recursive function on an arbitrary
set whose image in $\mathcal{N}$ is a recursive set.

grammar: a quadruple $G = (N, T, S, P)$, where $N$ is a finite nonempty alphabet of
nonterminals, $T$ is a finite nonempty alphabet of terminals, with $N \cap T = \emptyset$, $S$ is a
nonterminal called the start symbol, and $P$ is a finite set of production rules of the
form $\alpha \to \beta$.

halting problem: the problem of designing an algorithm capable of deciding which
computations $P(x)$ halt and which do not, where $P$ is a computer program (or a
Turing machine), and $x$ is a possible input.

C-hard language ( C a class of languages) : a language A such that every language in 
class C is polynomial-time reducible to A. 

Hilbert’s tenth problem: the (recursively unsolvable) problem of deciding for an
arbitrary multivariate polynomial equation $p(x_1, \dots, x_n) = 0$ whether there exists a
solution consisting of integers.

inclusion problem for grammars: deciding whether one language is included in 
another. 

inherently ambiguous context-free language: a context-free language such that 
every context-free grammar for the language is ambiguous. 

input size: the quantity of data supplied as input to a computation. 

interconnection network model : a parallel computation model as a digraph in 
which each vertex represents a processor, and in each phase of the computational 
process, each processor communicates with its neighbors and makes a computation. 

inverse (of a morphism $h: V^* \to U^*$): the mapping $h^{-1}: U^* \to 2^{V^*}$ defined by
$h^{-1}(x) = \{ y \in V^* \mid h(y) = x \}$, $x \in U^*$.

Kleene closure (or Kleene star ) of a language L: the set of all iterated concate- 
nations of zero or more words in L, denoted L* . 

language (accepted by a machine, such as an FSR, a pushdown automaton, or a Turing 
machine): the set of all accepted strings. 

language (generated by the grammar $G$): the language $L(G) = \{ x \in T^* \mid S \Rightarrow^* x \}$
of words consisting of terminal symbols derivable from the starting symbol.

language (over an alphabet V): a subset of the free monoid V*. 

Las Vegas algorithm: an algorithm that always produces correct output, whose run- 
ning time is a random variable. 

Las Vegas to Monte Carlo transformation: the Monte Carlo algorithm obtained
by running the Las Vegas scheme for $kE[T]$ steps and halting, where $E[T]$ is the
expected Las Vegas running time.

leftmost derivation $x \Rightarrow^*_{\text{left}} y$: a derivation $x \Rightarrow^* y$ in which at each step the
leftmost nonterminal is replaced.

leftmost language (generated by the grammar $G$): the language $L_{\text{left}}(G)$ of strings
of terminals with leftmost derivations from the start symbol $S$.

length set (of a language $L$): the set $\{ |x| \mid x \in L \}$.

length-increasing (or type 1) grammar: a grammar in which the consequent $\beta$ of
each production $\alpha \to \beta$ (except $S \to \lambda$, if present) is at least as long as its antecedent $\alpha$.

linear grammar: a context-free grammar in which each production $\alpha \to \beta$ has $\alpha \in N$
and $\beta \in T^* \cup T^* N T^*$.
Mealy machine: a finite transducer whose output function always produces a single
symbol.

membership problem (for a grammar $G$): given an arbitrary string $x$, deciding
whether $x \in L(G)$.

mirror image (of a language $L$): the language $mi(L) = \{ x^R \mid x \in L \}$ obtained by
reversing every string in $L$.

Monte Carlo algorithm: an algorithm that has a bounded number of computational 
steps and might produce incorrect output with some low probability. 

Monte Carlo to Las Vegas transformation: the Las Vegas algorithm of repeatedly 
running that Monte Carlo algorithm until correct output occurs. 

Moore machine: a Mealy machine such that for every state $k$ and every pair of input
symbols $s_1$ and $s_2$, the outputs $\tau(k, s_1)$ and $\tau(k, s_2)$ are the same.

morphism (from the alphabet $V$ to the alphabet $U$): a function $s: V \to 2^{U^*}$ with $s(a)$
a singleton set for all symbols $a \in V$.

nondeterministic finite-state recognizer ( NDFSR ): a model like a finite state 
recognizer, but there may be several different states to which a transition is possible, 
instead of only one. 

nondeterministic polynomial-time computation on a TM: a computation for 
which there exists a polynomial function p(n) such that for any input of size n there 
is a computational path on the TM whose length is at most p(n) steps. 

nondeterministic Turing machine: a 5-tuple $M = (K, s, h, \Sigma, \Delta)$ otherwise like
a deterministic Turing machine, except that the transition function $\Delta$ maps each
state-symbol pair $(q, b)$ to a set of state-symbol-direction triples.

nonterminal (in a grammar): a symbol that may be replaced when a production is 
applied. 

nontrivial family of languages: a family that contains at least one language different
from $\emptyset$ and $\{\lambda\}$.

NP-complete language: a language $A$ such that $A$ is NP-hard and $A \in NP$.

NP-complete problem : a decision problem equivalent to deciding membership in an 
NP-complete language. 

NP-hard language: a language A such that every language in complexity class NP 
is polynomial-time reducible to A. 

oracle (for a language) : a machine state that decides whether or not a given string is 
in the language. 

oracle Turing machine: a 6-tuple $M = (K, s, h, \Sigma, \delta \text{ or } \Delta, L)$, equipped with a special
second tape on which it can write a string in the alphabet of an oracle for
language $L$ (which might be different from $\Sigma$). (Aside from oracle steps, it is a
Turing machine.)

palindrome: a string that is identical to its reverse. 

parallel computation model: a computational model that permits more than one 
instruction to be executed simultaneously, instead of requiring that instructions be 
executed sequentially. 

parsing a string : in theoretical computer science, a derivation. 
partial function: an incomplete rule that assigns values to some elements in its domain
but not necessarily to all of them.

partial function $\varphi_M$ induced by a TM $M$: the rule that associates to each input $v$
for which the $M$-computation halts the output $\varphi_M(v)$, and is otherwise undefined.

partial recursive function: a partial function derivable from the constant zero functions
$C_n(x_1, \dots, x_n) = 0$, the successor function $\sigma(n) = n + 1$, and the projection
functions $\pi_i^n(x_1, \dots, x_n) = x_i$, using multivariate composition, multivariate primitive
recursion, and unbounded minimalization.

pattern (over an alphabet $V$): a string of variables over that alphabet; regarded as
present in a particular word $w \in V^*$ if there exists an assignment of strings from $V^+$
to the variables in that pattern such that the word formed thereby is a substring
of $w$.

polynomial hierarchy PH: the union of the complexity classes $\Sigma_n^P$, for $n \ge 0$.

polynomial-space computation (of a function by a TM M): a computation by M 
of that function such that there exists a polynomial function p(n) such that for every 
input of size n, the calculation workspace takes at most p(n) positions on the tape. 

polynomial-time computation (of a function by a TM M): a computation by M of 
that function such that there exists a polynomial function p(n) such that for every 
input of size n, the calculation takes at most p(n) steps. 

positive closure (or Kleene plus) of a language L: the set of all iterated concate- 
nations of words in L excluding the empty word, denoted L + . 

nth power of a language: the set of all iterated concatenations $w_1 w_2 \cdots w_n$ where
each $w_i$ is a word in the language.

PRAM memory conflict: the conflict that occurs when more than one processor 
attempts concurrently to write into or read from the same global memory register. 

PRAM parallel random access machine: a model of parallel computation as a set 
of global memory registers and a set of processors, each with access to an infinite 
sequence of its own local registers. 

primitive recursion: a restricted way of defining f(n + 1) in terms of f(n). 

primitive recursive function: any function derivable from the constant zero functions
$C_k(x_1, \dots, x_k) = 0$, the successor function $\sigma(n) = n + 1$, and the projection
functions $\pi_i^n(x_1, \dots, x_n) = x_i$, using multivariate composition and multivariate primitive
recursion.
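
The way primitive recursive functions are built from the base functions can be sketched in Python. This is an illustration under stated assumptions, not a formal definition: the helper names (`zero`, `succ`, `proj`, `compose`, `prim_rec`) are hypothetical, and the recursion scheme shown, $f(0, x) = g(x)$ and $f(n+1, x) = h(n, f(n, x), x)$, is one common convention.

```python
def zero(*args):
    """Constant zero function C_k (any number of arguments)."""
    return 0

def succ(n):
    """Successor function sigma(n) = n + 1."""
    return n + 1

def proj(i, n):
    """Projection pi_i^n: return the i-th (1-based) of n arguments."""
    def pi(*args):
        assert len(args) == n
        return args[i - 1]
    return pi

def compose(f, *gs):
    """Multivariate composition: x -> f(g_1(x), ..., g_m(x))."""
    return lambda *args: f(*(g(*args) for g in gs))

def prim_rec(g, h):
    """Primitive recursion: f(0, x) = g(x), f(n + 1, x) = h(n, f(n, x), x)."""
    def f(n, *xs):
        value = g(*xs)
        for k in range(n):
            value = h(k, value, *xs)
        return value
    return f

# Addition: add(0, x) = x, add(n + 1, x) = succ(add(n, x)).
add = prim_rec(proj(1, 1), compose(succ, proj(2, 3)))
# Multiplication: mul(0, x) = 0, mul(n + 1, x) = add(mul(n, x), x).
mul = prim_rec(zero, compose(add, proj(2, 3), proj(3, 3)))

print(add(3, 4), mul(3, 4))   # 7 12
```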

probabilistic Turing machine: a nondeterministic Turing machine $M$ with exactly
two choices of a next state at each step, both with probability $\frac{1}{2}$ and independent of
all previous choices.

production rule (in a grammar) of the form $\alpha \to \beta$: a rule for making a substitution
in a string; iterative application of the production rules generates all the words
of the language of the grammar.

projection function, n-place: a function $\pi_i^n(x_1, \dots, x_n) = x_i$ that maps an n-tuple
to its $i$th coordinate.

pumping lemma: any one of several results in formal language theory concerned with 
rewriting strings. 

pushdown automaton (PDA): a (possibly non-deterministic) finite-state automaton 
equipped with an auxiliary stack. 
random access machine (RAM): a computation model with several arithmetic
registers and an infinite number of memory registers.

randomized algorithm: an algorithm that makes random choices during its execu- 
tion, guided by the output of a random (or pseudo-random) number generator. 

recursive language : a language with a decidable membership question. 

recursive set: a set whose characteristic function is recursive. 

recursively enumerable set: a set that is either empty or the image of a recursive 
function. 

reducibility in polynomial-time (of language $A$ to language $B$): the existence of a
polynomial-time computable function $f$ such that $x \in A$ if and only if $f(x) \in B$, for
each string $x$ in the alphabet of language $A$; denoted by $A \le_P B$.

reduction: a strategy for solving a problem by transforming its natural form of input 
into the input form for another problem, solving that other problem on the trans- 
formed input, and transforming the answer back into the original problem domain. 

regular expression (over an alphabet $V$): a string $w$ in the symbols of $V$ and the
special set $\{ \epsilon, ), (, +, * \}$ such that $w \in V$ or $w = \epsilon$, or (continuing recursively)
$w = (\alpha\beta)$, $(\alpha + \beta)$, or $\alpha^*$, where $\alpha$ and $\beta$ are regular expressions.

regular (or type 3) grammar: a grammar such that every production $\alpha \to \beta$ has
antecedent $\alpha \in N$ and consequent $\beta \in T \cup TN \cup \{\lambda\}$.

regular language: a language that can be obtained from elements of its alphabet V 
using finitely many times the operations of union, concatenation and Kleene star. 

regularity problem (for grammars): deciding whether L(G) is a regular language. 

reverse (of the string $x$): the string $x^R$ obtained by writing $x$ backwards.

running time: the number of primitive operation steps executed by an algorithm, usually
expressed in big-O asymptotic notation (or sometimes $\Theta$-notation) as a formula
based on the input size variables.

solvable problem: a problem that can be decided by a recursive function. 

space complexity (of an algorithm): a measure of the amount of computational space 
needed in the execution, relative to the size of the input. 

sparse language: a language $A$ for which there is a polynomial function $p(n)$ such
that for every $n \in \mathcal{N}$, there are at most $p(n)$ elements of length $n$ in $A$.

square (over an alphabet): the pattern xx , or any word having that pattern. 

square-free word: a word having no subwords with the pattern xx. 
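
The definitions of square, square-free word, and abelian square translate directly into brute-force checks. The sketch below is illustrative only; the function names `has_square` and `is_abelian_square` are hypothetical, and no attempt is made at efficiency.

```python
def has_square(word):
    """Return True if some nonempty subword of `word` has the pattern xx."""
    n = len(word)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            left = word[start:start + half]
            right = word[start + half:start + 2 * half]
            if left == right:
                return True
    return False

def is_abelian_square(word):
    """Return True if `word` is x followed by a permutation of x (x nonempty)."""
    half, odd = divmod(len(word), 2)
    return half > 0 and odd == 0 and sorted(word[:half]) == sorted(word[half:])

print(has_square("abcacb"))          # False: the word is square-free
print(has_square("abcbc"))           # True: it contains bcbc
print(is_abelian_square("abcbca"))   # True: bca is a permutation of abc
```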

start symbol (in a grammar): a designated nonterminal from which every word of the 
language is generated. 

state diagram (for an FSR): a labeled digraph whose vertices represent the states 
and whose arcs represent the transitions. 

string (accepted by an FSR or NDFSR): a string such that the automaton ends up in 
an accepting state, immediately after the last transition. 

string (accepted by a PDA): a string that, when supplied as input, ultimately can 
lead to the stack being empty and the PDA being in an acceptance state after the 
last transition. 

string (accepted by a TM M ): a string w such that M halts on input w. 
string (over an alphabet): a finite sequence of symbols from that alphabet.

substitution (for the alphabet $V$ in the alphabet $U$): a mapping $s: V \to 2^{U^*}$, which
means that each symbol $b \in V$ may be replaced by any of the strings in the set $s(b)$;
extends to strings of $V^*$.

terminal (in a grammar): a symbol that cannot be replaced by other symbols. 

time complexity (of an algorithm): a function representing the number of operations 
or the running time needed, using the size of the input as its argument. 

total function: a partial function defined on all of its domain, i.e., a function. 

tractable problem: a problem that can be solved by an algorithm with polynomial- 
time complexity. 

$\lambda$-transition (in an NDFSR): a transition that could occur without reading any symbols
of the input string.

transition table (for an FSR): a table whose rows are indexed by the states and 
whose columns are indexed by the symbols, such that the entry in row r and column 
c is the state to which the FSR moves if it reads symbol c while in state r. 

trapping state (of a finite automaton): a non-accepting state $q$ from which every
outward arc is a self-loop back into $q$.

trio: a nontrivial family of languages closed under $\lambda$-free morphisms, inverse morphisms,
and intersection with regular languages.

Turing-acceptable language: a language for which there is a TM $M$ that accepts it.

Turing-computable function: a function $f$ such that there is a TM $M$ with $f = \varphi_M$.

Turing-decidable language: a language whose characteristic function is Turing- 
computable. 

Turing machine (TM): an automaton whose tape can move one character in either 
direction and that can replace the symbol it reads by a different symbol. 

Turing-p-reducibility (of language $A$ to language $B$): the existence of a deterministic
oracle TM $M^B$ that decides language $A$ in polynomial time. Notation: $A \le_T^P B$.

Turing’s test (of whether a given computer can think): are its responses to written 
questions distinguishable from human responses by a person who does not know 
whether a computer or a person gave the response? 

type 0 grammar: a grammar with no restrictions. 

type 1 grammar: a length-increasing grammar, or equivalently, a context-sensitive 
grammar. 

type 2 grammar: a context-free grammar. 

type 3 grammar: a regular grammar. 

unambiguous context-free language: a context-free language $L$ that has a context-free
grammar that is not ambiguous.

unbounded minimalization: a way of using a function or partial function to define 
a new function or partial function. 

uncomputable function: a function whose values cannot be calculated by a Turing 
machine (or by a computer program) . 

undecidable problem : a decision problem whose answers cannot be given by a Turing 
machine (or by a computer program) . 
universal Turing machine: a TM that can simulate every other TM.

variable (over an alphabet V): a symbol not in V whose values range over V*. 

word (over an alphabet): usually a finite sequence of symbols (same as string ), some- 
times a countably infinite sequence. 

word equation (over an alphabet $V$): an expression $\alpha = \beta$, such that $\alpha$ and $\beta$ are
words containing letters from $V$ and some variables over $V$.

word inequality: the negation of a word equation, commonly written as $\alpha \ne \beta$.

worst-case running time: the maximum number of execution steps of an algorithm,
usually expressed in big-O asymptotic notation (or sometimes $\Theta$-notation) as a formula
based on the input size variables.


16.1 COMPUTATIONAL MODELS 

The objectives of a computer, no matter what special input/output or memory devices 
are attached, are ultimately to make logical decisions and to calculate the values of a 
function. A decision problem can be represented as recognizing whether an input string 
is in a specified subset. Calculating a function amounts to accepting an input string 
and producing an output string. At this level, the fundamental models in
Table 1 can serve as the theoretical basis for all sequential computers.


16.1.1 FINITE STATE MACHINES 
Definitions: 

A (deterministic) finite-state recognizer (often abbreviated FSR) models a computer
for decision-making as a 5-tuple $M = (K, s, F, \Sigma, \delta)$ such that:

• $K$ is a finite set whose members are called the states;

• $s \in K$ ($s$ is called the starting state);

• $F \subseteq K$ (each member of $F$ is called an acceptance state);

• $\Sigma$ is a finite set called the alphabet of symbols;

• $\delta: K \times \Sigma \to K$ ($\delta$ is called the transition function).

The computer model for a finite-state recognizer $M = (K, s, F, \Sigma, \delta)$ consists of a
logic box, programmed by the transition function $\delta$. It is equipped with a read-only
head that examines an input tape that moves in only one direction. Whenever it reads
symbol $c$ on the input tape while in state $q$, the computer switches into state $\delta(q, c)$
and moves on to read the next symbol. The string is considered to be accepted if the
automaton is in an acceptance state after the last transition.


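[Figure: computer model for a finite-state recognizer, showing an input tape of symbols x and y, the reading head, and the logic box with its states.]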
Table 1 Fundamental computational models.

model: FSR $= (K, s, F, \Sigma, \delta)$
    $K$ = set of states, start at $s \in K$
    accepting states $F \subseteq K$
    input alphabet $\Sigma$
    transition fn $\delta: K \times \Sigma \to K$
description: Finite state recognizer: scans a tape once; decides whether to accept.
comment: recognizes regular languages

model: NDFSR $= (K, s, F, \Sigma, \Delta)$
    $K, s, F, \Sigma$ like FSR
    trans. relation $\Delta: K \times \Sigma^* \to 2^K$
description: nondeterministic FSR.
comment: equivalent to FSR

model: FSM $= (K, s, \Sigma_I, \Sigma_O, \delta, \tau)$
    $K$ = set of states, start at $s \in K$
    in-alphabet $\Sigma_I$, out-alphabet $\Sigma_O$
    transition fn $\delta: K \times \Sigma_I \to K$
    output fn $\tau: K \times \Sigma_I \to \Sigma_O^*$
description: Finite state transducer (also called “finite state machine with output”).
comment: (none)

model: Mealy $= (K, s, \Sigma_I, \Sigma_O, \delta, \tau)$
    $K, s, \Sigma_I, \Sigma_O, \delta$ like FSM
    output fn $\tau: K \times \Sigma_I \to \Sigma_O$
description: Mealy machine: writes a single output symbol for each input symbol.
comment: equivalent to FSM

model: Moore $= (K, s, \Sigma_I, \Sigma_O, \delta, \tau)$
    $K, s, \Sigma_I, \Sigma_O, \delta$ like FSM
    output function $\tau: K \to \Sigma_O$
description: Moore machine: output symbol depends only on state prior to transition.
comment: equivalent to FSM

model: PDA $= (K, s, F, \Sigma, \Gamma, \Delta)$
    $K, s, F, \Sigma$ like FSR, stack alphabet $\Gamma$
    $\Delta \subseteq (K \times \Sigma^* \times \Gamma^*) \times (K \times \Gamma^*)$
    transition relation $\Delta$ is a finite set
description: Pushdown automaton: uses a stack as a computational resource, nondeterministic.
comment: recognizes context-free languages

model: TM $= (K, s, h, \Sigma, \delta)$
    $K$ = set of states, start at $s \in K$
    halting state $h \notin K$, alphabet $\Sigma$
    $\delta: K \times \Sigma \to K \times \Sigma \times \{L, R\} \cup \{h\}$
description: Turing machine: has two-way tape with rewritable symbols.
comment: decides membership in recursive sets

A state diagram for an FSR $M = (K, s, F, \Sigma, \delta)$ is a labeled digraph whose vertex
set is $K$, and such that for each state $q \in K$ and each symbol $c \in \Sigma$ there is an arc
from vertex $q$ to vertex $\delta(q, c)$, labeled with the symbol $c$. Sometimes a single arc is
labeled with more than one symbol, instead of drawing two arcs from the same state
to the same state. The starting state is designated by an entering arrow “→” and the
accepting states are indicated by a double circle.

A transition table for an FSR $M = (K, s, F, \Sigma, \delta)$ is a table whose rows are indexed
by the states in $K$ and whose columns are indexed by the symbols in $\Sigma$, such that the
entry in row $q$ and column $c$ is $\delta(q, c)$. The starting-state row label is marked with a
“>” and the acceptance state row labels are underscored.

A configuration for an FSR $M = (K, s, F, \Sigma, \delta)$ is a pair $(q, w)$ such that $q \in K$ and
$w \in \Sigma^*$. The pair $(q, w)$ signifies that the automaton is in state $q$ with the read-only
head positioned at the initial character of the string $w$. (Since the read-only head moves
in only one direction, a common assumption is that it consumes each character that it
reads.)

The FSR configuration $(q, w)$ yields the configuration $(q', w')$ in one step if deleting
the initial symbol, call it $c$, of the string $w$ yields the string $w'$ and if $\delta(q, c) = q'$. This
relationship between configurations is denoted by $(q, w) \vdash_M (q', w')$.
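
The one-step relation on configurations can be sketched directly from a transition table. The code below is a minimal illustration under stated assumptions: the transition function is stored as a Python dictionary keyed by (state, symbol) pairs, and the two-state machine shown (accepting strings over {x, y} with an even number of y's) is a hypothetical example, not one taken from the text.

```python
def step(delta, config):
    """One-step 'yields' relation: (q, c w') yields (delta(q, c), w')."""
    q, w = config
    return (delta[(q, w[0])], w[1:])

def accepts(delta, start, accepting, w):
    """Run the FSR from configuration (start, w); accept iff it ends in F."""
    config = (start, w)
    while config[1]:                 # input not yet consumed
        config = step(delta, config)
    return config[0] in accepting

# Hypothetical FSR over {x, y}: accepts strings with an even number of y's.
delta = {
    ("q0", "x"): "q0", ("q0", "y"): "q1",
    ("q1", "x"): "q1", ("q1", "y"): "q0",
}
print(step(delta, ("q0", "yxy")))            # ('q1', 'xy')
print(accepts(delta, "q0", {"q0"}, "yxy"))   # True
```

Each loop iteration consumes exactly one input symbol, so the run on a string of length $n$ takes $n$ transitions.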

A nondeterministic finite-state recognizer (often abbreviated NDFSR) is a 5-tuple
$M = (K, s, F, \Sigma, \Delta)$ just like an FSR, except that $\Delta$ is a finite subset of $K \times \Sigma^* \times K$
and is called the transition relation.

The computer model for an NDFSR $M = (K, s, F, \Sigma, \Delta)$ is like the computer model
for an FSR. However, whenever it reads string $u$ on the input tape while in state $q$,
the computer switches into any of the states in the set $\Delta(q, u)$ and moves on to read
the next symbol.

A state diagram for an NDFSR $M = (K, s, F, \Sigma, \Delta)$ is a labeled digraph whose vertex
set is $K$, and such that for each triple $(q, u, p) \in \Delta$ there is an arc from vertex $q$ to
vertex $p$, labeled with the string $u$, which may be the empty string $\lambda$.

A transition of an NDFSR $M = (K, s, F, \Sigma, \Delta)$ is a triple $(q, u, p) \in \Delta$. The idea
is that from state $q$, the NDFSR $M$ may read the substring $u$, and then transfer into
state $p$.

A $\lambda$-transition in an NDFSR $M = (K, s, F, \Sigma, \Delta)$ is a transition $(q, \lambda, q') \in \Delta$ that
could occur without reading any symbols off the input string. That is, it reads the
empty string $\lambda$.

A configuration for an NDFSR $M = (K, s, F, \Sigma, \Delta)$ is a pair $(q, w)$ such that $q \in K$
and $w \in \Sigma^*$.

The NDFSR configuration $(q, w)$ yields the configuration $(q', w')$ in one step if
there is an initial prefix $u$ on the string $w$ whose deletion yields the string $w'$, and if
$(q, u, q') \in \Delta$. Notation: $(q, w) \vdash_M (q', w')$.

A finite automaton is either an FSR or an NDFSR. 

A computation for a finite automaton $M$ is a sequence of configurations $(q_0, w_0)$,
$(q_1, w_1), \dots, (q_n, w_n)$ such that $(q_{i-1}, w_{i-1}) \vdash_M (q_i, w_i)$, for $i = 1, \dots, n$. This is called
a computation of $(q_n, w_n)$ from $(q_0, w_0)$.

For any finite automaton, the configuration $(q, w)$ yields the configuration $(q', w')$,
denoted $(q, w) \vdash_M^* (q', w')$, if there is a computation of $(q', w')$ from $(q, w)$.

A string $w \in \Sigma^*$ is accepted by an FSR or NDFSR $M = (K, s, F, \Sigma, \delta \text{ or } \Delta)$ if there
is an accepting state $q \in F$ such that $(s, w) \vdash_M^* (q, \lambda)$. That is, machine $M$ accepts
string $w$ if, starting in state $s$ at the first symbol, its transition sequence ultimately
leads to an accepting state, immediately after its last transition.
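
For an NDFSR, acceptance asks whether some computation from $(s, w)$ reaches a configuration $(q, \lambda)$ with $q \in F$. One way to sketch this is a search over reachable configurations. The code below is an illustration under stated assumptions: the transition relation is a Python set of triples (state, string, state), the empty Python string stands for $\lambda$, and the sample machine is hypothetical.

```python
def nd_accepts(transitions, start, accepting, w):
    """Search configurations reachable from (start, w) for (q, '') with q in F."""
    seen = set()
    stack = [(start, w)]
    while stack:
        q, rest = stack.pop()
        if (q, rest) in seen:        # avoids looping on lambda-transitions
            continue
        seen.add((q, rest))
        if rest == "" and q in accepting:
            return True
        for (p, u, p2) in transitions:
            if p == q and rest.startswith(u):
                stack.append((p2, rest[len(u):]))
    return False

# Hypothetical NDFSR over {a, b} accepting the strings that end in "ab".
transitions = {
    ("q0", "a", "q0"), ("q0", "b", "q0"),
    ("q0", "ab", "q1"),              # reads the two-symbol string "ab"
    ("q1", "", "q2"),                # a lambda-transition
}
print(nd_accepts(transitions, "q0", {"q2"}, "bab"))   # True
print(nd_accepts(transitions, "q0", {"q2"}, "aba"))   # False
```

The search terminates because every reachable configuration pairs a state with a suffix of $w$, so only finitely many configurations exist.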

The language accepted by a finite automaton $M$ is the set of all strings accepted
by $M$. It is denoted $L(M)$.

Finite automata $M_1$ and $M_2$ are equivalent if $L(M_1) = L(M_2)$, that is, if they accept
exactly the same language.

A trapping state of a finite automaton M is a non-accepting state q from which every 
outward arc is a self- loop back into q. 

An implicit trapping state is a convention used to simplify state diagrams. If from 
some state there is no exiting arc labeled with a particular symbol, then that combina- 
tion is deemed to lead to the implicit trapping state. 

A trapping group of a finite automaton M is a set of non-accepting states from which
there is no directed path in the state diagram to an accepting state.

A (deterministic) finite transducer models a function-calculating computer as a
6-tuple M = (K, s, Σ_I, Σ_O, δ, τ) such that:

• K is a finite set (its members are called states);

• s ∈ K (s is the starting state);

• Σ_I is a finite alphabet of input symbols;

• Σ_O is a finite alphabet of output symbols;

• δ: K × Σ_I → K (δ is called the transition function);

• τ: K × Σ_I → Σ_O* (τ is called the output function).

A finite-state machine with output is another name for a finite transducer. 

A Mealy machine is a finite transducer whose output function always produces a 
single symbol. 

A Moore machine is a Mealy machine such that for every state k and every pair of
input symbols s_1 and s_2, the outputs τ(k, s_1) and τ(k, s_2) are the same.

A finite-state machine is a finite automaton or a finite transducer. 

Facts: 

1. Finite state machines are the design plan of many practical types of electronic control 
devices, for instance in wristwatches or automobiles. 

2. Terminological usage has evolved over several decades. The following table provides
a quick guide to current usage regarding output capacity:

terminology      output capacity
recognizer       none
Mealy machine    one output symbol for each input symbol
transducer       arbitrary output string for each input symbol


The phrase “finite state machine” refers to a finite state model that may or may not 
have output capacity and that may or may not be nondeterministic. 

3. The nondeterminism of an NDFSR is that possibly u = λ, or that besides a transition
(q, u, p) there might also be a transition (q, u, p'), so that from the same state q the
NDFSR M might read substring u and transfer either into state p or into state p'.

4. For every NDFSR, there is an equivalent FSR. (M. Rabin and D. Scott, 1959)

5. In software design, NDFSRs are commonly used in preference to deterministic FSRs
because they often achieve the same task with fewer states.

6. NDFSRs are often defined so that λ-transitions are the only possible instances of
non-determinism. In this seemingly more restrictive kind of NDFSR, the second
component of a transition (q, u, p) is either a single symbol or the empty string.

7. The class of languages accepted by finite automata is closed under all of the following
operations: 

• union; 

• concatenation; 

• Kleene star (see §16.3.2); 

• complementation; 

• intersection.

8. Kleene's theorem: A language is regular (§16.3.4) if and only if it is the language
accepted by some finite automaton. (S. Kleene, 1956)

9. Some lexical scanning processes of compilers are modeled after finite automata.

10. The relation “yields” for finite automata is the reflexive, transitive closure of the
relation ⊢_M.

11. Both Moore machines and Mealy machines have the computational capability of 
an unrestricted finite transducer. 

12. More comprehensive coverage of finite state machines is provided by many text- 
books, including [Gr97] and [LePa81]. 

Examples: 

1. The FSR specified by the following transition table and state diagram decides
whether a binary string has an even number of 1s. Formally, M = (K, s, F, Σ, δ) with
K = {Even, Odd}, s = Even, F = {Even}, and Σ = {0, 1}.
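The transition table and state diagram are not reproduced here, so the following Python sketch gives one natural reading of this FSR. It assumes that reading a 1 toggles between Even and Odd and that reading a 0 leaves the state unchanged; the names DELTA and accepts are illustrative and do not come from the handbook.

```python
# Assumed transition function δ of the parity FSR, written as a lookup table:
# a 1 toggles the state, a 0 leaves it unchanged.
DELTA = {
    ("Even", "0"): "Even", ("Even", "1"): "Odd",
    ("Odd", "0"): "Odd",   ("Odd", "1"): "Even",
}
START = "Even"
ACCEPTING = {"Even"}


def accepts(w: str) -> bool:
    """Run the FSR on the binary string w; True if it ends in an accepting state."""
    state = START
    for symbol in w:
        state = DELTA[(state, symbol)]  # one step of the relation ⊢_M
    return state in ACCEPTING


if __name__ == "__main__":
    for w in ["", "0", "1", "1010", "111"]:
        print(repr(w), accepts(w))
```

Because the machine is deterministic, a single current state is enough to trace the whole computation.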



2. In one early form of the language BASIC, an identifier could be a letter or a letter 
followed by a digit. The following state diagram specifies an FSR that accepts this 
restricted form of BASIC identifier. 



3. A “proper fixed-point numeral” is a nonempty string of decimal digits (the “whole
part”), followed by a decimal point, and then another nonempty string of digits (the
“fractional part”). For instance, the number zero would be represented as “0.0”. The
following FSR decides whether the input string is a proper fixed-point numeral.

[State diagram not reproduced; surviving arc labels: 0,...,9; 0,...,9; anything.]



4. An “integer” in some programming languages is a nonempty string of decimal digits,
possibly preceded by a sign + or −. The following NDFSR decides whether the input
string is an integer.
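Since the state diagram is not reproduced, here is a hedged Python sketch of one NDFSR for this language. The states q0, q1, q2 and the use of a λ-transition for the optional sign are assumptions made for illustration. The simulation tracks the set of states the machine could be in, which is also the idea behind the Rabin-Scott construction of an equivalent FSR.

```python
DIGITS = "0123456789"

# Assumed transition relation Δ as triples (q, u, p); u == "" is a λ-transition.
DELTA = {("q0", "+", "q1"), ("q0", "-", "q1"), ("q0", "", "q1")}
DELTA |= {("q1", d, "q2") for d in DIGITS}
DELTA |= {("q2", d, "q2") for d in DIGITS}
START, ACCEPTING = "q0", {"q2"}


def closure(states):
    """All states reachable from `states` using λ-transitions only."""
    result, frontier = set(states), list(states)
    while frontier:
        q = frontier.pop()
        for (a, u, p) in DELTA:
            if a == q and u == "" and p not in result:
                result.add(p)
                frontier.append(p)
    return result


def accepts(w: str) -> bool:
    """True if some computation of the NDFSR reads all of w and ends in F."""
    current = closure({START})
    for symbol in w:
        moved = {p for (q, u, p) in DELTA if q in current and u == symbol}
        current = closure(moved)
    return bool(current & ACCEPTING)


if __name__ == "__main__":
    for w in ["42", "+7", "-0", "", "+", "4+2"]:
        print(repr(w), accepts(w))
```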









5. The following finite-state transducer has {0, 1, $} and {0, 1} for its input and output 
alphabets, respectively, where “$” serves as an end-of-string marker. It reads a binary 
numeral, starting at the units digit, and prints a binary numeral whose value is double 
the input numeral. 
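The diagram for this transducer is not reproduced, so the sketch below shows one possible Mealy-style realization rather than the handbook's own machine. It assumes the state simply remembers the bit to print next: doubling a binary numeral read from the units digit amounts to printing a 0 first and then echoing the input one step late. The function name double_lsb_first is hypothetical.

```python
def double_lsb_first(numeral: str) -> str:
    """Read a '$'-terminated binary numeral, units digit first, and return
    the output string (also units digit first) of the doubled numeral."""
    pending = "0"            # state: the bit to be printed on the next step
    output = []
    for symbol in numeral:
        if symbol not in "01$":
            raise ValueError(f"symbol {symbol!r} is not in the input alphabet")
        output.append(pending)   # output function: print the remembered bit
        if symbol == "$":
            break                # end marker: the last pending bit is flushed
        pending = symbol         # transition function: remember the bit read
    return "".join(output)


if __name__ == "__main__":
    out = double_lsb_first("101$")        # 5, written units digit first
    print(out, int(out[::-1], 2))         # reversed to read as a usual numeral
```

Each input symbol produces exactly one output symbol, so this particular sketch is a Mealy machine in the sense defined above.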


6. The following finite state machine models a vending machine for a 20-cent local
newspaper. The possible inputs are a nickel, a dime, and a push of a button that releases 
the newspaper if enough change has been deposited. The states indicate the amount of 
money that has been deposited. This machine may be regarded as a transducer that 
produces symbol N (newspaper) if it receives input B while in state 20. 




7. NDFSRs can be used to model various kinds of solitaire games and puzzles. For 
instance, making a complete knight’s tour of a chessboard is such a puzzle. At each 
stage, there may be some moves that ultimately permit a complete tour and some other 
moves that are traps. 


16.1.2 PUSHDOWN AUTOMATA 
Definitions: 

A pushdown automaton (PDA) is essentially a (possibly non-deterministic) finite- 
state machine equipped with an auxiliary stack. A pushdown automaton is given by a 
6-tuple M = (K, s, F, Σ, Γ, Δ) such that:

• K is a finite set (its members are called states);

• s ∈ K (s is called the starting state);

• F ⊆ K (each member of F is called an acceptance state);

• Σ is a finite set called the alphabet of input symbols;

• Γ is a finite set called the alphabet of stack symbols;

• Δ is a finite subset of (K × Σ* × Γ*) × (K × Γ*) (Δ is the transition relation).

A transition of a PDA M = (K, s, F, Σ, Γ, Δ) is a pair ((p, u, β), (q, γ)) ∈ Δ. The idea
is that from state p, the PDA M may read the substring u and the stack substring β,
and transfer into state q while popping β and pushing γ, thereby replacing β by γ.
Note: A PDA is frequently defined so that the only strings that can be read or written
or pushed or popped are single characters and the empty string. This has no effect on
the computational generality, but it can lead to the need for more states to accomplish
a given task.

The computer model for a PDA M = (K, s, F, Σ, Γ, Δ) consists of a logic box,
programmed by Δ, equipped with a read-only head that examines an input tape that moves
in only one direction, and also equipped with a stack. When it reads substring u from
input while in state p with substring β at the top of the stack, the computer selects a
corresponding entry from Δ and makes the indicated transition.





An input string is accepted by a PDA if the stack is empty and the PDA is in an 
acceptance state after the last transition. 

A configuration for a PDA M = (K, s, F, Σ, Γ, Δ) is a triple (p, u, β) such that p ∈ K,
u ∈ Σ*, and β ∈ Γ*.

The PDA configuration (p, ux, βα) yields the configuration (q, x, γα) in one step
if and only if there is a transition ((p, u, β), (q, γ)) ∈ Δ. This relationship between
configurations is denoted by (p, ux, βα) ⊢_M (q, x, γα).

A computation for a PDA M is a sequence of configurations C_0, C_1, ..., C_n such that
C_{i-1} ⊢_M C_i, for i = 1, ..., n.

The PDA configuration C yields the configuration C', denoted C ⊢*_M C', if there is a
computation of C' from C.

A string w ∈ Σ* is accepted by a PDA M = (K, s, F, Σ, Γ, Δ) if there is an accepting
state q ∈ F such that (s, w, λ) ⊢*_M (q, λ, λ). That is, machine M accepts string w if,
starting in state s at the first symbol, its transition sequence can ultimately lead to an
accepting state and an empty stack after it has read the last symbol.

The language accepted by a PDA M is the set of all strings accepted by M. It is
denoted L(M).

A state diagram for a PDA M = (K, s, F, Σ, Γ, Δ) is a labeled digraph whose vertex
set is K, and such that for each transition ((p, u, β), (q, γ)) ∈ Δ there is an arc from
vertex p to vertex q, labeled (u, β) ↦ γ. Sometimes a single arc is labeled with more
than one symbol, instead of drawing two arcs from the same state to the same state.
The starting state is usually designated by an entering arrow, and the accepting states
are usually indicated by a double circle.


Facts: 

1. The PDA model was invented by A. G. Oettinger in 1961. 

2. A language L is context-free (see §16.3.3) if and only if there is a pushdown
automaton M such that L is the language accepted by M. (M. Schützenberger 1963, and
independently by N. Chomsky and by J. Evey)

3. A PDA can test whether a string is a palindrome or whether all the left and right
parentheses are matched, but an FSR cannot.

4. The class of languages accepted by deterministic PDAs is strictly smaller than the
class accepted by non-deterministic PDAs.

5. More comprehensive coverage of pushdown automata is provided by many textbooks, 
including [Gr97] and [LePa81]. 

Examples: 

1. The following PDA decides whether a sequence of left and right parentheses is
well-nested, in the sense that every left parenthesis is uniquely matched to a right
parenthesis, and vice versa. It is necessary and sufficient that in counting left and right
parentheses while reading from left to right, the number of right parentheses never
exceeds the number of left parentheses and that the total counts are the same.
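As the state diagram is not reproduced, the following Python sketch shows one way such a PDA could behave. It assumes a single state s with two transitions, ((s, "(", λ), (s, "(")) to push and ((s, ")", "("), (s, λ)) to pop; these transitions and the name well_nested are illustrative assumptions.

```python
def well_nested(w: str) -> bool:
    """Simulate a one-state PDA that accepts well-nested parenthesis strings."""
    stack = []
    for symbol in w:
        if symbol == "(":
            stack.append("(")      # ((s, "(", λ), (s, "(")): push a marker
        elif symbol == ")":
            if not stack:
                return False       # no transition applies: more ")" than "("
            stack.pop()            # ((s, ")", "("), (s, λ)): pop a marker
        else:
            raise ValueError(f"symbol {symbol!r} is not a parenthesis")
    return not stack               # accept only with an empty stack


if __name__ == "__main__":
    for w in ["", "()", "(()())", ")(", "(()"]:
        print(repr(w), well_nested(w))
```

The stack height plays the role of the running count of unmatched left parentheses described in the example.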



2. The following PDA decides whether a string in the alphabet {0, 1, m} has the
form bmb^r, where b is a bitstring and b^r is its reverse, i.e., the same string written
backwards. The m in the middle signals when to switch from pushing symbols onto the
stack to popping them off.

[State diagram not reproduced. Arc labels: 0,λ → 0; 1,λ → 1; m,λ → λ; 0,0 → λ; 1,1 → λ.]

3. The following non-deterministic PDA decides whether a binary string is of the
form bb^r, that is, a bitstring followed by its reverse. In effect, it considers every
character interspace in the string as the possible middle.

[State diagram not reproduced. Arc labels: 0,λ → 0; 1,λ → 1; λ,λ → λ; 0,0 → λ; 1,1 → λ.]

16.1.3 TURING MACHINES 
Definitions: 

A Turing machine (TM) models a computer as a 5-tuple M = (K, s, h, Σ, δ) such
that:

• K is a finite set not containing h (elements of K are called states; h is called the
halting state);

• s ∈ K (s is called the starting state);

• Σ is a finite set of symbols, including the blank symbol # (Σ is called the
alphabet);

• δ: K × Σ → (K × Σ × {L, R}) ∪ {h} (δ is called a transition function).

An m × n Turing machine is a Turing machine with m states and n symbols.

The computer model for a Turing machine M = (K, s, h, Σ, δ) consists of a logic box,
programmed by δ, equipped with a read-write head that examines an input tape with
a left end, but no right end. To start a computation, the input string is written at the
left end of the tape, and the rest of the tape is filled with blanks. The read-write head
starts at the leftmost symbol. Whenever the Turing machine reads symbol b on the
input string while in state q, its internal logic produces the triple δ(q, b) = (p, c, D):
the computer switches into state p, replaces b by c, and moves one space in direction D,
that is, either to the left (L) or to the right (R), whereupon it is ready to read the next
symbol.



A transition table for an m × n Turing machine is an m × n table whose rows are
labeled with the states, whose columns are labeled with the symbols, such that the
entry in row q and column b is δ(q, b). Thus, a typical table entry is a triple indicating
to which state to switch from q, the new symbol to replace b, and whether to move one
square to the right or one to the left. However, another possibility is that a table entry
could be the halt state h.

In an alternative definition of a Turing machine, each transition is a change of 
state and either a change of symbol or a one-symbol move to the right or left. In this 
case the transition table entries are pairs, and it tends to take more states and symbols 
to achieve a given objective. 

A configuration for a Turing machine is a quadruple (q, u, b, v), such that q ∈ K ∪ {h},
u, v ∈ Σ*, and b ∈ Σ, commonly written as a pair (q, ubv). This means that the Turing
machine is in state q, that the present value of the tape is ubv, that the present location
of the read-write head is at the indicated instance of the symbol b, and that the rest of
the tape to the right of the string ubv is filled with blanks.

A starting configuration for a Turing machine is a configuration of the form (s, λbv).
This means that the string bv is supplied to the given Turing machine as input, in the
starting state s.

A halting configuration for a Turing machine is a configuration of the form (h, ubv).
This means that the Turing machine has entered the halting state h, and that whatever
is on the tape is to be interpreted as the output.

A hanging configuration for a Turing machine is a configuration of the form (q, λbv),
such that the transition value δ(q, b) tells the Turing machine to move left (L), i.e., off
the left end of the tape.

The Turing machine configuration (p, ubv) yields the configuration (q, xcy) in one
step if and only if the transition δ(p, b) would change configuration (p, ubv) to
configuration (q, xcy). This relationship between configurations is denoted by
(p, ubv) ⊢_M (q, xcy).

An infinite loop for a Turing machine is an infinite sequence of configurations C_0, C_1,
C_2, ... such that C_{i-1} ⊢_M C_i, for i = 1, 2, ....

The M-computation for input v to a Turing machine M is one of three possibilities: (1)
the finite sequence of configurations C_0 = (s, λ#v), C_1, ..., C_n such that C_{i-1} ⊢_M C_i,
for i = 1, ..., n, in which C_n is a halting configuration; (2) the finite sequence of
configurations C_0 = (s, λ#v), C_1, ..., C_n such that C_{i-1} ⊢_M C_i, for i = 1, ..., n, in
which C_n is a hanging configuration; (3) the infinite sequence C_0 = (s, λ#v), C_1, C_2, ...
such that C_{i-1} ⊢_M C_i, for i = 1, 2, ....

The output of an M-computation for input v to a Turing machine M is the string φ_M(v)
from the left end of the tape up to the last non-blank character if the M-computation
halts, and undefined otherwise.

The partial function φ_M induced by a Turing machine M is the rule that
associates to each input v for which the M-computation halts the output φ_M(v), and is
otherwise undefined.

A function f: Σ* → Σ* is M-computable by a Turing machine M = (K, s, h, Σ, δ) if the
machine M halts for all inputs v ∈ Σ* and if the machine M computes the function f,
that is, f(v) = φ_M(v) for all v ∈ Σ*.

A function f: Σ* → Σ* is Turing-computable if there is a Turing machine M =
(K, s, h, Σ, δ) such that f is M-computable.

A Turing machine M = (K, s, h, Σ, δ) simulates another Turing machine M' = (K', s',
h, Σ', δ') if there exists a Turing-computable function β: Σ'* → Σ* such that φ_M(β(w)) =
φ_M'(w) for all w ∈ domain(φ_M') and φ_M(β(w)) is undefined for all w ∉ domain(φ_M').

A universal Turing machine is a Turing machine U = (K_U, s_U, h_U, Σ_U, δ_U) that can
simulate every other Turing machine, in the following sense. There is a rule α_U for
encoding any given Turing machine M and a rule β_U for encoding any given input w
to M, such that φ_U(α_U(M)#β_U(w)) is defined and equals φ_M(w) whenever φ_M(w) is
defined, and is undefined otherwise.


A string w ∈ Σ* is accepted by the Turing machine M = (K, s, h, Σ, δ) if M halts
on input w.

A language L ⊆ Σ* is accepted by the Turing machine M = (K, s, h, Σ, δ) if
L = { w ∈ Σ* | M accepts w }.

A language L ⊆ Σ* is Turing-acceptable if there exists a Turing machine M that
accepts it.

The characteristic function χ_L: Σ* → {yes, no} of a language L ⊆ Σ* is given by the
rule

χ_L(w) = yes if w ∈ L;
χ_L(w) = no if w ∉ L.

A language L ⊆ Σ* is a Turing-decidable language if its characteristic function is
Turing-computable.

A subset-membership decision problem is unsolvable if it does not correspond to a 
Turing-decidable language. 

An n-state busy beaver machine is an n-state Turing machine on the alphabet
Σ = {#, 1} that accepts a two-way infinite input tape filled with #s and halts after
placing a maximum number of 1s on the tape. (The name busy beaver derives from an
analogy between the machine piling up 1s and a beaver piling up logs.)

The busy beaver function BB(n) has as its value the number of 1s on the output
tape of an n-state busy beaver machine.

A linear bounded automaton (or LBA) is representable as a Turing machine that is 
fed only a finite stretch of tape containing the input word, rather than an infinite tape. 

A nondeterministic Turing machine is defined like a Turing machine, except that 
instead of a transition function that assigns a unique change of symbol and direction 
of motion for the read-write head, there is a transition relation that may permit more 
than one possibility. 


Facts: 

1. A Turing machine is commonly regarded as a program to compute the partial
function φ_M.

2. Every Turing-decidable language is Turing-acceptable. 

3. If a language L ⊆ Σ* is Turing-decidable, then its complement Σ* − L is also
Turing-decidable.

4. Every Turing-acceptable language L ⊆ Σ* whose complement Σ* − L is also
Turing-acceptable is a Turing-decidable language.

5. The following problems about Turing machines are unsolvable:

(a) Given a TM M and an input string w, does M halt on input w?

(b) Given a TM M, does M halt on the empty tape?

(c) Given a TM M, does there exist an input w for which M halts?

(d) Given a TM M, does M halt on every input string?

(e) Given two TMs M_1 and M_2, do they accept the same inputs?

(f) Given two numbers n and k, is BB(n) > k?

6. In view of part (d) of the preceding fact, there is no way to tell whether an arbitrary 
computer program in a general language always halts, much less whether it calculates 
what it is supposed to calculate. 

7. It is possible to construct a universal Turing machine.

8. A universal Turing machine with six states and four symbols was constructed in 
1982 by Y. Rogozhin. 

9. The busy beaver problem was invented by Tibor Rado in 1962.

10. Turing machines have been extended in several ways, including the following:
tapes that are infinite in two directions, additional work tapes (two or more tapes),
tapes of two or more dimensions, and nondeterminism.

11. Some of the extensions of the Turing machine can perform computations more quickly
and are easier to program.

12. Any function that can be computed by a Turing machine with a two-way infinite
tape can also be computed by some standard Turing machine.

13. Any function that can be computed by a Turing machine with k tapes can also be
computed by some standard Turing machine.

14. Any function that can be computed by a Turing machine with a two-dimensional
tape can also be computed by some standard Turing machine.

15. Any function that can be computed by a nondeterministic Turing machine can also
be computed by some standard Turing machine.

16. The following table gives known values and lower bounds for the busy beaver
function:

n        1    2    3    4     5         6               8
BB(n)    1    4    6    13    ≥ 4098    ≥ 95,524,079    ≥ 10^44


17. The finite amount of workspace to which a linear bounded automaton is restricted 
causes it to be less powerful than a Turing machine with infinite tape. However, it is 
more powerful than a pushdown automaton. 

18. Alan M. Turing (1912-1954) was a British mathematician whose cryptanalytic work
during World War II led to the decryption of ciphertext from the German cipher
machine called the Enigma.

19. Turing proposed that a machine be regarded as “thinking” if its responses to
written questions could not be distinguished from those of a person. This criterion is
called Turing’s test.

20. More comprehensive coverage of Turing machines is provided by many textbooks,
including [Gr97] and [LePa81]. 


Examples: 

1. This is a 1-state Turing machine with alphabet Σ = {0, 1, #} that changes every
character preceding the first blank into a blank. It accepts any string over its alphabet.

         0      1      #
→ a     #aR    #aR    h

2. This 3-state Turing machine with alphabet Σ = {0, 1, #} does not change its input
tape at all. It halts when it reads the third ‘1’. Thus, it accepts any tape with
at least three 1s but accepts no other strings.

         0      1      #
→ a     0aR    1bR    #aR
  b     0bR    1cR    #bR
  c     0cR    h      #cR

3. This 3-state Turing machine with alphabet Σ = {1, #} adds two positive integers,
each represented as a string of 1s. For instance, the tape 111#11###··· becomes
11111###···.

         1      #
→ a     1aR    1bR
  b     1bR    #cL
  c     #cR    h

4. This 2-state Turing machine shows that BB(2) is at least 4.

         #      1
→ a     1bL    1bR
  b     1aR    h

5. This 3-state Turing machine shows that BB(3) is at least 6.

         #      1
→ a     1bR    1cL
  b     1cR    1h
  c     1aL    #bL
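The tables of Examples 3, 4, and 5 can be checked mechanically. The Python sketch below is a minimal simulator written for that purpose, under stated assumptions: the tape is modeled as a dictionary that is blank everywhere else (so it is two-way infinite, as the busy beaver definition requires, and hanging configurations are not modeled), an entry "h" halts without moving, and the "1h" entry of Example 5 is treated as a plain halt. The names run, ADD, BB2, and BB3 are illustrative.

```python
from collections import defaultdict

BLANK = "#"


def run(table, start, tape_input="", max_steps=10_000):
    """Run a Turing machine given as {(state, symbol): (write, state, move)},
    where an entry may instead be "h" (halt). Returns the tape contents
    between the leftmost and rightmost non-blank cells."""
    tape = defaultdict(lambda: BLANK)
    for i, symbol in enumerate(tape_input):
        tape[i] = symbol
    state, head = start, 0
    for _ in range(max_steps):
        action = table[(state, tape[head])]
        if action == "h":
            break
        write, state, move = action
        tape[head] = write
        head += 1 if move == "R" else -1
    else:
        raise RuntimeError("machine did not halt within max_steps")
    marked = [i for i, s in tape.items() if s != BLANK]
    if not marked:
        return ""
    return "".join(tape[i] for i in range(min(marked), max(marked) + 1))


# Example 3: unary addition.
ADD = {("a", "1"): ("1", "a", "R"), ("a", "#"): ("1", "b", "R"),
       ("b", "1"): ("1", "b", "R"), ("b", "#"): ("#", "c", "L"),
       ("c", "1"): ("#", "c", "R"), ("c", "#"): "h"}

# Example 4: the 2-state machine.
BB2 = {("a", "#"): ("1", "b", "L"), ("a", "1"): ("1", "b", "R"),
       ("b", "#"): ("1", "a", "R"), ("b", "1"): "h"}

# Example 5: the 3-state machine ("1h" treated as halt, an assumption).
BB3 = {("a", "#"): ("1", "b", "R"), ("a", "1"): ("1", "c", "L"),
       ("b", "#"): ("1", "c", "R"), ("b", "1"): "h",
       ("c", "#"): ("1", "a", "L"), ("c", "1"): ("#", "b", "L")}

if __name__ == "__main__":
    print(run(ADD, "a", "111#11"))
    print(run(BB2, "a").count("1"), run(BB3, "a").count("1"))
```

The max_steps cap is a practical safeguard only; by Fact 5 no general test can decide in advance whether a given machine halts.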
16.1.4 PARALLEL COMPUTATIONAL MODELS


Definitions: 

A parallel computation model permits more than one instruction to be executed 
simultaneously, instead of requiring that they be executed sequentially. 

An interconnection network models parallel computation as a digraph in which each 
vertex represents a processor. In each phase of the computational process, a processor 
communicates with its neighbors and makes a computation. 

An n-dimensional cellular automaton is an interconnection network in which there 
is a processor at each integer lattice point of n-dimensional Euclidean space, and each 
processor communicates with its immediate neighbors. 

A random access machine (RAM) has several arithmetic registers and an infinite
number of memory registers (often modeled as an infinite array), any of which can be
accessed immediately via its address (the index in the array).

A parallel random access machine (PRAM) models parallel computation as a
set of global memory registers {M_j | j = 1, 2, ...} and a set of processors {P_j |
j = 1, 2, ...}. Each processor P_j has access to an infinite sequence of local registers
{R_{j,k} | k = 1, 2, ...}.

In a PRAM, a register (global or local) may contain a single integer. It is local if
it can be accessed only by a single processor and global if it can be accessed by all
processors.

In a PRAM, a processor performs read and write instructions involving global memory
and other instructions involving only its local memory. All processors of a PRAM
perform the same program in perfect synchrony, so that at any given time all processors
that are not idle are performing their task under the same instruction of the program.

In a PRAM, the concurrent construct par[a ≤ j ≤ b] P_j : S_j means that each of the
processors P_j for a ≤ j ≤ b is performing the operation S_j.

In a PRAM, the read instruction READ(j) tells processor P_i to read the content of
global register M_j into local register R_{i,0}.

In a PRAM, the write instruction WRITE(j) tells processor P_i to write the content
of local register R_{i,0} into global register M_j.

In a PRAM, a computation starts when all the processors execute the first instruction.
It stops when processor P_1 halts. The contents of the global memory are regarded as
the output.

In a PRAM, a memory conflict occurs when more than one processor attempts con- 
currently to write into or read from the same global memory register. 

In an exclusive read exclusive write (EREW) PRAM model, concurrent reads 
from and concurrent writes to the same location are not allowed. 

In a concurrent read exclusive write (CREW) PRAM model, concurrent reads 
are allowed, but not concurrent writes to the same location. 

In a concurrent read concurrent write (CRCW) PRAM model, concurrent reads 
from and concurrent writes to the same location are both allowed. 

In a common PRAM (CRCW^com PRAM) model, concurrent writes to the same
location are permitted if all processors are trying to write the same data.

Facts:


1. Commercially available parallel computers often have an array of elements in which
a single broadcast instruction applying to every element is executed simultaneously for 
all the elements. 

2. Random access machines are commonly thought to be close theoretical models of 
commercially available sequential computers. 

3. The indexing of registers and processors of a PRAM may be over any finite or 
countably infinite set. 

4. PRAM programs are often described using high-level programming language con- 
structs for array-processing that are similar to sequential array processing, except that 
the PRAM array locations are processed in parallel. 

5. Whereas linear time is regarded as fast for sequential processing, a parallel algorithm
tends to be regarded as fast if it runs in O(lg n) time or less.


Examples: 

1. A parallel computer for sorting up to n items can be modeled as a row of processors
P_1, P_2, ..., P_n. Each processor P_j such that 2 ≤ j ≤ n − 1 is joined to and from its
immediate predecessor P_{j-1} and its immediate successor P_{j+1} by arcs in both directions,
as shown here.

P_1 ⇄ P_2 ⇄ ··· ⇄ P_{n-1} ⇄ P_n

On the first phase and on all subsequent odd-numbered phases, each processor pair
(P_{2j+1}, P_{2j+2}) compares items and swaps, if necessary, so that the smaller item ends up
in the lower-indexed processor. On the second phase and on all subsequent even-numbered
phases, each processor pair (P_{2j}, P_{2j+1}) compares items and swaps, if necessary,
so that the smaller item ends up in the lower-indexed processor. After n phases, the
items are completely sorted into ascending order.
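The phases described above can be traced with a short sequential simulation. In this Python sketch, list position k (0-based) stands for processor P_{k+1}, and the inner loop visits the disjoint processor pairs of one phase one after another, although in the model they would all act at the same time. The function name is an illustrative choice.

```python
def odd_even_transposition_sort(items):
    """Simulate the n-phase sorting network on a row of n processors."""
    a = list(items)
    n = len(a)
    for phase in range(1, n + 1):
        # Odd phases pair (P1,P2), (P3,P4), ...; even phases pair (P2,P3), ...
        first = 0 if phase % 2 == 1 else 1
        for left in range(first, n - 1, 2):   # pairs are disjoint, so they
            if a[left] > a[left + 1]:         # could run in parallel
                a[left], a[left + 1] = a[left + 1], a[left]
    return a


if __name__ == "__main__":
    print(odd_even_transposition_sort([5, 1, 4, 2, 3]))
```

Each phase costs one parallel comparison step, so the network sorts n items in n steps using n processors.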

2. EREW PRAM: Finding the maximum: Given n numbers, with n = 2^r, store the
numbers in global registers M_n, ..., M_{2n-1}. Then execute the following program:

for i = r − 1 downto 0

par[2^i ≤ j < 2^{i+1}] P_j : M_j := max{M_{2j}, M_{2j+1}}

next i

After r iterations of the loop body, the maximum appears in global register M_1.
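A sequential Python simulation of this program may help in following the register indices. It is only a sketch: each parallel step is modeled by first computing all new values from a snapshot of the registers and then writing them, and the list M (with M[0] unused) stands in for the global registers. The function name erew_maximum is hypothetical.

```python
def erew_maximum(values):
    """Simulate the EREW PRAM tournament for n = 2**r input numbers."""
    n = len(values)
    r = n.bit_length() - 1
    if n == 0 or 1 << r != n:
        raise ValueError("the number of inputs must be a power of two")
    M = [None] * (2 * n)        # global registers M[1..2n-1]; M[0] is unused
    M[n:2 * n] = values         # inputs go into M[n], ..., M[2n-1]
    for i in range(r - 1, -1, -1):
        # One parallel step: every P_j with 2**i <= j < 2**(i+1) reads its own
        # two registers M[2j], M[2j+1] and writes its own register M[j].
        new = {j: max(M[2 * j], M[2 * j + 1]) for j in range(2 ** i, 2 ** (i + 1))}
        for j, value in new.items():
            M[j] = value
    return M[1]


if __name__ == "__main__":
    print(erew_maximum([3, 9, 2, 7, 5, 1, 8, 4]))
```

No register is read or written by two processors in the same step, which is why the program fits the EREW restriction.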

3. CRCW^com PRAM: Finding the maximum: Given n numbers, store the numbers
in global registers M_1, ..., M_n. Use processors P_{i,j}, 1 ≤ i, j ≤ n. Then execute the
following program:

par[1 ≤ i, j ≤ n] P_{i,j} : M_{i+n} := 0

par[1 ≤ i, j ≤ n] P_{i,j} : if M_i < M_j then M_{i+n} := 1

{M_{i+n} = 0 if and only if M_i = max{M_1, ..., M_n}}

par[1 ≤ i, j ≤ n] P_{i,j} : if M_{i+n} = 0 then M_0 := M_i

This program is much faster than the EREW PRAM program, because all pairs are
compared simultaneously in a single parallel step.
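The three parallel steps can be mimicked sequentially, as in the Python sketch below. It assumes the register layout M_0 for the result, M_1..M_n for the data, and M_{n+1}..M_{2n} for the flags, which is how the program above reads. The loops over all pairs (i, j) replace the n^2 processors; the order of iteration does not matter because concurrent writes to one register always carry the same value. The function name crcw_maximum is hypothetical.

```python
def crcw_maximum(values):
    """Simulate the three-step common CRCW PRAM maximum program."""
    n = len(values)
    if n == 0:
        raise ValueError("at least one number is required")
    # M[0]: result, M[1..n]: data, M[n+1..2n]: flags.
    M = [None] + list(values) + [None] * n
    pairs = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)]
    for i, j in pairs:              # step 1: clear every flag
        M[i + n] = 0
    for i, j in pairs:              # step 2: flag M_i if something is larger
        if M[i] < M[j]:
            M[i + n] = 1
    for i, j in pairs:              # step 3: unflagged entries are maxima
        if M[i + n] == 0:
            M[0] = M[i]
    return M[0]


if __name__ == "__main__":
    print(crcw_maximum([3, 9, 2, 7]))
```

The constant number of steps is paid for with n^2 processors, compared with the r = lg n steps of the EREW program.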
4. Game of Life: The Game of Life, invented at Cambridge by John H. Conway, a
mathematician later at Princeton University, is played on an infinite checkerboard. The
neighbors of a square are the eight squares that touch it, including those four at its
corners. In the initial configuration C_0 of the game, some squares are regarded as alive
and all others as dead. Each configuration C_k gives birth to a new configuration C_{k+1},
according to the following rules:

• a live cell in configuration C_k remains alive if and only if it has exactly two or
three live neighbors;

• a dead cell in configuration C_k becomes alive if and only if it has exactly three
live neighbors.

The following sequence of configurations illustrates these rules: 




The Game of Life can be regarded as a cellular automaton in which the squares are the 
processors and each processor is joined to its eight neighbors. 
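The two rules translate directly into a one-step update. The Python sketch below represents a configuration as the set of coordinates of its live cells, which is an implementation choice made here so that the infinite board never has to be stored; the function name step is illustrative.

```python
from collections import Counter


def step(live):
    """Given the set of live cells (x, y) of C_k, return the live cells of C_{k+1}."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly three neighbors; survival on two or three.
    return {cell for cell, k in counts.items()
            if k == 3 or (k == 2 and cell in live)}


if __name__ == "__main__":
    row = {(0, 0), (1, 0), (2, 0)}      # three cells in a row
    print(sorted(step(row)))
    print(step(step(row)) == row)
```

Cells with no live neighbor never appear in the counter, which matches the rules: such a cell stays dead or dies.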

5. A configuration in the Game of Life has periodicity n if the sequence of
configurations to which it gives birth repeats every n configurations, and if n is the
smallest such number. Here are three periodic configurations.

[Figure not reproduced; surviving period labels: 1, 5.]


16.2 COMPUTABILITY 

The theory of computability is concerned with distinguishing what can be computed 
from what cannot. This is not a question of skill at performing calculations. The 
remarkable truth is that the impossibility of computing certain functions can be proved 
from the definition of what it means to compute a function. 


16.2.1 RECURSIVE FUNCTIONS AND CHURCH’S THESIS 

The implicit domain for the theory of computability is the set 𝒩 of natural numbers.
The encoding of problems concerned with arbitrary objects into terms of natural
numbers permits general application of this theory.

Definitions:

The n-place constant zero function is the function ζ^n(x_1, ..., x_n) = 0.

The successor function is the function σ(n) = n + 1.

The ith n-place projection function is the function π_i^n(x_1, ..., x_n) = x_i.

The (multivariate) composition of the n-place function f(x_1, ..., x_n) and the n
m-place functions g_1(x_1, ..., x_m), ..., g_n(x_1, ..., x_m) is the m-place function
h(x_1, ..., x_m) = f(g_1(x_1, ..., x_m), ..., g_n(x_1, ..., x_m)).

(Multivariate) primitive recursion uses a previously defined (n+2)-place function
f(x_1, ..., x_{n+2}) and a previously defined n-place function g(x_1, ..., x_n) to define the
following new (n+1)-place function:

h(x_1, ..., x_{n+1}) = g(x_1, ..., x_n) if x_{n+1} = 0;
h(x_1, ..., x_{n+1}) = f(x_1, ..., x_{n+1}, h(x_1, ..., x_n, x_{n+1} − 1)) otherwise.

Unbounded minimalization uses an (n+1)-place function f(x_1, ..., x_{n+1}) to define
the following new n-place function, which is denoted μ_m[f(x_1, ..., x_n, m) = 0]:

g(x_1, ..., x_n) = the least y such that f(x_1, ..., x_n, y) = 0, if it exists;
g(x_1, ..., x_n) = 0 otherwise.

An (n+1)-place function f(x_1, ..., x_{n+1}) is regular if for every n-tuple (x_1, ..., x_n)
there is a y ∈ 𝒩 such that f(x_1, ..., x_n, y) = 0.


The class 𝒫 of primitive recursive functions is the smallest class of functions that
contains:

• the constant zero functions ζ^n(x_1, ..., x_n) = 0, for all n ∈ 𝒩;

• the successor function σ(n) = n + 1;

• the projection functions π_i^n(x_1, ..., x_n) = x_i, for all n ∈ 𝒩 and 1 ≤ i ≤ n;

and is closed under multivariate composition and multivariate primitive recursion.

The class ℛ𝒯 of recursive functions is the smallest class of functions that contains:

• the constant zero functions ζ^n(x_1, ..., x_n) = 0, for all n ∈ 𝒩;

• the successor function σ(n) = n + 1;

• the projection functions π_i^n(x_1, ..., x_n) = x_i, for all n ∈ 𝒩 and 1 ≤ i ≤ n;

and is closed under multivariate composition, multivariate primitive recursion, and the
application of unbounded minimalization to regular functions.

A recursive function is a function in ℛ𝒯.

Church’s thesis, or the Church-Turing thesis, is the premise that recursive
functions and Turing machines are capable of representing every function that is
computable or partially computable.

A partial function on 𝒩 is a function whose values are possibly undefined for certain
natural numbers.

A partial function on 𝒩 is called total if it is defined on every natural number.

The class VIZ of partial recursive functions is the smallest class of partial functions 
that contains the constant zero functions £, the successor function a, and the projec- 
tion functions tt” , and is closed under multivariate composition, multivariate primitive 
recursion, and the arbitrary application of unbounded minimalization. 
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A partial recursive function is a function in VIZ. 

A partial recursive function / is represented by the Turing machine M if machine 
M calculates the value /(n) for every number n on which / is defined and fails to halt 
for every number on which / is undefined. 

The Ackermann function $A\colon \mathcal{N} \times \mathcal{N} \to \mathcal{N}$ is defined as follows:

$A(0, j) = j + 1$;

$A(i + 1, 0) = A(i, 1)$;

$A(i + 1, j + 1) = A(i, A(i + 1, j))$.
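
The three defining equations translate directly into a recursive program. The following is a minimal sketch, with the function name `ackermann` and the use of memoization being choices made here for illustration. Only very small arguments are practical, because the values and the recursion depth grow extremely fast.

```python
import sys
from functools import lru_cache

# Generous limit for the small arguments used below; larger arguments
# will still exhaust the stack.
sys.setrecursionlimit(10_000)


@lru_cache(maxsize=None)
def ackermann(i: int, j: int) -> int:
    """Ackermann function, following the three defining equations."""
    if i == 0:
        return j + 1                                   # A(0, j) = j + 1
    if j == 0:
        return ackermann(i - 1, 1)                     # A(i+1, 0) = A(i, 1)
    return ackermann(i - 1, ackermann(i, j - 1))       # A(i+1, j+1) = A(i, A(i+1, j))


print([ackermann(i, 3) for i in range(4)])
```

For fixed small first arguments the function has simple closed forms, for instance A(1, j) = j + 2 and A(2, j) = 2j + 3, which gives a quick sanity check on the implementation.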


Facts: 

1. The standard integer functions of arithmetic, including addition, subtraction, mul- 
tiplication, division, and exponentiation, are all primitive recursive functions. 

2. A function is a partial recursive function if and only if it can be represented by a 
Turing machine. 

3. There are several other models of computation that are equivalent to partial recur- 
sive functions and to Turing machines, including labeled Markov algorithms and Post 
production systems (see [BrLa74]). 

4. Church’s thesis identifies formal concepts (recursive functions and Turing machines) 
with the intuitive concept of what is computable, so it is not something that is subject 
to proof. 

5. Church’s thesis is often invoked in the proof of theorems about computable functions 
to avoid dealing with low-level details of the model of computation. 

6. The Ackermann function is recursive but not primitive recursive. 

7. The Ackermann function grows faster than any primitive recursive function, in the
following sense. For every primitive recursive function $f(n)$, there is an integer $n_0$ such
that $f(n) < A(n, n)$ for all $n > n_0$.

Examples: 

1. Addition is primitive recursive.

$a(x, 0) = \pi_1^1(x)$;

$a(x, y + 1) = \sigma(\pi_3^3(x, y, a(x, y)))$.

Then $a(x, y) = x + y$.

2. Multiplication is primitive recursive.

$m(x, 0) = 0$;

$m(x, y + 1) = a(m(x, y), \pi_1^2(x, y))$, where $a(x, y)$ is addition.

Then $m(x, y) = x \cdot y$.

3. Predecessor is primitive recursive.

$p(0) = 0$;

$p(x + 1) = \pi_1^1(x)$.

Then $p(x) = x \dot{-} 1$, that is, $x - 1$ for $x \ge 1$ and $0$ for $x = 0$.

4. Nonnegative subtraction is primitive recursive.

$s(x, 0) = \pi_1^1(x)$;

$s(x, y + 1) = p(s(x, y))$.

Then $s(x, y) = x \dot{-} y$.

5. The function $p(n)$ = the $n$th prime number is a primitive recursive function.

6. The Ackermann function is recursive but not primitive recursive. 
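
Examples 1 through 4 can be replayed mechanically. The sketch below builds addition, multiplication, predecessor, and nonnegative subtraction from the initial functions using two combinators. The names `compose`, `prim_rec`, `proj`, and `monus` are hypothetical, and the primitive recursion scheme assumed is the usual one, f(xs, 0) = base(xs) and f(xs, y+1) = step(xs, y, f(xs, y)).

```python
def zero(*xs):
    """Constant zero function of any arity."""
    return 0


def succ(n):
    """Successor function."""
    return n + 1


def proj(i):
    """Projection onto the i-th argument (1-based), for any arity."""
    return lambda *xs: xs[i - 1]


def compose(h, *gs):
    """Multivariate composition: xs -> h(g_1(xs), ..., g_k(xs))."""
    return lambda *xs: h(*(g(*xs) for g in gs))


def prim_rec(base, step):
    """Primitive recursion on the last argument.

    f(xs, 0) = base(xs);  f(xs, y + 1) = step(xs, y, f(xs, y)).
    """
    def f(*args):
        *xs, y = args
        acc = base(*xs)
        for k in range(y):           # unfold the recursion iteratively
            acc = step(*xs, k, acc)
        return acc
    return f


add = prim_rec(proj(1), compose(succ, proj(3)))        # a(x, y+1) = succ(a(x, y))
mul = prim_rec(zero, compose(add, proj(3), proj(1)))   # m(x, y+1) = a(m(x, y), x)
pred = prim_rec(zero, proj(1))                         # p(x+1) = x
monus = prim_rec(proj(1), compose(pred, proj(3)))      # s(x, y+1) = p(s(x, y))

print(add(2, 3), mul(3, 4), pred(5), monus(5, 3), monus(3, 5))
```

The loop in `prim_rec` is only an evaluation strategy; it computes the same values as the recursive equations and avoids deep call stacks.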


16.2.2 RECURSIVE SETS AND SOLVABLE PROBLEMS 
Definitions: 

The characteristic function of a set $A$ is the function

$$f(x) = \begin{cases} 1 & \text{if } x \in A; \\ 0 & \text{if } x \notin A. \end{cases}$$

A set $A$ is recursive if its characteristic function is recursive.

A problem is (computationally) solvable if it can be represented as a membership
problem that can be decided by a recursive function.

A set $A \subseteq \mathcal{N}$ is recursively enumerable (r.e.) if $A = \emptyset$ or $A$ is the image of a
recursive function.

A Gödel numbering of a set $S$ is a one-to-one recursive function $g\colon S \to \mathcal{N}$ whose
image in $\mathcal{N}$ is a recursive set.

Facts: 

1. If a set $A$ and its complement $\overline{A}$ are both recursively enumerable, then $A$ is recursive.

2. If a recursively enumerable set $A$ is the image of a non-decreasing function, then
$A$ is recursive.

3. A set is recursively enumerable if and only if it is the image of a partial recursive
function.

4. A set is recursively enumerable if and only if it is the domain of a partial recursive
function.

5. The set of Turing machines has a Gödel numbering.

Examples: 

1. Every finite set of numbers is recursive. 

2. The prime numbers are a recursive set.

3. The problem of deciding which Turing machines halt on all inputs is unsolvable.
The set of Gödel numbers for these Turing machines is neither recursive nor recursively
enumerable.

4. For any fixed $c \in \mathcal{N}$, the problem of deciding which Turing machines halt when the
number $c$ is supplied as input is unsolvable. The set of Gödel numbers for these Turing
machines is recursively enumerable, but not recursive.

5. The problem of deciding which Turing machines halt when their own Gödel number
is supplied as input is unsolvable. The set of Gödel numbers for these Turing machines
is recursively enumerable, but not recursive.

6. Hilbert’s tenth problem: Hilbert’s tenth problem (posed in 1900) was the problem
of devising an algorithm to determine, given a polynomial $p(x_1, \ldots, x_n)$ with integer
coefficients, whether there exists an integer root. Y. Matiyasevich proved in 1970 that
no such algorithm exists. That is, the set of polynomials with integer coefficients that
have an integer solution is not recursive. Hilbert’s tenth problem is called a “natural
example” of an unsolvable problem, since the concepts used to define it are not from
within computability theory (i.e., unlike problems concerned with the behavior of Turing
machines). [Ma93]
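
As an illustration of why Example 6 is hard, the sketch below searches for an integer root by trying all tuples in growing boxes around the origin. The function name `find_integer_root` and the cutoff `max_radius` are assumptions introduced here so that the code always terminates. Without the cutoff the search would halt exactly on the polynomials that have an integer root, which is a standard way of seeing that this set is recursively enumerable (an observation not stated in the text above).

```python
from itertools import product


def find_integer_root(p, nvars, max_radius):
    """Search for integers xs with p(*xs) == 0 and max |x_i| <= max_radius.

    Returns the first root found, or None if the bounded search fails.
    A None result says nothing about roots outside the searched box.
    """
    for r in range(max_radius + 1):
        for xs in product(range(-r, r + 1), repeat=nvars):
            # Only test tuples on the boundary of the current box,
            # so that each tuple is examined exactly once overall.
            if max(abs(x) for x in xs) == r and p(*xs) == 0:
                return xs
    return None


# Hypothetical inputs: x^2 + y^2 - 25 has integer roots, x^2 - 2 has none.
print(find_integer_root(lambda x, y: x * x + y * y - 25, 2, 10))
print(find_integer_root(lambda x: x * x - 2, 1, 50))
```

The bounded version can confirm that a root exists but can never certify that none exists, which is the gap that Matiyasevich's theorem shows cannot be closed in general.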
16.3 LANGUAGES AND GRAMMARS

Strings of symbols are a general way to represent information, both in written text and 
in a computer. A language is a set of strings that are used within some domain of 
discourse, and a grammar is a system for generating a language. A grammar is what 
enables a compiler to determine whether a body of code is syntactically correct in a given 
computer language. Formal language theory is concerned with languages, grammars, 
and rudiments of combinatorics on strings. The range of applications of formal language 
theory extends from natural and programming languages, developmental biology, and 
computer graphics to semiotics, artificial intelligence, and artificial life. 


16.3.1 ALPHABETS 
Definitions: 

An alphabet is a finite nonempty set. 

A symbol is an element of an alphabet. 

A string in an alphabet is a finite sequence of symbols over that alphabet. 

A word in an alphabet is a finite or countably infinite sequence of symbols over that 
alphabet. 

The empty string $\lambda$ is the string of length zero, that is, the string with no symbols.

The length of a string $w$ is the number of symbols in $w$, denoted $|w|$.

The frequency $|w|_a$ of a symbol $a$ in a string $w$ is the number of occurrences of $a$ in
string $w$.

A substring of a string w is a sequence of consecutive symbols that occurs in w. 

A subword of a word w is a sequence of consecutive symbols that occurs in w. 

A prefix of a string w is a substring that starts at the leftmost symbol. 

A suffix of a string w is a substring that ends at the rightmost symbol. 

The reverse or mirror image $x^R$ of the string $x = a_1 a_2 \cdots a_n$ is the string $a_n \cdots a_2 a_1$.

A palindrome is a string that is identical to its reverse.

A pseudopalindrome in an alphabet (such as English) that includes punctuation 
symbols (such as comma, hyphen, or blank) is a word that becomes a palindrome when 
all of its punctuation symbols are deleted. 

The concatenation $xy$ of two strings $x = a_1 a_2 \cdots a_m$ and $y = b_1 b_2 \cdots b_n$ is the string
$a_1 a_2 \cdots a_m b_1 b_2 \cdots b_n$ obtained by appending string $y$ to the right of string $x$.

The $n$th power of a string $w$, denoted $w^n$, is the concatenation of $n$ copies of $w$.

The shuffle $x \sqcup\!\sqcup y$ of two strings $x = x_1 \ldots x_n$ and $y = y_1 \ldots y_n$ is the string $x_1 y_1 \ldots x_n y_n$.

Facts: 

1. A symbol of an alphabet is usually conceptualized as something that can be
represented by a single byte or by a written character.

2. A finite word is a string.

3. The sum of the frequencies $|w|_a$ taken over all the symbols $a$ in the alphabet equals
the length of the string $w$.

4. The length $|xy|$ of a concatenation equals the sum $|x| + |y|$ of the lengths of the
strings $x$ and $y$ from which it is formed.

5. The length of the $n$th power of a string $w$ is $n \cdot |w|$.

6. When a pseudopalindrome occurs in a natural language, it is commonly called a 
palindrome. 

Examples: 

1. The English alphabet includes lower and upper case English letters, the blank sym- 
bol, the digits 0, 1, . . . , 9, and various punctuation symbols. 

2. ASCII (American Standard Code for Information Interchange) is an alphabet of
size 128 for many common computer languages. See Table 1.

3. ABLE WAS I ERE I SAW ELBA is a palindrome. 

4. The names EVE, HANNAH, and OTTO are palindromes. 

5. MADAM I’M ADAM and SIX AT-NOON TAXIS are pseudopalindromes. 

6. A list of palindromes can be found on the website

http://freenet.buffalo.edu/~cd431/palindromes.html

7. The third power of the string 011 is 011011011.

8. The concatenation of BOOK and KEEPER is BOOKKEEPER. 

9. The shuffle of FLOOD and RIVER is FRLIOVOEDR. 
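
The string operations defined in this subsection are easy to check against the examples with a short program. The sketch below uses hypothetical helper names, and the set of characters treated as punctuation in `is_pseudopalindrome` (ASCII punctuation plus the blank) is an assumption.

```python
import string


def reverse(w: str) -> str:
    """Mirror image of w."""
    return w[::-1]


def is_palindrome(w: str) -> bool:
    return w == reverse(w)


def is_pseudopalindrome(w: str, punctuation: str = string.punctuation + " ") -> bool:
    """True if w becomes a palindrome once punctuation symbols are deleted."""
    stripped = "".join(c for c in w if c not in punctuation)
    return is_palindrome(stripped)


def power(w: str, n: int) -> str:
    """n-th power: concatenation of n copies of w."""
    return w * n


def shuffle(x: str, y: str) -> str:
    """Shuffle of two strings of equal length: x1 y1 x2 y2 ... xn yn."""
    if len(x) != len(y):
        raise ValueError("shuffle is defined here for strings of equal length")
    return "".join(a + b for a, b in zip(x, y))


def frequency(w: str, a: str) -> int:
    """Number of occurrences of the symbol a in w."""
    return w.count(a)


print(power("011", 3), "BOOK" + "KEEPER", shuffle("FLOOD", "RIVER"))
print(is_palindrome("ABLE WAS I ERE I SAW ELBA"), is_pseudopalindrome("SIX AT-NOON TAXIS"))
```

Summing `frequency(w, a)` over the distinct symbols of `w` gives `len(w)`, which is Fact 3.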


16.3.2 LANGUAGES 
Definitions: 

The free monoid $V^*$ generated by the alphabet $V$ is the structure whose domain is
the set of all strings composable from symbols over $V$, with the semigroup operation of
string concatenation.

A (formal) language on the alphabet $V$ is a subset $L$ of the free monoid $V^*$.

The $\lambda$-free semigroup $V^+$ on an alphabet $V$ is the set $V^* - \{\lambda\}$, with the concatenation
operation.

A $\lambda$-free language on the alphabet $V$ is a subset of the $\lambda$-free semigroup $V^+$.

The length set of a language $L$ is the set $\mathrm{length}(L) = \{\, |x| \mid x \in L \,\}$.

The concatenation $L_1 L_2$ of two languages is the set $\{\, xy \mid x \in L_1,\ y \in L_2 \,\}$.

The $i$th power of a language $L$ is the language $L^i$ defined recursively by the rule
$L^0 = \{\lambda\}$ and $L^{i+1} = L^i L$, $i \ge 0$.

The Kleene closure (or Kleene star) $L^*$ of a language $L$ is the union $\bigcup_{i \ge 0} L^i$ of all
its powers.

The positive closure (or Kleene plus) $L^+$ of a language $L$ is the union $\bigcup_{i \ge 1} L^i$ of
all its powers excluding the zeroth power.

The union of two languages $L_1$ and $L_2$ is $L_1 \cup L_2$, using the usual set operation.

The intersection of two languages $L_1$ and $L_2$ is $L_1 \cap L_2$, using the usual set operation.
Table 1 ASCII codes.

000 0000  NUL     010 0000  SP     100 0000  @     110 0000  `
000 0001  SOH     010 0001  !      100 0001  A     110 0001  a
000 0010  STX     010 0010  "      100 0010  B     110 0010  b
000 0011  ETX     010 0011  #      100 0011  C     110 0011  c
000 0100  EOT     010 0100  $      100 0100  D     110 0100  d
000 0101  ENQ     010 0101  %      100 0101  E     110 0101  e
000 0110  ACK     010 0110  &      100 0110  F     110 0110  f
000 0111  BEL     010 0111  '      100 0111  G     110 0111  g
000 1000  BS      010 1000  (      100 1000  H     110 1000  h
000 1001  HT      010 1001  )      100 1001  I     110 1001  i
000 1010  LF      010 1010  *      100 1010  J     110 1010  j
000 1011  VT      010 1011  +      100 1011  K     110 1011  k
000 1100  FF      010 1100  ,      100 1100  L     110 1100  l
000 1101  CR      010 1101  -      100 1101  M     110 1101  m
000 1110  SO      010 1110  .      100 1110  N     110 1110  n
000 1111  SI      010 1111  /      100 1111  O     110 1111  o
001 0000  DLE     011 0000  0      101 0000  P     111 0000  p
001 0001  DC1     011 0001  1      101 0001  Q     111 0001  q
001 0010  DC2     011 0010  2      101 0010  R     111 0010  r
001 0011  DC3     011 0011  3      101 0011  S     111 0011  s
001 0100  DC4     011 0100  4      101 0100  T     111 0100  t
001 0101  NAK     011 0101  5      101 0101  U     111 0101  u
001 0110  SYN     011 0110  6      101 0110  V     111 0110  v
001 0111  ETB     011 0111  7      101 0111  W     111 0111  w
001 1000  CAN     011 1000  8      101 1000  X     111 1000  x
001 1001  EM      011 1001  9      101 1001  Y     111 1001  y
001 1010  SUB     011 1010  :      101 1010  Z     111 1010  z
001 1011  ESC     011 1011  ;      101 1011  [     111 1011  {
001 1100  FS      011 1100  <      101 1100  \     111 1100  |
001 1101  GS      011 1101  =      101 1101  ]     111 1101  }
001 1110  RS      011 1110  >      101 1110  ^     111 1110  ~
001 1111  US      011 1111  ?      101 1111  _     111 1111  DEL

Control codes: ACK: acknowledge, BEL: bell, BS: backspace, CAN: cancel, CR:
carriage return, DC1-4: device controls, DEL: delete, DLE: data link escape, EM:
end of medium, ENQ: enquiry, EOT: end of transmission, ESC: escape, ETB: end
of transmission block, ETX: end of text, FF: form feed, FS: file separator, GS:
group separator, HT: horizontal tab, LF: line feed, NAK: negative acknowledgment,
NUL: null, RS: record separator, SI: shift in, SO: shift out, SOH: start of heading,
SP: space, STX: start of text, SUB: substitute, SYN: synchronous idle, US: unit
separator, VT: vertical tab


The complement of a language $L$ over an alphabet $V$ is the language $\overline{L}$, where
complementation is taken with respect to the free monoid $V^*$ as the universe of discourse.

A language is regular if it is any of the languages $\emptyset$, $\{\lambda\}$, or $\{b\}$, where $b$ is a symbol of
its alphabet, or if it can be obtained by applying the operations of union, concatenation,
and Kleene star finitely many times to one of those languages.

The shuffle $L_1 \sqcup\!\sqcup L_2$ of two languages $L_1, L_2$ is the language
$\{\, w \in V^* \mid w = x \sqcup\!\sqcup y, \text{ for some } x \in L_1,\ y \in L_2 \,\}$.

The mirror image $\mathrm{mi}(L)$ of the language $L$ is the language $\{\, x^R \mid x \in L \,\}$. It is also
called the reverse of the language $L$.

The left quotient of the language $L_1$ with respect to the language $L_2$ on the same
alphabet $V$ is the language $L_2 \backslash L_1$ containing every string of $V^*$ that can be
obtained from a string in $L_1$ by erasing a prefix from $L_2$. That is,
$L_2 \backslash L_1 = \{\, w \in V^* \mid \text{there is } x \in L_2 \text{ such that } xw \in L_1 \,\}$.

The left derivative of language $L$ with respect to the string $x$ over the same alphabet $V$
is the language $\partial_x(L) = \{x\} \backslash L$.

The right quotient is the notion symmetric to left quotient. 

The right derivative is the notion symmetric to left derivative. 

A substitution for the alphabet $V$ in the alphabet $U$ is a mapping $s\colon V \to 2^{U^*}$. This
means that each symbol $b \in V$ may be replaced by any of the strings in the set $s(b)$.

A finite substitution is a substitution such that the replacement set $s(a)$ for each
symbol $a \in V$ is finite.

The extension of a substitution $s\colon V \to 2^{U^*}$ from its domain alphabet $V$ to the
set $V^*$ of strings over $V$ is given by the rules $s(\lambda) = \{\lambda\}$ and $s(ax) = s(a)s(x)$, for
$a \in V$, $x \in V^*$.

A morphism from the alphabet $V$ to the alphabet $U$ is a substitution $s\colon V \to 2^{U^*}$
such that the replacement set $s(a)$ for every symbol $a \in V$ is a singleton set.

A $\lambda$-free substitution is a substitution such that $\lambda$ is never substituted for a symbol.
That is, $\lambda \notin s(a)$, for every symbol $a \in V$.

A $\lambda$-free morphism is a morphism such that $s(a) \ne \{\lambda\}$, for every symbol $a \in V$.

The extension of a substitution $s\colon V \to 2^{U^*}$ to the language $L \subseteq V^*$ is the language
$s(L) = \bigcup_{x \in L} s(x)$ that contains every string in $U^*$ obtainable from a string in $L$ by
making replacements permissible under substitution $s$.

The inverse of a morphism $h\colon V^* \to U^*$ is the mapping $h^{-1}\colon U^* \to 2^{V^*}$ defined by
$h^{-1}(x) = \{\, y \in V^* \mid h(y) = x \,\}$, $x \in U^*$.

A family of languages is nontrivial if it contains at least one language different from $\emptyset$
and $\{\lambda\}$.
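
For finite languages, several of the operations defined in this subsection can be computed directly on Python sets of strings, with the empty string `""` playing the role of the empty string of the text. The sketch below is illustrative only: the function names are hypothetical, and since the Kleene closure is infinite in general, `kleene_star_upto` returns just the strings of the closure up to a chosen length.

```python
def concat(L1, L2):
    """Concatenation of two languages."""
    return {x + y for x in L1 for y in L2}


def lang_power(L, i):
    """i-th power of L, with the zeroth power equal to {""}."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result


def kleene_star_upto(L, max_len):
    """All strings of the Kleene closure of L having length <= max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more factor from L.
        frontier = {w for w in concat(frontier, L) if len(w) <= max_len} - result
        result |= frontier
    return result


def left_quotient(L2, L1):
    """Left quotient of L1 with respect to L2: erase a prefix belonging to L2."""
    return {w[len(x):] for w in L1 for x in L2 if w.startswith(x)}


def mirror(L):
    """Mirror image of a language."""
    return {w[::-1] for w in L}


pairs = {"00", "01", "10", "11"}
print(sorted(kleene_star_upto(pairs, 2)))
print(left_quotient({"bee"}, {"beef", "beetle", "bat"}))
```

The frontier-based loop stops as soon as no new string of admissible length appears, so it terminates even when the language contains the empty string.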

Facts: 

1. The set of all binary strings with at least as many 1s as 0s is a language.

2. The set of all binary strings in which no two occurrences of 1 are consecutive is a 
language. 

3. Some strings of a natural language such as English are categorized as nouns, verbs, 
and adjectives. Other more complicated strings are categorized as sentences. 

4. Some strings of common computer languages are categorized as identifiers and arith- 
metic expressions. Other more complicated strings are categorized as statements, with 
subcategories such as assignment statements, if-statements, and while-statements. 

Examples: 

1. Natural languages and computer languages are formal languages. 

2. The Kleene closure of the language {00, 01, 10, 11} is the language of all strings of 
even length. 
3. The left derivative {bee}\English includes the following strings: f, n, p, r, s, t, tle,
ts, keeper, swax, feater, ping.

4. The substitution $0 \mapsto \{0, 01\}$, $1 \mapsto \{0, 11\}$ over the free monoid $\{0, 1\}^*$ is the
language of all strings of even length.

5. Given the alphabet $\{a, b\}$, define the morphism $\phi\colon \{a, b\} \to \{a, b\}^*$ by the
replacements $\phi(a) = ab$ and $\phi(b) = ba$, and define the string $w_n$ by the recursion $w_0 = a$ and
$w_{n+1} = \phi(w_n)$. Then,

$w_1 = ab$, $w_2 = abba$, $w_3 = abbabaab$, $w_4 = abbabaabbaababba$, ....

6. In Example 5 each word $w_n$ is a prefix of the next word $w_{n+1}$. The Thue $\omega$-word is
the infinite word $\lim_{n \to \infty} w_n$.

7. Given the alphabet $\{a, b\}$, define the morphism $\phi\colon \{a, b\} \to \{a, b\}^*$ by the
replacements $\phi(a) = ab$ and $\phi(b) = a$, and define the string $w_n$ by the recursion $w_0 = a$ and
$w_{n+1} = \phi(w_n)$. Then,

$w_1 = ab$, $w_2 = aba$, $w_3 = abaab$, $w_4 = abaababa$, ....

8. In Example 7 each word $w_n$ is a prefix of the next word $w_{n+1}$. The Fibonacci $\omega$-word
is the infinite word $\lim_{n \to \infty} w_n$.

9. A language is regular if and only if it is the language of strings accepted by some 
finite state recognizer. 
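
The iterated morphisms of Examples 5 and 7 can be reproduced with a dictionary that maps each symbol to its replacement string. This is a small sketch; the names `apply_morphism`, `iterate`, `THUE`, and `FIBONACCI` are chosen here for illustration.

```python
def apply_morphism(h, w):
    """Extend a symbol-to-string map h to the whole string w."""
    return "".join(h[c] for c in w)


def iterate(h, start, n):
    """Return [w_0, w_1, ..., w_n] with w_0 = start and w_{k+1} = h(w_k)."""
    words = [start]
    for _ in range(n):
        words.append(apply_morphism(h, words[-1]))
    return words


THUE = {"a": "ab", "b": "ba"}        # morphism of Example 5
FIBONACCI = {"a": "ab", "b": "a"}    # morphism of Example 7

print(iterate(THUE, "a", 4))
print(iterate(FIBONACCI, "a", 4))
```

Because each word is a prefix of the next, the sequence determines an infinite word one symbol at a time, which is the limit referred to in Examples 6 and 8. The lengths of the Fibonacci words, 1, 2, 3, 5, 8, ..., are Fibonacci numbers.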


16.3.3 GRAMMARS AND THE CHOMSKY HIERARCHY 
Definitions: 

A phrase-structure grammar (or unrestricted grammar or type 0 grammar)
is a quadruple $G = (N, T, S, P)$ such that:

• $N$ is a finite nonempty alphabet of symbols called nonterminals;

• $T$ is a finite nonempty alphabet, disjoint from $N$, of symbols called terminals;

• $S$ is a nonterminal called the start symbol;

• $P$ is a finite set of production rules of the form $\alpha \to \beta$, where $\alpha$ is a string in
$(N \cup T)^*$ that contains at least one nonterminal and $\beta$ is a string in $(N \cup T)^*$.

The antecedent of a production $\alpha \to \beta$ is $\alpha$.

The consequent of a production $\alpha \to \beta$ is $\beta$.

The string $y$ is directly derivable from the string $x$ with respect to the grammar $G$
if there is a production rule $u \to v \in P$ and if there are strings $w_1, w_2 \in (N \cup T)^*$ such
that $x = w_1 u w_2$ and $y = w_1 v w_2$.

The direct derivability relation $x \Rightarrow_G y$ (or $x \Rightarrow y$, when the grammar $G$ is
implicitly understood) means that $y$ is directly derivable from string $x$.

A derivation of the string $y$ from the string $x$ is a sequence of direct derivations
$x \Rightarrow z_1, z_1 \Rightarrow z_2, \ldots, z_n \Rightarrow y$. This is sometimes called parsing.

The string $y$ is derivable from the string $x$ with respect to the grammar $G$ if there is
a derivation of $y$ from $x$. Notation: $x \Rightarrow^* y$.

The Chomsky normal form for a production rule is the form $A \to BC$, where $B$ and $C$
are nonterminals, or the form $A \to a$, where $a$ is a terminal.

The language generated by the grammar $G$ is the language
$L(G) = \{\, x \in T^* \mid S \Rightarrow^* x \,\}$.
Grammars $G_1$ and $G_2$ are equivalent if $L(G_1) = L(G_2)$.

A leftmost derivation $x \Rightarrow_{\text{left}} y$ is a derivation $x \Rightarrow^* y$ in which at each step the
leftmost nonterminal is replaced.

The leftmost language generated by the grammar $G$ is the language $L_{\text{left}}(G)$ of
strings of terminals with leftmost derivations from the start symbol $S$.

A grammar $G = (N, T, S, P)$ is length-increasing (or of type 1) if $|u| \le |v|$ for all
$u \to v \in P$. (However, the production $S \to \lambda$ is allowed, provided that $S$ does not
appear in the consequents of rules in $P$.)

A grammar $G = (N, T, S, P)$ is context-sensitive if for each production $u \to v \in P$,
the antecedent and consequent have the form $u = u_1 A u_2$ and $v = u_1 x u_2$, for
$u_1, u_2 \in (N \cup T)^*$, $A \in N$, $x \in (N \cup T)^+$. (The production $S \to \lambda$ is allowed, provided that $S$
does not appear in the right-hand members of rules in $P$.)

A grammar $G = (N, T, S, P)$ is context-free (or of type 2) if the antecedent of each
production $u \to v \in P$ is a nonterminal.

An L-system is a production-based model for growth and life development.

A grammar $G = (N, T, S, P)$ is monotonic if the consequent of each production (except
possibly $S \to \lambda$) has at least as many symbols as the antecedent, and $S$ does not occur
in any consequent.

A grammar $G = (N, T, S, P)$ is linear if each production $u \to v \in P$ has its antecedent
$u \in N$ and its consequent $v \in T^* \cup T^* N T^*$.

A grammar $G = (N, T, S, P)$ is right-linear if each production $u \to v \in P$ has $u \in N$
and $v \in T^* \cup T^* N$.

A grammar $G = (N, T, S, P)$ is left-linear if each production $u \to v \in P$ has $u \in N$
and $v \in T^* \cup N T^*$.

A grammar $G = (N, T, S, P)$ is regular (or type 3) if each rule $u \to v \in P$ has $u \in N$
and $v \in T \cup TN \cup \{\lambda\}$.

Given a class of grammars, there are some basic decision problems about arbitrary
grammars $G_1, G_2$ in the class:

equivalence: are the grammars $G_1$ and $G_2$ equivalent?
inclusion: is the language $L(G_1)$ included in the language $L(G_2)$?
membership: given an arbitrary string $x$, is $x$ an element of $L(G_1)$?
emptiness: is the language $L(G_1)$ empty?
finiteness: is the language $L(G_1)$ finite?
regularity: is $L(G_1)$ a regular language? (see §16.3.2)

The recursive languages are the languages with a decidable membership question.

The various classes of languages are denoted as follows:

RE (type 0): the class of all unrestricted languages;

CS (type 1): the class of all context-sensitive languages;

CF (type 2): the class of all context-free languages;

LIN: the class of all linear languages;

REG (type 3): the class of all regular languages.
Table 2 Closure properties for Chomsky hierarchy classes.

                                              RE    CS    CF    LIN   REG
union                                         yes   yes   yes   yes   yes
intersection                                  yes   yes   no    no    yes
complement                                    no    yes   no    no    yes
concatenation                                 yes   yes   yes   no    yes
Kleene star                                   yes   yes   yes   no    yes
intersection with regular languages           yes   yes   yes   yes   yes
substitution                                  yes   no    yes   no    yes
λ-free substitution                           yes   yes   yes   no    yes
morphisms                                     yes   no    yes   yes   yes
λ-free morphisms                              yes   yes   yes   yes   yes
inverse morphisms                             yes   yes   yes   yes   yes
left/right quotient                           yes   no    no    no    yes
left/right quotients with regular languages   yes   no    yes   yes   yes
left/right derivative                         yes   yes   yes   yes   yes
shuffle                                       yes   yes   no    no    yes
mirror image                                  yes   yes   yes   yes   yes


Facts: 

1. Chomsky hierarchy: The following strict inclusions hold:

REG $\subset$ LIN $\subset$ CF $\subset$ CS $\subset$ RE.

2. The language of an unrestricted grammar is recursively enumerable (RE).

3. CS (context sensitive) $\subset$ {recursive languages} $\subset$ RE (unrestricted).

4. The class of languages generated by monotonic grammars is identical to the class of
languages generated by context-sensitive grammars.

5. L-systems were introduced in 1968 by Aristid Lindenmayer (1925-1989), a
Hungarian-born biologist who worked in the Netherlands, to model the development of
some plant systems. (See [Gr97].)

6. The classes of languages generated by right-linear or by left-linear grammars
coincide. This class is identical to the family of languages generated by regular grammars,
as well as to the class of regular languages (§16.3.2).

7. $L_{\text{left}}(G) \in$ CF (context-free) for each type-0 grammar $G$.

8. If $G$ is a context-free grammar, then $L_{\text{left}}(G) = L(G)$.

9. Let $G$ be a context-free grammar. Then there is a grammar $G' = (N, T, S, P)$,
with every rule in Chomsky normal form. Moreover, there is a constructive method for
transforming grammar $G$ into the grammar $G'$.

10. Rice’s theorem: Let $P$ be a nontrivial property of recursively enumerable languages
(i.e., a property such that there exists at least one grammar having property $P$ and at
least one grammar not having property $P$). Then property $P$ is undecidable.

11. A language is context-free if and only if it is the language accepted by some
(possibly nondeterministic) pushdown automaton.
12. The following table summarizes the decidability properties of the grammar classes
in the Chomsky hierarchy. In this table U stands for undecidable, D for decidable, and
T for trivial.

                RE         CS         CF         LIN    REG
                (type 0)   (type 1)   (type 2)          (type 3)
equivalence     U          U          U          U      D
inclusion       U          U          U          U      D
membership      U          D          D          D      D
emptiness       U          U          D          D      D
finiteness      U          U          D          D      D
regularity      U          U          U          U      T
intersection    yes        yes        no         no     yes
complement      no         yes        no         no     yes


Examples: 

1. In the grammar $G = (N, T, S, P)$, where $N = \{S, x, y\}$, $T = \{0, 1\}$, and
$P = \{S \to 0S1,\ S \to \lambda\}$, a derivation of the string 0011 is $S \Rightarrow 0S1 \Rightarrow 00S11 \Rightarrow 0011$.

2. The following are examples of languages generated by grammar $G = (N, T, S, P)$
with $N = \{S, x, y, z\}$, $T = \{0, 1, 2\}$, and the following sets $P$ of productions:

production set P                                language L(G)                class
S → 0x, x → 1y, y → 0x, x → 1, y → λ            {01, 0101, 010101, ...}      regular
S → λ, S → 0x, S → 01, x → S1                   {0^n 1^n | n ≥ 0}            linear
S → λ, S → 0Sx2, 2x → x2, 0x → 01, 1x → 11      {0^n 1^n 2^n | n ≥ 0}        unrestricted
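
The derivations in these examples can be explored by a breadth-first search over sentential forms. The sketch below is a minimal illustration rather than a general parser: the name `generate`, the encoding of each symbol as a single character, and the pruning parameter `slack` (how much longer than the target length a sentential form may grow) are all assumptions made here. For the grammars above a slack of 1 suffices, but for other grammars with erasing rules a larger value might be needed.

```python
from collections import deque


def generate(productions, start, terminals, max_len, slack=1):
    """Enumerate terminal strings of length <= max_len derivable from start.

    productions: list of (antecedent, consequent) pairs of strings.
    Sentential forms longer than max_len + slack are discarded.
    """
    seen = {start}
    queue = deque([start])
    language = set()
    while queue:
        form = queue.popleft()
        if all(c in terminals for c in form):
            if len(form) <= max_len:
                language.add(form)
            continue
        for lhs, rhs in productions:
            pos = form.find(lhs)
            while pos != -1:                       # rewrite every occurrence of lhs
                new = form[:pos] + rhs + form[pos + len(lhs):]
                if len(new) <= max_len + slack and new not in seen:
                    seen.add(new)
                    queue.append(new)
                pos = form.find(lhs, pos + 1)
    return language


# Example 1: S -> 0S1, S -> (empty string)
P1 = [("S", "0S1"), ("S", "")]
# Third row of Example 2
P3 = [("S", ""), ("S", "0Sx2"), ("2x", "x2"), ("0x", "01"), ("1x", "11")]

print(sorted(generate(P1, "S", set("01"), 4)))
print(sorted(generate(P3, "S", set("012"), 6)))
```

The search terminates because only finitely many sentential forms fit under the length bound and each is visited at most once.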


16.3.4 REGULAR AND CONTEXT-FREE LANGUAGES 
Definitions: 

Given an alphabet $V$, a regular expression over $V$ is a string $w$ over the alphabet
$V \cup \{\, e, ), (, +, * \,\}$ that has one of the following forms:

• $w \in V$ or $w = e$;

• $w = (\alpha\beta)$, where $\alpha$ and $\beta$ are regular expressions;

• $w = (\alpha + \beta)$, where $\alpha$ and $\beta$ are regular expressions;

• $w = \alpha^*$, where $\alpha$ is a regular expression.

The set of all regular expressions over alphabet $V$ is denoted $\mathrm{rex}_V$.

The function $L$ maps $\mathrm{rex}_V$ to the set of all languages over the alphabet $V$, using the
following rules:

• $L(e) = \emptyset$, and $L(a) = \{a\}$ for all $a \in V$;

• $L((\alpha\beta)) = L(\alpha)L(\beta)$, $L((\alpha + \beta)) = L(\alpha) \cup L(\beta)$, and $L(\alpha^*) = (L(\alpha))^*$.

A context-free grammar $G$ is ambiguous if there is a string $x \in L(G)$ having two different
leftmost derivations in $G$.

A context-free language $L$ is inherently ambiguous if every context-free grammar
of $L$ is ambiguous; otherwise, language $L$ is called unambiguous.


Facts: 

1. Kleene theorem: A language $L$ is regular if and only if there is a regular expression $e$
such that $L = L(e)$.

2. Every context-free language over a one-letter alphabet is regular.

3. Every regular language $L$ can be represented in the form $L = h_4(h_3^{-1}(h_2(h_1^{-1}(a^*b))))$,
where $h_1, h_2, h_3, h_4$ are morphisms.

4. Each regular language is unambiguous.

5. There are inherently ambiguous linear languages.

6. The ambiguity problem for context-free grammars is undecidable.

7. The length set of a context-free language is a finite union of arithmetical
progressions.

8. Every language $L$ can be represented in the form $L = h(L_1 \cap L_2)$, as well as in the
form $L = L_3 \backslash L_4$, where $h$ is a morphism and $L_1, L_2, L_3, L_4$ are linear languages.

9. Pumping lemma for regular languages: If $L$ is a regular language over the
alphabet $V$, then there are numbers $p$ and $q$ such that every string $z \in L$ with length $|z| > p$
can be written in the form $z = uvw$, with $u, v, w \in V^*$, where $|uv| \le q$, $v \ne \lambda$, so that
$uv^i w \in L$ for all $i \ge 0$.

10. Pumping lemma for linear languages: If $L$ is a linear language on the alphabet $V$,
then there are numbers $p$ and $q$ such that every string $z \in L$ with length $|z| > p$ can be
written in the form $z = uvwxy$, with $u, v, w, x, y \in V^*$, where $|uvxy| \le q$ and $vx \ne \lambda$,
so that $uv^i w x^i y \in L$ for all $i \ge 0$.

11. Bar-Hillel (uvwxy, pumping) lemma for context-free languages: If $L$ is a
context-free language over the alphabet $V$, then there are numbers $p$ and $q$ such that every string
$z \in L$ with length $|z| > p$ can be written in the form $z = uvwxy$, with $u, v, w, x, y \in V^*$,
where $|vwx| \le q$ and $vx \ne \lambda$, so that $uv^i w x^i y \in L$ for all $i \ge 0$.

12. Ogden pumping lemma (pumping with marked positions): If $L$ is a context-free
language on the alphabet $V$, then there is a number $p$ such that for every string $z \in L$ and
for every set of at least $p$ marked occurrences of symbols in $z$, we can write $z = uvwxy$,
where:

• either each of $u, v, w$ or each of $w, x, y$ contains at least one marked symbol;

• $vwx$ contains at most $p$ marked symbols;

• $uv^i w x^i y \in L$ for all $i \ge 0$.

13. Let $G$ be a context-free grammar. Then there is a grammar $G' = (N, T, S, P)$,
with every rule in $P$ of the form $A \to a\alpha$, for $A \in N$, $a \in T$, $\alpha \in (N \cup T)^*$, such
that $L(G') = L(G) - \{\lambda\}$. Moreover, there is a constructive method for transforming
grammar $G$ into the grammar $G'$, which is said to be in the Greibach normal form.

14. Let $G$ be a context-free grammar and $(k, l, m)$ a triple of nonnegative integers.
Then an equivalent grammar $G' = (N, T, S, P)$ can be effectively constructed whose
every rule is in one of the following two forms:

• $A \to xByCz$, with $A, B, C \in N$, $x, y, z \in T^*$, and $|x| = k$, $|y| = l$, $|z| = m$;

• $A \to x$, with $A \in N$, $x \in T^*$, $|x| \in \mathrm{length}(L(G))$.

Such a grammar $G'$ is said to be in super normal form.

15. Variants of the Chomsky and Greibach normal forms can be obtained by
particularizing the parameters $k, l, m$ in the super normal form.


Examples: 

1. The following are some regular expressions over {0, 1} and the languages they
represent:

1*                    all strings with no 0s
1*01*                 all strings with exactly one 0
1*(0 + e)1*           all strings with one or no 0s
(0 + 1)(0 + 1)        all strings of length 2
(0 + 1)(0 + 1 + e)    all strings of length 1 or 2.


2. Backus-Naur form (BNF) (or Backus normal form) for specifying computer language
syntax uses context-free production rules. Nonterminals are enclosed in angle brackets; the
symbol ::= is used in place of →; and all the consequents of the same antecedent are
written in the same statement with the alternative consequents separated by vertical
bars. For instance, in some programming languages, this might be the BNF for the
lexical token called an identifier.

<identifier> ::= <letter> | <letter> <alphameric string>
<letter> ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
<alphameric string> ::= <alphameric> | <alphameric string> <alphameric>
<alphameric> ::= <letter> | <digit>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
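
The regular expressions of Example 1 can be checked against their descriptions by brute force over short binary strings. The sketch below uses Python's `re` module, whose syntax differs from the notation of this section: alternation is written `|` instead of `+`, and the symbol e in the example table is read here as matching the empty string and is written as an empty alternative. That reading of e, and the names `EXAMPLES`, `agrees`, and `IDENTIFIER`, are assumptions of this sketch. The last pattern is one possible rendering of the identifier syntax of Example 2.

```python
import re
from itertools import product

# Python pattern -> predicate describing the intended language.
EXAMPLES = {
    r"1*": lambda w: w.count("0") == 0,
    r"1*01*": lambda w: w.count("0") == 1,
    r"1*(0|)1*": lambda w: w.count("0") <= 1,
    r"(0|1)(0|1)": lambda w: len(w) == 2,
    r"(0|1)(0|1|)": lambda w: len(w) in (1, 2),
}


def binary_strings(max_len):
    """Yield all strings over {0, 1} of length <= max_len."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            yield "".join(t)


def agrees(pattern, predicate, max_len=6):
    """Check that the pattern matches exactly the strings satisfying predicate."""
    rx = re.compile(pattern)
    return all((rx.fullmatch(w) is not None) == predicate(w)
               for w in binary_strings(max_len))


# A lowercase letter followed by any number of lowercase letters or digits.
IDENTIFIER = re.compile(r"[a-z][a-z0-9]*")

print(all(agrees(p, q) for p, q in EXAMPLES.items()))
print(bool(IDENTIFIER.fullmatch("x1")), bool(IDENTIFIER.fullmatch("1x")))
```

`fullmatch` is used rather than `search` because a regular expression in the sense of this section describes whole strings, not substrings.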


16.3.5 COMBINATORICS ON WORDS 

Note: In this subsection, a word is taken to be finite. 

Definitions: 

A (word) variable over an alphabet $V$ is a symbol (such as $x$ or $y$) not in $V$ whose
values range over $V^*$.

A pattern in a word is a string of word variables.

A pattern is present in a word $w \in V^*$ if there exists an assignment of strings from $V^+$
to the variables in that pattern such that the word formed thereby is a substring of $w$.

A square is a word of the pattern “$xx$”.

A square-free word is a word with no subwords of the pattern “$xx$”.

A cube is a word of the pattern “$xxx$”.

An Abelian square is a word of the form $x x^p$, where $x^p$ is any permutation of the
word $x$.

A word equation over an alphabet $V$ is an expression of the form $\alpha = \beta$ such that $\alpha$
and $\beta$ are words containing letters of an alphabet $V$ and some variables over $V$.

A word inequality is the negation of a word equation, which is commonly written in
the form $\alpha \ne \beta$.

A solution to a system $S$ of (finitely many) word equations and word inequalities is
a list of words whose substitutions for their respective variables converts every word
equation and word inequality in the system into a true proposition.

A code is a nonempty language $C \subseteq V^+$ such that whenever a word $w$ in $V^*$ can
be written as a catenation of words in $C$, the write-up is always unique. That is, if
$w = x_1 \ldots x_m = y_1 \ldots y_n$, where $m, n \ge 1$, and $x_i, y_j \in C$, then $m = n$ and $x_i = y_i$ for
$i = 1, \ldots, m$. This property is called unique decodability.

The code indicator of a word $w \in V^*$ is the number $\mathrm{ci}(w) = |V|^{-|w|}$.

The code indicator of a language is the sum of the code indicators of all words in
the language.

Facts: 

1. Certain patterns are unavoidable in sufficiently long words. 

2. Squares are avoidable in alphabets with three or more letters; that is, there are 
arbitrarily long square-free words. 

3. Cubes are avoidable over two letter alphabets. 

4. Although squares are avoidable in three letter alphabets, Abelian squares are
unavoidable. Every word of length ≥ 8 over V = {a, b, c} contains a subword of the form
xx^p, x ∈ V⁺, where x^p is a permutation of x.

5. Abelian squares are avoidable in alphabets with four or more letters. 

6. It is decidable (by the so-called Makanin’s algorithm) whether or not a system S of 
word equations and inequalities has a solution. 

7. It is decidable whether or not a given finite language is a code. 

8. Every code C satisfies the inequality ci(C) ≤ 1.

9. If a language C = {w_1, ..., w_n} over V is not a code then, according to the so-called
defect theorem, the algebraic structure of C* can be simulated by an alphabet with at
most n − 1 letters: the smallest free submonoid of V* containing C is generated by at
most n − 1 words.

10. The following three conditions are equivalent for any two words u and v: 

• {u, v} is not a code; 

• u and v are powers of the same word; 

• uv = vu. 

(This is a corollary to Fact 9.) 

11. For every word w ∈ V⁺, there are a unique shortest word ρ(w) and an integer
n ≥ 1 such that w = (ρ(w))^n. (The word ρ(w) is called the primitive root of w.)

12. Lyndon’s theorem: If uv = vw with u, v, w ∈ V*, then there exist words x, y ∈ V*
and a number n ≥ 0 such that u = xy, w = yx and v = (xy)^n x = x(yx)^n.

13. If uv = vu with u, v ∈ V⁺, then ρ(u) = ρ(v) and, consequently, u and v are powers
of the same word. This is a corollary to Lyndon’s theorem.

14. Assume that words u^m and v^n have a common prefix or suffix of length |u| + |v| − d,
where u, v ∈ V⁺, m, n ≥ 1 and d = gcd(|u|, |v|). Then ρ(u) = ρ(v) and |ρ(u)| ≤ d. Thus,
if d = 1 then u and v are powers of the same letter.

15. If u^m = v^n, where m, n ≥ 1, then u and v are powers of the same word. (This is a
corollary to Fact 14.)

16. If u^m v^n = w^p, where m, n, p ≥ 2, then ρ(u) = ρ(v) = ρ(w).
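
Several of these facts can be checked mechanically on small words. The following Python sketch is illustrative only: the helper names primitive_root, commute, and is_square_free are hypothetical and do not come from the Handbook. It computes the primitive root of Fact 11 by trying every divisor of the word length, tests the commutation condition of Facts 10 and 13, and searches a word for a subword of the pattern “xx”.

```python
def primitive_root(w: str) -> tuple[str, int]:
    """Return (root, n) with w == root * n and root as short as possible (Fact 11)."""
    if not w:
        raise ValueError("the primitive root is defined only for nonempty words")
    for length in range(1, len(w) + 1):
        if len(w) % length == 0:
            n = len(w) // length
            if w[:length] * n == w:
                return w[:length], n
    raise AssertionError("unreachable: every word is the first power of itself")


def commute(u: str, v: str) -> bool:
    """True when uv = vu (the third condition of Fact 10)."""
    return u + v == v + u


def is_square_free(w: str) -> bool:
    """True when no subword of w has the pattern xx with x nonempty."""
    n = len(w)
    for start in range(n):
        for half in range(1, (n - start) // 2 + 1):
            if w[start:start + half] == w[start + half:start + 2 * half]:
                return False
    return True


if __name__ == "__main__":
    print(primitive_root("abaabaaba"))
    print(commute("abab", "ab"), commute("ab", "ba"))
    print([w for w in ("aba", "bab", "abab", "abaa") if is_square_free(w)])
```

The brute-force search in is_square_free takes cubic time in the length of the word, which is adequate for experiments but is not meant as an efficient algorithm.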

Examples: 

1. In the alphabet V = {a, b}, the only square-free three-letter words are aba and bab. 
The two possible extensions of aba by one letter are abaa , which contains the square aa, 
and abab , which is a square. Similarly, both extensions of bab by one letter contain a 
square. Thus, squares are unavoidable in words of length ≥ 4 over two-letter alphabets.

2. All solutions for the system xaba = abax, xx ≠ x, x ≠ aba, over the alphabet
V = {a, b} are (by the corollary to Lyndon’s theorem) of the form x = (aba)^n, n ≥ 2.
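
Fact 7 asserts that the code property is decidable for finite languages without naming a procedure. One standard procedure is the Sardinas-Patterson test, sketched below in Python as an assumption about how the check might be carried out; the function names left_quotient, is_code, and code_indicator are hypothetical. The test repeatedly forms sets of “dangling suffixes” and reports a failure of unique decodability exactly when the empty word appears. The last function evaluates the code indicator of a language, so that the inequality of Fact 8 can be checked.

```python
from fractions import Fraction


def left_quotient(prefixes, words):
    """All w with p + w == word for some p in prefixes and some word in words."""
    return {word[len(p):] for p in prefixes for word in words if word.startswith(p)}


def is_code(language):
    """Sardinas-Patterson test for a finite language given as an iterable of strings."""
    code = set(language)
    if not code or "" in code:
        return False  # a code is a nonempty subset of V+
    dangling = left_quotient(code, code) - {""}
    seen = set()
    while dangling and frozenset(dangling) not in seen:
        if "" in dangling:
            return False  # some word has two different factorizations
        seen.add(frozenset(dangling))
        dangling = left_quotient(code, dangling) | left_quotient(dangling, code)
    return True  # the suffix sets died out or started repeating


def code_indicator(language, alphabet_size):
    """Sum of |V|^(-|w|) over the words w of the language, as an exact fraction."""
    return sum(Fraction(1, alphabet_size ** len(w)) for w in set(language))


if __name__ == "__main__":
    for c in ({"0", "01", "11"}, {"a", "ab", "ba"}, {"0", "1", "00"}):
        print(sorted(c), is_code(c), code_indicator(c, 2))
```

The language {a, ab, ba} has code indicator 1 over a two-letter alphabet but is not a code (aba = a·ba = ab·a), so the inequality of Fact 8 is necessary but not sufficient.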


16.4 ALGORITHMIC COMPLEXITY

The “complexity of an algorithm” has come to mean, most often, a measure of the 
computational effort or cost of execution, relative to the “size” of the problem. Other 
factors that may affect this kind of complexity are the characteristics of the particu- 
lar input and the values returned by random number generators. The most common 
complexity measure is running time, but other measures, such as space utilized and 
number of comparisons are sometimes used. Another view of complexity focuses on the 
complicatedness of the algorithm, rather than on the effort needed to execute it. 


16.4.1 OVERVIEW OF COMPLEXITY 

To simplify discussion, it is assumed that every function and algorithm under consider- 
ation here has one argument. (Everything is easily generalized to multivariate functions 
by regarding the list of arguments as an n-tuple.) 


Definitions: 

A function f: N → N is asymptotic to a function g: N → N if lim_{n→∞} f(n)/g(n) = 1.
Notation: f(n) ~ g(n). (See §1.3.3.) 

The input size of the argument of an algorithm is either its numeric value or the 
number of bits required to specify a value of that argument. 

A (cost-based) complexity measure for an algorithm is any of several different
asymptotic measures of cost or difficulty in running that algorithm, relative to the
input size. It is given in big-O notation (or sometimes in Θ-notation). (See §1.3.3.)

A time-complexity measure of an algorithm is a big-O expression for the number
of operations or the running time needed for a complete execution of that algorithm,
represented as a function of the size of the input.

A space-complexity measure of an algorithm is a big-O expression for the amount
of computational space needed in the execution of that algorithm, represented as a
function of the size of the input.

An algorithm runs in polynomial time if its time-complexity is dominated by a poly- 
nomial. 

An algorithm runs in polynomial space if its space-complexity is dominated by a 
polynomial. 

A Kolmogorov-Chaitin complexity measure of an algorithm is a measure based 
on the number of instructions of the algorithm, which is taken as an estimate of the 
logical complicatedness. 

The time-complexity of a computable function is the minimum time-complexity taken 
over all algorithms that compute the function. 

The parallel time-complexity of a computable function is the minimum time-com- 
plexity taken over all parallel algorithms that compute the function. 

The space-complexity of a computable function is the minimum space-complexity 
taken over all algorithms that compute the function. 

A decision function is a function on a countably infinite domain that decides whether 
an object is in some specified subset of that domain. 

A computable decision function is in class P ( polynomial ) if its time-complexity is 
polynomial. 

A computable decision function is in class NP ( nondeterministic polynomial ) if its 
parallel time-complexity is polynomial. 

A function g reduces a decision function h to a decision function f if h = f ∘ g.

A computable decision function f is NP-hard if every decision function in class NP
can be reduced to f by a polynomial-time function.

A computable decision function is NP-complete if it is NP-hard and in class NP.

A tractable problem is a set membership problem with a decision function in class P . 

Facts: 

1. The previous definitions can be rephrased in terms of problems and algorithms:

• a problem is in class P (or tractable) if it can be solved by an algorithm that
runs in polynomial time;

• a problem is in class NP if, given a tentative solution (obtained by any means),
it is possible to check that the solution is correct in polynomial time;

• a problem is NP-complete if it is in class NP and NP-hard.

2. When considering whether a given problem belongs to P or NP, and whether it
might be NP-complete, it is helpful to rewrite the problem, or an associated problem,
as a decision problem (which has a yes/no answer) because decision problems have
been easier to characterize and classify than general problems. For example, see the
description of the traveling salesman problem in Example 3 in this section.

3. Time-complexity of sorting algorithms is typically measured according to the number
of comparisons needed.

4. The words good, efficient, and feasible are commonly used interchangeably to mean
polynomial-time.

5. Additive and multiplicative constants that are ignored in big-O analysis of an
algorithm can sometimes be too large for practical application.

6. That a problem belongs to P does not necessarily imply that it can be solved in a 
practical amount of time, since the polynomial bound of its complexity can be of high 
degree. Fortunately, however, for most problems in P arising in practical applications, 
the polynomial bound is of relatively small degree. 

7. Belonging to class NP means that a solution can be checked in polynomial time,
but not necessarily found in polynomial time.

8. When a problem is in class NP, it may be possible to solve the problem for cases
arising in practical applications in a reasonable amount of time, even though there are
other cases for which this is not true. Also, such problems can often be attacked using
approximation algorithms which do not produce the exact solution, but instead produce
a solution guaranteed to be close in some precise sense to the actual solution sought.

9. Every problem in class P is in class NP.

10. It often requires only a small change to transform a problem in class P to one that
is NP-complete. For example, the first four problems in Example 2 (Euler graph, edge cover,
linear Diophantine equation, 2-satisfiability) are in class P, but the similar first four
problems in Example 3 (Hamilton graph, vertex cover, quadratic Diophantine equation,
3-satisfiability), each of which results from seemingly small changes in the respective
problem from class P, are NP-complete.

11. To show that a problem in class NP is NP-complete, it suffices to transform (in a
specific way, in polynomial time) a problem already known to be NP-complete into that
problem. This is often much easier than showing directly that the problem is
NP-complete. See [GaJo78] for details.

12. If there is an NP-hard problem that belongs to P, then P = NP. 

13. Unless P = NP, not all NP problems are NP-complete. (See Example 4 for such a problem.)

14. Deciding whether P = NP is the outstanding problem in the theory of computational
complexity. It is the common belief that P ≠ NP, based on an extensive search
for polynomial-time solutions to various NP problems.

15. The first problem to be shown to be NP-complete was the satisfiability problem
(Example 3). That the satisfiability problem is NP-complete is called Cook’s theorem,
after Steven A. Cook, who discovered it in 1971. [Co71]

16. In 1972 Richard Karp proved that the traveling salesman problem (TSP) and
many other problems are NP-complete. [Ka72]

17. Currently, over 2500 problems (in many areas, including mathematics, computer
science, operations research, physics, biology) are known to be NP-complete. Further
information on NP-complete problems can be found in the “NP-completeness column:
an ongoing guide”, authored by David S. Johnson, in the Journal of Algorithms. See
[Jo81] for the first such column.

18. Extensive information on NP-completeness (methods of proof, examples, etc.) can
be found in [At99], [GaJo78], [Tu97], and [va90].

19. A more formal approach to complexity, given in terms of Turing machines, appears
in §16.5.
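
The distinction drawn in Fact 7 between checking and finding a solution can be made concrete with the Hamilton graph problem. In the Python sketch below, which is an illustration under stated assumptions rather than a method given in the Handbook, a graph is assumed to have vertices 0, ..., n − 1 and an edge list; the names is_hamilton_circuit and has_hamilton_circuit are hypothetical. The first function checks a proposed circuit in time polynomial in the size of the graph, whereas the second resorts to trying all (n − 1)! orderings.

```python
from itertools import permutations


def is_hamilton_circuit(n, edges, tour):
    """Polynomial-time check that `tour` visits each vertex once along edges."""
    if n < 3 or len(tour) != n or set(tour) != set(range(n)):
        return False
    adjacent = {frozenset(e) for e in edges}
    return all(
        frozenset((tour[i], tour[(i + 1) % n])) in adjacent for i in range(n)
    )


def has_hamilton_circuit(n, edges):
    """Exhaustive search: fix vertex 0 first and try every ordering of the rest."""
    return any(
        is_hamilton_circuit(n, edges, (0,) + rest)
        for rest in permutations(range(1, n))
    )


if __name__ == "__main__":
    square = [(0, 1), (1, 2), (2, 3), (3, 0)]
    star = [(0, 1), (0, 2), (0, 3)]
    print(is_hamilton_circuit(4, square, [0, 1, 2, 3]))
    print(has_hamilton_circuit(4, square), has_hamilton_circuit(4, star))
```

No polynomial-time replacement for the exhaustive search is known; the checker alone is what places the problem in class NP.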


Examples: 

1. The following table gives some input size parameters for different problem
types:

problem type                typical input size parameters
database sorting            number of records
graph algorithms            number of vertices and/or number of edges
arithmetic computation      numbers of digits in the numerals
convex hull construction    number of points


2. The following problems are in class P:

• Euler graph: given a graph, determine whether the graph has an Euler circuit;

• edge cover: given a graph G and positive integer n, determine whether there is a
subset E of edges of G with |E| ≤ n and every vertex of G an endpoint of an
edge in E;

• linear Diophantine equation: given positive integers a, b, c, determine whether
ax + by = c has a solution in positive integers x and y;

• 2-satisfiability: given a Boolean expression in conjunctive normal form in which
each sum contains only two variables, determine whether the expression is
“satisfiable” (i.e., there is an assignment of 0 and 1 to the variables such that the
expression has value 1);

• circuits: given a graph G and positive integer n, determine whether there is a
subset E of edges of G with |E| ≤ n such that each circuit in G contains an
edge in E;

• linear programming: maximize cx subject to Ax ≤ b where A is a given q × n
matrix, c is a given row vector of length n, and b is a given column vector of
length q (see §15.1.1).

3. The following problems are NP-complete:

• Hamilton graph: given a graph, determine whether the graph has a Hamilton
circuit;

• vertex cover: given a graph G and positive integer n, determine whether there
is a subset V of vertices of G with |V| ≤ n with every edge of G having an
endpoint in V;

• quadratic Diophantine equation: given positive integers a, b, c, determine whether
the equation ax^2 + by = c has a solution in positive integers x, y;

• 3-satisfiability: given a Boolean expression in conjunctive normal form in which
each sum contains only three variables, determine whether the expression is
“satisfiable” (i.e., there is an assignment of 0 and 1 to the variables such that
the expression has value 1);

• satisfiability: given a Boolean expression in conjunctive normal form, determine
whether the expression is “satisfiable” (i.e., there is an assignment of 0 and 1
to the variables such that the expression has value 1) (see Fact 15);

• traveling salesman problem: given a weighted graph and positive number k,
determine whether there is a Hamilton circuit of weight at most k (see §10.7.1);

• independent vertex set: given a graph G and a positive integer n, determine
whether G contains an independent vertex set of size at least n;

• knapsack problem: given a set S, values a_i and b_i for each i ∈ S, and numbers a
and b, determine whether there is a subset T ⊆ S such that Σ_{i∈T} a_i ≤ a and
Σ_{i∈T} b_i ≥ b (see §15.3.1);

• bin packing problem: given k bins (each of capacity c) and a collection of weights,
determine whether the weights can be placed in the bins so that no bin has its
capacity exceeded (§15.3.2);

• 3-coloring: given a graph G, determine whether its vertices can be properly colored
with 3 colors;

• clique problem: given a graph G and positive integer n, determine whether G has
a clique of size at least n;

• dominating set: given a graph G and positive integer n, determine whether G
has a dominating set of size at most n;

• subgraph isomorphism: given two graphs G and H, determine whether G contains a
subgraph isomorphic to H.

4. The following problem is in class NP (indeed, in class P) and hence is not
NP-complete unless P = NP: given vertices v, w in graph G, determine whether v and w
are joined by a path in G.
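
Three of the problems in Example 3 (independent vertex set, clique, and vertex cover) can be transformed into one another in polynomial time, which is the kind of transformation used in NP-completeness proofs. The Python sketch below is an illustration only; the function names are hypothetical, graphs are assumed to have vertices 0, ..., n − 1, and the clique solver is a deliberately naive exhaustive search. A set of vertices is independent in G exactly when it is a clique in the complement of G, and G has a vertex cover of size at most k exactly when it has an independent set of size at least n − k.

```python
from itertools import combinations


def complement(n, edges):
    """Edges of the complement graph: computable in O(n^2) time."""
    present = {frozenset(e) for e in edges}
    return [
        (u, v) for u, v in combinations(range(n), 2)
        if frozenset((u, v)) not in present
    ]


def has_clique(n, edges, k):
    """Exhaustive (exponential-time) search for k mutually adjacent vertices."""
    present = {frozenset(e) for e in edges}
    return any(
        all(frozenset(pair) in present for pair in combinations(subset, 2))
        for subset in combinations(range(n), k)
    )


def has_independent_set(n, edges, k):
    """Reduce independent vertex set to the clique problem on the complement."""
    return has_clique(n, complement(n, edges), k)


def has_vertex_cover(n, edges, k):
    """Reduce vertex cover (size <= k) to independent vertex set (size >= n - k)."""
    return has_independent_set(n, edges, max(n - k, 0))


if __name__ == "__main__":
    path = [(0, 1), (1, 2)]
    print(has_independent_set(3, path, 2), has_vertex_cover(3, path, 1))
```

Only the transformations (complement and the change of parameter) run in polynomial time; the solver they feed remains exponential.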


16.4.2 WORST-CASE AND AVERAGE-CASE ANALYSIS 

Definitions: 

A worst-case complexity measure of an algorithm is based on the maximum
computational cost for any input of that size. It is usually expressed in big-O asymptotic
notation (or sometimes Θ-notation) as a formula based on the input size variables.

An average-case complexity measure of an algorithm is based on the expected 
computational cost over a random distribution of its inputs of a given size. 
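
The two measures can be compared directly on a small algorithm by counting key comparisons. The Python sketch below is an illustration only, with hypothetical function names; it instruments insertion sort and, assuming that all permutations of n distinct keys are equally likely, computes the exact average by enumerating every permutation, which is feasible only for very small n.

```python
from fractions import Fraction
from itertools import permutations


def insertion_sort_comparisons(keys):
    """Sort a copy of `keys`; return (sorted list, number of key comparisons)."""
    a = list(keys)
    comparisons = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0:
            comparisons += 1          # one comparison of key against a[i]
            if a[i] <= key:
                break
            a[i + 1] = a[i]           # shift the larger element right
            i -= 1
        a[i + 1] = key
    return a, comparisons


def worst_and_average(n):
    """Maximum and mean comparison counts over all permutations of 0..n-1."""
    counts = [insertion_sort_comparisons(p)[1] for p in permutations(range(n))]
    return max(counts), Fraction(sum(counts), len(counts))


if __name__ == "__main__":
    print(insertion_sort_comparisons(range(10))[1])         # already sorted
    print(insertion_sort_comparisons(range(9, -1, -1))[1])  # reverse sorted
    print(worst_and_average(3))
```

An already sorted input of n keys costs n − 1 comparisons and a reverse-sorted input costs n(n − 1)/2, so the input distribution assumed in an average-case analysis matters.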


Facts: 

1. Algorithmic analysis of deterministic algorithms often assumes a uniform random 
distribution of the possible inputs, when the actual distribution is unknown. 

2. For sorting algorithms, an average-case analysis may assume that all input
permutations of the keys to be sorted are equally likely. In practice, however, some
permutations may be far more likely than others, e.g., already sorted, almost sorted,
or reverse sorted.

3. The input size measures for average-case analysis are usually the same as for
worst-case analysis.

Examples:

1. The following table gives the worst-case running times of some sorting algorithms
[CoLeRi90], where the input size parameter n is the number of records:

sorting method      worst-case complexity
insertion sort      O(n^2)
selection sort      O(n^2)
bubble sort         O(n^2)
heapsort            O(n log n)
quicksort           O(n^2)
mergesort           O(n log n)


2. The following table gives the worst-case running times of some graph algorithms,
based on input size parameters |V| and |E|, which are the numbers of vertices and edges:

graph algorithm                                worst-case complexity
Kruskal’s MST algorithm                        O(|E| log |V|)
Dijkstra’s shortest-path algorithm
  with linked-list priority queue              O(|V|^2)
Dijkstra’s shortest-path algorithm
  with heap-based priority queue               O(|E| log |V|) [CoEtal90]
Dijkstra’s shortest-path algorithm
  with Fibonacci-heap priority queue           O(|E| + |V| log |V|) [CoEtal90]
Edmonds-Karp max-flow algorithm                O(|V|·|E|^2)


3. The following table gives the worst-case running times of some plane convex hull
algorithms (§13.5.1), based on the number n of points supplied as input:

convex hull algorithm              worst-case complexity
Graham scan                        O(n log n)
Jarvis march (“gift-wrapping”)     O(nh), h = number of corners of the convex hull
QuickHull                          O(n^2)
MergeHull                          O(n log n)


4. The following table gives the average-case running times of some sorting algorithms,
where the input size parameter n is the number of records:

sorting method      average-case complexity
insertion sort      O(n^2)
selection sort      O(n^2)
bubble sort         O(n^2)
heapsort            O(n log n)
quicksort           O(n log n)
mergesort           O(n log n)

Algorithm 1: Randomized quicksort.

procedure randomized-quicksort(A, p, r)

if p < r then
begin
    i := random(p, r)
    exchange A[p] and A[i]
    q := partition(A, p, r)
    randomized-quicksort(A, p, q)
    randomized-quicksort(A, q + 1, r)
end {subarray A[p..r] is now sorted}


5. Randomized quicksort (Algorithm 1) [CoLeRi90]: A subarray from index p to
index r of an array A is sorted, using an external subroutine random(p, r) that generates
a number in the set {p, ..., r} within O(1) worst-case running time. Another external
subroutine partition(A, p, r) rearranges the subarray A[p..r] and returns an index q,
p ≤ q < r, such that for i = p, ..., q, A[i] ≤ A[q] and such that for i = q + 1, ..., r,
A[i] ≥ A[q]; this subroutine runs in Θ(r − p) worst-case time.

To sort n keys, randomized quicksort takes Θ(n^2) time in the worst case (when
unlucky enough to have partition sizes always unbalanced), but only O(n log n) time in
the average case (partition sizes are usually at least a constant fraction of the total).

6. Convex hull: For certain distributions of n points in the plane, the expected
value E[h] of the number of vertices on the convex hull is known. This bound implies
that the average-case running time of Jarvis’s march is an additional factor of n
greater:

distribution                  E[h]             average-case running time
uniform in convex polygon     O(log n)         O(n log n)
uniform in circle             O(n^(1/3))       O(n^(4/3))
normal in plane               O(√(log n))      O(n √(log n))
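
Algorithm 1 can be transcribed into a runnable form. The Python sketch below is one possible rendering and rests on an assumption: the Handbook treats partition as an external subroutine, and here it is filled in with a Hoare-style partition around the key a[p]. That partition returns an index q with p ≤ q < r such that every key in a[p..q] is at most every key in a[q+1..r], which is the property the two recursive calls rely on. Python's random.randint plays the role of random(p, r).

```python
import random


def partition(a, p, r):
    """Hoare-style partition of a[p..r] around the pivot a[p]; requires p < r."""
    pivot = a[p]
    i, j = p - 1, r + 1
    while True:
        j -= 1
        while a[j] > pivot:      # scan leftward for a key <= pivot
            j -= 1
        i += 1
        while a[i] < pivot:      # scan rightward for a key >= pivot
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j             # p <= j < r


def randomized_quicksort(a, p=0, r=None):
    """Sort the subarray a[p..r] in place (the whole list by default)."""
    if r is None:
        r = len(a) - 1
    if p < r:
        i = random.randint(p, r)          # uniform over {p, ..., r}
        a[p], a[i] = a[i], a[p]           # move the random key to the front
        q = partition(a, p, r)
        randomized_quicksort(a, p, q)
        randomized_quicksort(a, q + 1, r)


if __name__ == "__main__":
    keys = [5, 2, 9, 1, 5, 6, 0, 3]
    randomized_quicksort(keys)
    print(keys)
```

Because the pivot is drawn at random, no fixed input forces the quadratic behavior; the worst case arises only from an unlucky sequence of random choices.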


16.4.3 LOWER BOUNDS 

Lower bounds on running times of algorithms are typically given as functions of input
size using Ω-notation (§1.3.3).

Definitions: 

An existential lower bound for an algorithm is a lower bound for its running time 
that holds for at least one input. 

An existential lower bound for a problem is a lower bound for every algorithm that 
could solve that problem. 

A comparison sort is a sorting method that rearranges records based only on
comparisons between keys.

The Euclidean minimum spanning tree (or Euclidean MST) problem has as
input a set of n points in the plane and as output a spanning tree on those points of
minimum total edge-length.

A reduction of a problem A to another problem B is the following sequence of steps:

• the input to problem A is transformed into an input to problem B;

• problem B is solved on the transformed input;

• the output of problem B is transformed back into a solution to problem A for
the original input.

An f(n) time reduction of problem A to problem B is a reduction such that the time
for the three steps together is f(n).

Facts: 

1. For a given model of computation, if problem A has a lower bound of T(n) and it
reduces in f(n) time to problem B, then problem B has a lower bound of T(n) − f(n).

2. Every comparison sort on n records requires Ω(n log n) comparisons in the worst
case.

3. Computing the Euclidean minimum spanning tree on n points takes Ω(n log n) time
in the worst case.

4. Unlike the Euclidean MST problem, most graph problems have no known nontrivial
lower bound. Some graph algorithms, however, have lower bounds on their
implementation.

5. Running Dijkstra’s algorithm (§10.3.2) on a directed graph with |V| vertices takes
Ω(|V| log |V|) time in the worst case.

6. Finding the vertices for the convex hull of n points in the plane, in any order, takes
Ω(n log n) time in the worst case.

7. Constructing the Voronoi diagram (§13.5.2) on n points in the plane takes
Ω(n log n) time in the worst case.

Examples: 

1. An O(n)-time reduction of sorting to a gift-wrap of a convex hull: Given a set
of n positive numbers {x_1, ..., x_n}, first produce in O(n) time their respective squares
{x_1^2, ..., x_n^2}. Since each point (x_i, x_i^2) lies on the parabola given by y = x^2, the Jarvis
march on the convex hull of the points (x_i, x_i^2) is a list of points, ordered by abscissa.
Sequentially read off the first coordinate of every point of the convex hull in O(n) time,
thereby producing the sorted list of numbers. This implies that finding the gift-wrapped
convex hull of n points requires at least Ω(n log n) − O(n) = Ω(n log n) time.

2. An O(n)-time reduction of sorting numbers to Dijkstra’s algorithm: To sort a list of
n nonnegative numbers {x_1, ..., x_n}, first create a star graph, with vertices {v_0, ..., v_n},
and with an edge (v_0, v_i) weighted x_i, for 1 ≤ i ≤ n. Next designate v_0 as the root vertex,
and apply Dijkstra’s algorithm. Dijkstra’s algorithm proceeds according to increasing
order of the edge weights, which yields the sorted order. This implies that Dijkstra’s
algorithm requires at least Ω(n log n) − O(n) = Ω(n log n) time.

3. An O(n)-time reduction of sorting numbers to a Voronoi diagram: To sort n numbers
{x_1, ..., x_n}, create n points {(x_i, 0) | 1 ≤ i ≤ n} in the Euclidean plane. The
Voronoi diagram consists of the n − 1 bisectors separating adjacent points (x_i, 0) on the
line y = 0. Since the Voronoi diagram description includes ordering of Voronoi edges
around each Voronoi vertex, the Voronoi diagram gives the ordering of the bisectors
and hence the n numbers. This implies that the Voronoi diagram requires at least
Ω(n log n) − O(n) = Ω(n log n) time.


© 2000 by CRC Press LLC 



4. An O(n)-time reduction of sorting numbers to Euclidean MST: To sort n numbers
{x_1, ..., x_n}, create n points {(x_i, 0) | 1 ≤ i ≤ n} in the Euclidean plane. The
Euclidean MST of this set contains an edge between points (x_i, 0) and (x_j, 0) if and only
if the numbers x_i and x_j are consecutive in the sorted list of numbers. The Euclidean
MST is easily converted back to a sorted list of numbers in O(n) time. This implies that
Euclidean MST requires at least Ω(n log n) − O(n) = Ω(n log n) time.
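
The reduction of sorting to Dijkstra’s algorithm described in Example 2 can be written out as the three steps of a reduction. The Python sketch below is an illustration under stated assumptions: the graph is stored as a dictionary of adjacency lists, the priority queue is Python's heapq binary heap, and the names dijkstra_order and sort_via_dijkstra are hypothetical. The order in which Dijkstra’s algorithm finalizes the leaves of the star graph is the sorted order of the edge weights.

```python
import heapq


def dijkstra_order(adjacency, source):
    """Return the vertices in the order in which their distances become final."""
    distance = {source: 0}
    finalized = set()
    order = []
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in finalized:
            continue                      # stale queue entry
        finalized.add(u)
        order.append(u)
        for v, weight in adjacency.get(u, []):
            candidate = d + weight
            if v not in distance or candidate < distance[v]:
                distance[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return order


def sort_via_dijkstra(numbers):
    """Sort nonnegative numbers by running Dijkstra on a star graph."""
    # Step 1: build the star graph with center 0 and leaves 1..n in O(n) time.
    adjacency = {0: [(i, x) for i, x in enumerate(numbers, start=1)]}
    # Step 2: solve the shortest-path problem from the center.
    order = dijkstra_order(adjacency, 0)
    # Step 3: read the weights back in finalization order in O(n) time.
    return [numbers[v - 1] for v in order if v != 0]


if __name__ == "__main__":
    print(sort_via_dijkstra([3, 1, 2]))
```

Steps 1 and 3 take linear time, so any algorithm for step 2 that beat the sorting lower bound would contradict Fact 2 of this subsection.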


16.5 COMPLEXITY CLASSES

From a formal viewpoint, complexity theory is concerned with classifying the difficulty 
of testing for membership in various languages. This means deciding whether any given 
string is in the language. The general application of complexity theory is achieved by 
encoding decision problems on natural topics such as graph coloring and finding integer 
solutions to equations as set membership problems. 


16.5.1 ORACLES AND THE POLYNOMIAL HIERARCHY 

Throughout this section, whenever the alphabet is unspecified, it may be assumed to 
be the binary set {0, 1}. Also, throughout this section a Turing machine is assumed to 
have among its states a unique acceptance state q_A and a unique rejection state q_R. All
other states continue the computation. 

Definitions: 

A language over an alphabet is a set of strings on that alphabet (see §16.1.1). 

A nondeterministic Turing machine is a 5-tuple M = (K, s, h, Σ, Δ) otherwise
like a deterministic Turing machine, except that the transition function Δ maps each
state-symbol pair (q, b) to a set of state-symbol-direction triples.

An oracle for a language L is a special computational state to which a machine
presents a string w, which switches to special state Y (“yes”) if w ∈ L and to special
state N (“no”) if w ∉ L.

An oracle Turing machine M is a 6-tuple M = (K, s, h, Σ, δ or Δ, L), equipped with
an oracle for language L and with a special second tape on which it can write a string
over the alphabet of language L (which might be different from Σ). Aside from oracle
steps, it is a Turing machine.

A Turing machine M accepts string w if there exists a computational path from the
starting configuration with input w to the acceptance state q_A.

A Turing machine M rejects string w if it does not accept w. (Either M halts in
rejection state q_R or does not halt.)

The language accepted by a Turing machine M is the set of all the strings it
accepts. It is denoted L(M).

The Turing machine M decides the language L(M) if it always halts, even for input
strings not in L(M).

The time Time_M(w) taken by Turing machine M on input word w is the number of
steps on the shortest accepting path if M accepts w, the number of steps on the longest
rejecting path if M rejects w but always halts, and +∞ otherwise.

The space Space_M(w) taken by Turing machine M on input word w is the maximum
number of different tape cells on which M writes during the computation, possibly +∞.

The time complexity of a Turing machine M is the function t defined by t(n) =
max { Time_M(x) | |x| = n }.

A Turing machine M has polynomial time complexity if there exists a polynomial
function p(n) such that Time_M(x) ≤ p(n) for every input x of length n, n = 0, 1, ....

A Turing machine M has polynomial space complexity if there exists a polynomial
function p(n) such that Space_M(x) ≤ p(n) for every input x of length n, n = 0, 1, ....

The complexity class P contains every language that can be decided by a deterministic 
TM with polynomial time complexity. 

The complexity class PSPACE contains every language that can be decided by a 
deterministic TM with polynomial space complexity. 

The complexity class NP contains every language that can be decided by a nonde- 
terministic TM with polynomial time complexity. 

For any language L, the complexity class P^L contains every language that is decided
in polynomial time by a deterministic TM with oracle L.

For any language L, the complexity class NP^L contains every language that is decided
in polynomial time by a nondeterministic TM with oracle L.

For any class C of languages, the complexity class P^C contains every language that
is decidable in polynomial time by a deterministic TM with oracle L ∈ C.

For any class C of languages, the complexity class NP^C contains every language that
is decidable in polynomial time by a nondeterministic TM with oracle L ∈ C.

The complexity class Σ_k^p is defined recursively:

Σ_k^p = P                   if k = 0
Σ_k^p = NP^(Σ_{k−1}^p)      if k ≥ 1

The polynomial hierarchy PH is the collection comprising every language A for
which there exists an n such that A ∈ Σ_n^p.

The polynomial hierarchy is said to collapse (to the ith rank) if PH = Σ_i^p, for some
i ≥ 0.

Complexity class coNP = Π_1^p.

For n ≥ 0, the complexity class Π_n^p contains every language A such that the
complement of A is in Σ_n^p.

For any class C, the complexity class P^C = {L | there is B ∈ C such that L ≤_T^p B}.

Complexity class Δ_0^p = P. For n ∈ Z⁺, the class Δ_n^p = P^(Σ_{n−1}^p).

The language A is sparse if there exists a polynomial p(n) such that for every n ∈ N,
there are at most p(n) elements of length n in A.

Facts:


1. The following identity for complexity classes holds:

Σ_0^p = Π_0^p = Δ_0^p = Δ_1^p = P.

2. For n ≥ 0, the following relationships hold:

Δ_n^p ⊆ Σ_n^p ∩ Π_n^p ⊆ Σ_n^p ∪ Π_n^p ⊆ Δ_{n+1}^p.

3. The polynomial hierarchy PH is a subset of the complexity class PSPACE.

4. If PH = PSPACE, then the polynomial hierarchy collapses.

5. Downward separation: If Σ_n^p = Σ_{n+1}^p, then PH = Σ_n^p. In particular, P = NP if and
only if P = PH.

6. Downward separation: If Σ_n^p = Π_n^p, then PH = Σ_n^p.

7. Complexity class NP = Σ_1^p.

8. The complexity class Σ_n^p is closed under union and intersection, for all n ≥ 0.

9. P^(NP ∩ coNP) = NP ∩ coNP. More generally, P^(Σ_n^p ∩ Π_n^p) = Σ_n^p ∩ Π_n^p and
P^(Δ_n^p) = Δ_n^p, for all n ≥ 0.

10. Upward separation: Nondeterministic exponential time (∪_{c>0} NTIME[2^(cn)]) is
equal to deterministic exponential time (∪_{c>0} DTIME[2^(cn)]) if and only if NP − P
contains no sparse sets.

11. Succinct certificates: For every language in NP there is a proof scheme in which 
each member (and only members) has a polynomial-size “proof” of membership that 
can be checked in deterministic polynomial time. Such a short membership proof is 
sometimes called a succinct certificate. 
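
For satisfiability, a succinct certificate is simply a satisfying assignment, and a falsifying assignment plays the same role for showing that a proposition is not a tautology. The Python sketch below is an illustration only: a proposition on n variables is assumed to be given as a Python function of n Boolean arguments, and the names satisfies, is_satisfiable, and is_tautology are hypothetical. Checking one assignment costs a single evaluation, whereas the two search routines may examine all 2^n assignments.

```python
from itertools import product


def satisfies(proposition, assignment):
    """Check a certificate: one evaluation of the proposition."""
    return bool(proposition(*assignment))


def is_satisfiable(proposition, n):
    """Exhaustive search for a satisfying assignment (an NP-type question)."""
    return any(
        satisfies(proposition, a) for a in product((False, True), repeat=n)
    )


def is_tautology(proposition, n):
    """Exhaustive check that every assignment satisfies (a coNP-type question)."""
    return all(
        satisfies(proposition, a) for a in product((False, True), repeat=n)
    )


def exclusive_or(x, y):
    return (x or y) and not (x and y)


if __name__ == "__main__":
    print(satisfies(exclusive_or, (True, False)))
    print(is_satisfiable(exclusive_or, 2), is_tautology(exclusive_or, 2))
```

A proposition is a tautology exactly when its negation is unsatisfiable, which is the complementation that relates NP and coNP.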


Examples: 

1. Logical proposition problems: The problem of deciding whether a particular as- 
signment of TRUE-FALSE values to the variables satisfies a logical proposition is in P. 
Deciding whether a proposition has an assignment that satisfies it is in NP. Deciding 
whether all assignments satisfy it (i.e., whether the proposition is a tautology) is in 
coNP. 

2. Graph isomorphism problems: Deciding whether a given vertex bijection between 
two graphs realizes a graph isomorphism is in class P. Deciding whether two graphs are 
isomorphic is in class NP. 

3. Graph coloring problems: Deciding whether an assignment of colors from a set of 
three colors to the vertices of a graph is a proper coloring is in P. Deciding whether a 
graph has a proper 3-coloring is in NP. 

4. Unique maximum clique problem: Define UMC to be the set of graphs G with a
clique U ⊆ V_G such that every other clique is strictly smaller than U. Then UMC is in
the class Δ_2^p = P^NP.

5. To prove by succinct certificate that a given graph has some clique of size at least k,
one can provide a list of k vertices that are mutually adjacent. (The mutual
adjacency condition for the k vertices can be verified in polynomial time.)

16.5.2 REDUCIBILITY AND NP-COMPLETENESS

Definitions:

The language A over alphabet Σ is polynomial-time reducible (or m-p-reducible)
to the language B, denoted A ≤_m^p B, if there exists a polynomial-time computable
function f such that x ∈ A if and only if f(x) ∈ B, for each x ∈ Σ*.

The language A is NP-hard if every language in class NP is polynomial-time reducible
to A.

The language A is NP-complete if A is NP-hard and A ∈ NP.

The language A is C-hard if C is a class of languages that represent computational
problems and every language in class C is polynomial-time reducible to A.

The language A is C-complete if A is C-hard and A ∈ C.

The language A is Turing-p-reducible to the language B, denoted A ≤_T^p B, if there is
a deterministic oracle TM M^B that decides language A in polynomial time.

The language A is C-Turing-p-hard if C is a class of languages that represent
computational problems and every language in class C is Turing-p-reducible to A.

The language A is C-Turing-p-complete if A is C-Turing-p-hard and A ∈ C.

Facts: 

1. For most NP-complete problems, showing membership in NP is easy. 

2. For integer linear programming, however, it is easy to show NP-hardness, but show- 
ing membership in NP is nontrivial. 

3. Polynomial-time reducibility is also called Karp reducibility after R. M. Karp. 

4. Turing-p-reducibility is also called Cook reducibility after S. A. Cook. 

5. The complement of any NP-complete problem is coNP-complete. 

6. If A is polynomial-time reducible to B, then A is Turing-p-reducible to B.

7. If A ≤_m^p B and B ≤_m^p C, then A ≤_m^p C.

8. If A ≤_T^p B and B ≤_T^p C, then A ≤_T^p C.

9. Downward closure: If A ∈ Σ_n^p and B ≤_m^p A, then B ∈ Σ_n^p, for every n ≥ 1. In
particular, if any NP-complete set is in P, then P = NP.

10. Karp-Lipton theorem: If there is a sparse NP-≤_T^p-hard set, then PH = Σ_2^p.

11. If there is a sparse NP-≤_T^p-complete set, then PH = Δ_2^p.

12. Mahaney’s theorem: If there is a sparse NP-hard (or NP-complete) set, then
P = NP.

13. Ladner’s theorem: If P ≠ NP, then there exists a set in NP − P that is not
NP-complete.

14. A large catalog of NP-complete problems appears in [GaJo79]. A few of the most 
commonly cited appear in §16.4.1. 
Examples:

1. For examples of NP-complete problems, see §16.4.1 Example 3. 

2. Quantified Boolean formulas: Let QBF be the class of true statements of the form

$$(\exists x_1)(\forall x_2)(\exists x_3)(\forall x_4) \cdots (Q_i x_i)[F(x_1, x_2, \ldots, x_i)],$$

where F is a quantifier-free formula over the Boolean variables $x_1, \ldots, x_i$ and where $Q_i$
is ∃ if i is odd and ∀ if i is even. Then QBF is PSPACE-complete.

3. Tautologies problem : The classic coNP-complete language is the set TAUTOLOGY 
of all logical propositions that are satisfied by every assignment of logical values to its 
variables. 

4. Graph isomorphism problem: It is not known whether the set GI of isomorphic 
graph pairs is in coNP or whether GI is NP-complete, though it is known that GI is 
NP-complete only if the polynomial hierarchy collapses. 


16.5.3 PROBABILISTIC TURING MACHINES 
Definitions: 

A probabilistic Turing machine is a nondeterministic Turing machine M with exactly
two choices at each step. Each such choice occurs with probability $\frac{1}{2}$ and is
independent of all previous choices.

The acceptance probability $P_M(w)$ that a probabilistic Turing machine M accepts input
word w is the sum of the probabilities over all acceptance paths of computation.

A probabilistic Turing machine M accepts language L with one-sided error if

$P_M(w) > \frac{1}{2}$ if w ∈ L
$P_M(w) = 0$ if w ∉ L.

A probabilistic Turing machine M accepts language L with two-sided error if

$P_M(w) > \frac{1}{2}$ if w ∈ L
$P_M(w) \le \frac{1}{2}$ if w ∉ L.

A probabilistic Turing machine M accepts language L with bounded two-sided
error if for some ε > 0

$P_M(w) > \frac{1}{2} + \varepsilon$ if w ∈ L
$P_M(w) < \frac{1}{2} - \varepsilon$ if w ∉ L.

The complexity class RP of random polynomial-time languages is the class of 
languages that are decided by Turing machines with one-sided error in polynomial time. 

The complexity class coRP contains the language A if the complement of A is in RP.

The complexity class ZPP is the intersection RP ∩ coRP.

The complexity class PP of probabilistic polynomial-time languages is the class of
languages that are decided by Turing machines with two-sided error in polynomial time.

The complexity class BPP of bounded-error probabilistic polynomial-time
languages is the class of languages that are decided by Turing machines with bounded
two-sided error in polynomial time.
Facts:

1. ZPP is exactly the class of languages accepted by error-free probabilistic Turing
machines running in expected polynomial time.

2. ZPP = RP ∩ coRP. Each of RP and coRP contains ZPP and is contained in RP ∪ coRP,
and RP ∪ coRP ⊆ BPP ⊆ PP ⊆ PSPACE.

3. RP ⊆ NP ⊆ PP.

4. $P^{ZPP}$ = ZPP; $P^{BPP}$ = BPP.

5. BPP ⊆ $\Sigma_2^p \cap \Pi_2^p$.

6. PH ⊆ $P^{PP}$.

7. PP is closed under all Boolean operations.

8. If NP ⊆ BPP, then BPP = PH and RP = NP.

9. PP is the class of languages L for which some Turing machine M, running in polynomial
time and given access to a fair two-sided coin, has the property that, for each x, x ∈ L
if and only if M accepts x with probability greater than $\frac{1}{2}$.

10. It remains an open question whether BPP, RP, coRP, or ZPP have complete lan- 
guages. 


Examples: 

1. SAT ∈ PP: Consider a probabilistic polynomial-time Turing machine M that,
given a proposition F, immediately flips its coin. If the result is “heads”, then proposition
F is accepted and machine M halts. If “tails”, then the machine, via a series
of coin flips, randomly assigns each variable to be either true or false, and ultimately
accepts F if the resulting assignment satisfies the proposition. Thus, F is accepted with
probability exactly $\frac{1}{2}$ if F is unsatisfiable, but is accepted with probability at least $\frac{1}{2} + \frac{1}{2^{k+1}}$
if F is satisfiable, where k is the number of logical variables in F (at least one of the $2^k$
equally likely assignments satisfies F). Thus, SAT ∈ PP.
This implies that NP ⊆ PP, since the language SAT is NP-complete.

2. MAJORITY-SAT is PP-complete: The language MAJORITY-SAT is the set of 
(quantifier-free) Boolean formulas F such that F is satisfied by more than half of the
possible variable assignments. 

3. PRIMES ∈ ZPP: The language PRIMES consists of the bitstrings that represent
prime numbers when interpreted as binary numerals. If the Extended Riemann
Hypothesis holds, then PRIMES ∈ P.

4. Equality of polynomial products: Given two lists of rational-coefficient polynomials, 
where each polynomial in the lists has been specified by a list of (coefficient, degree) pairs, 
the problem of deciding whether the product of the polynomials in the first list yields 
the same polynomial as the product of the polynomials in the second list is in the class 
coRP. 

Intuitively, this is because if the two products are equal, then they will evaluate to 
the same value on any argument, yet it can be argued that if an argument is chosen in an 
appropriate “random” fashion the products evaluated at that argument will probably 
differ if the product polynomials are not identical. 
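
The random-evaluation argument of Example 4 can be sketched in Python. This is a minimal sketch under stated assumptions, not a procedure given in the handbook: the names `eval_sparse`, `eval_product`, `degree_bound`, and `products_probably_equal`, the sample range of 2d + 1 integer points, and the default number of trials are all illustrative choices. It relies on the fact that a nonzero polynomial of degree at most d has at most d roots, so a point drawn uniformly from 2d + 1 integers exposes a difference with probability greater than 1/2. It also assumes that the degrees are small enough for exact evaluation; a genuinely polynomial-time version for degrees written in binary would need modular arithmetic.

```python
import random
from fractions import Fraction

def eval_sparse(poly, x):
    """Evaluate one polynomial, given as (coefficient, degree) pairs, at x."""
    return sum(Fraction(c) * x ** d for c, d in poly)

def eval_product(polys, x):
    """Evaluate the product of a list of polynomials at x without expanding it."""
    result = Fraction(1)
    for poly in polys:
        result *= eval_sparse(poly, x)
    return result

def degree_bound(polys):
    """Upper bound on the degree of the product: the sum of the largest degrees."""
    return sum(max((d for _, d in poly), default=0) for poly in polys)

def products_probably_equal(first, second, trials=20, rng=random):
    """One-sided test: False is always correct; True may be wrong with small probability."""
    d = max(degree_bound(first), degree_bound(second))
    for _ in range(trials):
        x = rng.randint(0, 2 * d)  # one of 2d + 1 candidate points
        if eval_product(first, x) != eval_product(second, x):
            return False  # a differing value proves the products differ
    return True  # equal at every sampled point

# (x + 1)(x - 1) versus x^2 - 1, and versus x^2 + 1
first = [[(1, 1), (1, 0)], [(1, 1), (-1, 0)]]
print(products_probably_equal(first, [[(1, 2), (-1, 0)]]))
print(products_probably_equal(first, [[(1, 2), (1, 0)]]))
```

Equal products are never rejected, which is the one-sided behavior that places the problem in coRP; unequal products survive all trials with probability below 2^(-trials) under the assumptions above.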
16.6 RANDOMIZED ALGORITHMS


Some general randomization principles for algorithms have many specific applications.
In particular, randomized algorithms from number theory have applications in cryptography
and fingerprinting. Also, randomized algorithms for partitioning, for searching and
sorting, and for graph problems such as mincut and matching, including some heuristics
for NP-complete problems, have applications in testing and in parallel or
distributed environments.


16.6.1 OVERVIEW AND GENERAL PARADIGMS 

Most randomized algorithms follow a few general paradigms that guide the effective use 
of probabilistic strategies. For many further topics not covered here, see the excellent 
survey papers [Ka91] and [We83] and also the textbook [MoRa95]. 

Definition: 

A randomized algorithm is an algorithm that makes random choices during its exe- 
cution. Such random choices can be guided by the output of a random (or, in practice, 
pseudo-random) number generator. 

Facts: 

1. Intuitively, the power of randomization is analogous to the standard game-theoretic 
fact that probabilistic game strategies are substantially more effective than deterministic 
ones. 

2. In the game-theoretic analogy, an algorithm can be regarded as a player, and the
problem to be solved can be regarded as an adversary trying to present the player with
input instances on which the algorithm exhibits worst case performance.

3. If an algorithm is deterministic, then the game-theoretic adversary knows in advance
the entire strategy of the player. Thus, the worst case instances are well defined and
can be presented as input to the algorithm.

4. If an algorithm is probabilistic, then the game-theoretic adversary does not know
in advance the output of the random number generator that guides part of the algorithm’s
choices. In particular, worst case instances under deterministic strategies may
be smoothed out by randomization.

5. Worst case instances of a randomized algorithm occur when the algorithm performs
badly for the overwhelming majority of its probabilistic choices.

6. Many problems have no known deterministic algorithms to match the efficiency of
randomized algorithms. Even for problems for which efficient deterministic algorithms
are known, randomized algorithms are often remarkably easier to understand and implement.

7. Abundance of witnesses paradigm: Deciding whether a given input has a certain
property sometimes reduces to finding a combinatorial object “witnessing” the property.
When the space of all potential witnesses is too large to be searched exhaustively, it
sometimes suffices to inspect a small random sample, selected so that one of the elements
of the sample will be a suitable witness with very high probability.

8. Random sampling: Sometimes a small random sample is indicative of the popula- 
tion as a whole. 

Examples: 

1. Cryptography: Most of public-key cryptography is based on the sharp dichotomy
between the efficiency of deciding whether a number is prime or composite and the 
apparent hardness of actually factoring composite numbers. 

2. Fingerprinting: A large data object is represented by a much smaller “fingerprint” 
such that, with very high probability, distinct objects map to distinct fingerprints. A 
similar strategy is used for “hashing” (§17.4) where large objects are mapped to much 
smaller keys with very low probabilities of collisions. 

3. Testing identities: It is often possible to check if an algebraic expression is identically 
equal to zero by substituting random values for the variables and checking whether the 
expression evaluates to zero. 

4. Symmetry breaking: It is often necessary for a set of distributed or parallel processes
to come collectively to an arbitrary but consistent decision among a set of indistinguishable
possibilities. Randomization provides a method to break such symmetries. It yields
an efficient parallel perfect matching algorithm, as well as
applications to many protocols for distributed environments, to computation in the
presence of errors, and to Byzantine agreement.

5. Load balancing: For problems involving choice between a number of resources, such 
as processors or communication links in parallel or distributed networks, randomization 
can be useful in spreading out the load. 

6. The probabilistic method: The probabilistic method is to demonstrate that a combi- 
natorial object of interest occurs with non-zero probability in a suitably defined probabil- 
ity space. Sometimes the probabilistic method yields efficient algorithmic constructions 
rather than mere existential arguments. 
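
The fingerprinting idea of Example 2 can be sketched in Python. This is a minimal sketch under stated assumptions rather than a method specified in the handbook: the objects are taken to be byte strings read as large integers, the fingerprint is the residue modulo a prime drawn at random, and the names `is_prime`, `random_prime`, `fingerprint`, and `probably_equal`, together with the bound `2**20` and the trial count, are hypothetical choices made for illustration.

```python
import random

def is_prime(q):
    """Trial division; adequate for the small moduli used in this sketch."""
    if q < 2:
        return False
    i = 2
    while i * i <= q:
        if q % i == 0:
            return False
        i += 1
    return True

def random_prime(upper, rng=random):
    """Draw integers uniformly from [2, upper] until one of them is prime."""
    while True:
        q = rng.randint(2, upper)
        if is_prime(q):
            return q

def fingerprint(data, prime):
    """Residue modulo `prime` of the integer encoded by the bytes in `data`."""
    return int.from_bytes(data, "big") % prime

def probably_equal(a, b, trials=5, upper=2**20, rng=random):
    """Compare two byte strings through short fingerprints only."""
    if len(a) != len(b):
        return False
    for _ in range(trials):
        p = random_prime(upper, rng)
        if fingerprint(a, p) != fingerprint(b, p):
            return False  # different fingerprints prove the objects differ
    return True  # identical objects always reach this line

big = b"x" * 999
print(probably_equal(big + b"a", big + b"a"))
print(probably_equal(big + b"a", big + b"b"))
```

Identical objects always produce identical fingerprints, so only the answer "equal" can be wrong. A collision requires every sampled prime to divide the difference of the two encoded integers, which becomes less likely as the range of primes and the number of trials grow.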


16.6.2 LAS VEGAS AND MONTE CARLO ALGORITHMS 

Randomized algorithms are classified into two types — Monte Carlo algorithms and Las 
Vegas algorithms. 

Definitions: 

A Monte Carlo algorithm has bounded running time and produces correct output 
with probability bounded away from zero. 

The success amplification method for a Monte Carlo algorithm is to perform k 
independent runs of the algorithm. 

A Las Vegas algorithm always produces correct output. However, its running time is 
a random variable, whose expectation and variance must be quantified in the analysis 
of the algorithm. 
The success amplification method for a Las Vegas algorithm is to perform $\frac{k}{2}$
independent Las Vegas runs of 2E[T] steps each.

The Monte Carlo to Las Vegas transformation , starting from a Monte Carlo 
algorithm, is the Las Vegas algorithm of repeatedly running that Monte Carlo algorithm 
until a success occurs. 

The Las Vegas to Monte Carlo transformation, starting from a Las Vegas algo- 
rithm, is the Monte Carlo algorithm obtained by running the Las Vegas scheme for 
kE[T] steps and halting, where E[T] is the expected Las Vegas running time.


Facts: 

1. If the probability of success of a single run is p, then the probability under the success
amplification method that k independent runs all fail is $(1 - p)^k$. Thus, the probability of
success becomes $1 - (1 - p)^k$.

2. If p is the probability of success of a Monte Carlo algorithm, then the expected
number of Las Vegas trials before a success occurs is

$$p + 2(1 - p)p + 3(1 - p)^2 p + 4(1 - p)^3 p + \cdots = \frac{1}{p}.$$

3. Markov’s inequality: The probability that a positive random variable exceeds k
times its expectation is at most $\frac{1}{k}$.

4. Markov’s inequality yields a general method to bound how far the running time of a
Las Vegas algorithm can exceed its expectation. If T is the running time of a Las Vegas
algorithm, then

$$\Pr[T > kE[T]] \le \frac{1}{k}.$$

5. The probability that a transformed Las Vegas to Monte Carlo algorithm is successful
is at least $1 - \frac{1}{k}$.

6. If the expected running time of a Las Vegas algorithm is E[T], then the running
time of the amplified algorithm is kE[T]. Each run of 2E[T] steps has probability of failure
at most $\frac{1}{2}$ by Markov’s inequality, so the probability that all $\frac{k}{2}$ independent runs fail is
at most $(\frac{1}{2})^{k/2}$, and the probability of success is at least $1 - (\frac{1}{2})^{k/2}$.
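
The quantities in Facts 1, 2, 5, and 6 can be checked numerically. The following Python sketch is illustrative only; the function names are hypothetical, and exact rational arithmetic is an implementation choice, not something the handbook prescribes.

```python
from fractions import Fraction

def amplified_success(p, k):
    """Fact 1: probability that at least one of k independent runs succeeds."""
    return 1 - (1 - p) ** k

def expected_trials_partial(p, terms):
    """Partial sum of the series in Fact 2: sum of j * (1 - p)^(j - 1) * p."""
    return sum(j * (1 - p) ** (j - 1) * p for j in range(1, terms + 1))

def truncated_las_vegas_bound(k):
    """Fact 5: lower bound 1 - 1/k for one Las Vegas run cut off after k*E[T] steps."""
    return 1 - Fraction(1, k)

def restarted_las_vegas_bound(k):
    """Fact 6: lower bound 1 - (1/2)^(k/2) for k/2 runs of 2*E[T] steps each (k even)."""
    return 1 - Fraction(1, 2) ** (k // 2)

print(amplified_success(Fraction(1, 2), 3))   # 1 - (1/2)^3
print(expected_trials_partial(0.25, 500))     # approaches 1/p = 4
print(truncated_las_vegas_bound(10), restarted_las_vegas_bound(10))
```

For the same total budget of kE[T] steps, the restart bound of Fact 6 improves exponentially in k, whereas the single truncated run of Fact 5 improves only as 1/k.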

Examples: 

1. A database problem: In a large database whose keys are stored in no particular
order, find a key that is not contained in that database, within time O(N), where N is
the size of the database. (Assume $N = 2^{30}$ and that the keys are 32 binary digits long.) This
would match the natural lower bound of the time required just to read the entire
database. The deterministic strategy of sorting and checking for the first missing key
would take time O(N log N).

• a Monte Carlo randomized strategy: Pick a random 32-digit key and then scan
the database. There are $2^{32}$ potential 32-digit keys and only $\frac{N}{32} = 2^{25}$ keys in the
database, a fraction of $\frac{2^{25}}{2^{32}} = 2^{-7}$. Thus, the probability that a randomly chosen key
is not in the database is at least $1 - 2^{-7}$, which is greater than 99%. The running
time is dominated by a single scan of the database to check whether the randomly
chosen key is suitable. Thus, it completes in O(N).

• success amplification: The probability that among k independently chosen random
keys none are found suitable is less than $0.01^k = 10^{-2k}$; this quantity becomes negligible,
even for very small values of k. The running time of this amplified algorithm
is O(kN).

• from Monte Carlo to Las Vegas: Repeatedly pick random keys until a suitable key
is found. The expected number of trials before a suitable key is found is at most $\frac{100}{99}$.
Thus, the expected running time is $O(\frac{100}{99} N) = O(N)$.

Algorithm 1: In a set A of n distinct keys, find the mth smallest.

input: a set A with n = |A|, and an integer m with 1 ≤ m ≤ n

FIND(A, m)
if A = {s} then return s
else
    pick s uniformly at random from A
    compute X = {a ∈ A | a < s}
    compute Y = {a ∈ A | a > s}
    if |X| ≥ m then call FIND(X, m)
    if n − |Y| < m then call FIND(Y, m − (n − |Y|))
    if |X| < m ≤ n − |Y| then return s
end

2. A modified database problem: Among the $n = \frac{N}{32}$ keys of the database of Example 1,
find the mth in increasing lexicographic order, within time O(n). (Algorithm 1 does
this.) The deterministic strategy of sorting would take time O(n log n).

• a Las Vegas randomized strategy: Pick a random key s from the database and consider
the sets X and Y of keys in the database that are smaller and larger, respectively,
than s. If |X| ≥ m, then the problem reduces to finding the mth key in X. If
n − |Y| < m, then the problem reduces to finding the (m − (n − |Y|))th key in Y.
Finally, if |X| < m ≤ n − |Y|, then s is the mth key.

• expected running time: The randomly chosen key s splits the database into pieces
X and Y which are, on average, of size $\frac{n}{2}$ (and in most cases substantially smaller
than n). Thus, the problem of looking for a key in a set of size n reduces to a problem
of looking for a key in a set of size “approximately” $\frac{n}{2}$ and a running time of the
type

$$T(n) \approx T(\tfrac{n}{2}) + O(n) = O(n)$$

can intuitively be expected. More precisely, let T(n, m) denote the running time
to find the mth key. Since any of the keys could equally likely be picked as the
splitter s, the expectation E[T(n, m)] satisfies this recurrence:

$$E[T(n,m)] = \tfrac{1}{n}E[T(n-1,m-1)] + \tfrac{1}{n}E[T(n-2,m-2)] + \cdots + \tfrac{1}{n}E[T(n-(m-1),1)]
 + \tfrac{1}{n}E[T(m+1,m)] + \tfrac{1}{n}E[T(m+2,m)] + \cdots + \tfrac{1}{n}E[T(n-1,m)] + cn,$$

for some constant c. The solution is E[T(n, m)] = O(n), for all m.

• variance: Markov’s inequality bounds how far the running time can exceed its expectation:

$$\Pr[\,T(n,m) > kE[T(n,m)]\,] \le \frac{1}{k}.$$
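
The Las Vegas strategy of Example 2 (Algorithm 1) can be sketched in Python. This is a minimal sketch, not the handbook's own code: the name `find`, the list representation of the key set, the optional `rng` parameter, and the use of a loop in place of the recursive calls are all illustrative assumptions.

```python
import random

def find(keys, m, rng=random):
    """Return the m-th smallest (1-indexed) element of a list of distinct keys."""
    if not 1 <= m <= len(keys):
        raise ValueError("m must satisfy 1 <= m <= n")
    while True:
        n = len(keys)
        if n == 1:
            return keys[0]
        s = rng.choice(keys)                     # random splitter
        smaller = [a for a in keys if a < s]     # the set X
        larger = [a for a in keys if a > s]      # the set Y
        if len(smaller) >= m:
            keys = smaller                       # answer lies in X, same index
        elif n - len(larger) < m:
            m -= n - len(larger)                 # answer lies in Y, shifted index
            keys = larger
        else:
            return s                             # s itself has rank m

data = [5, 3, 9, 1, 7]
print(find(data, 2))
```

The output is correct for every sequence of random choices; only the number of iterations, and hence the running time, depends on the splitters drawn.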
Algorithm 2: Test primality of n with k witnesses.

PRIMALITY TEST(n, k)

input: positive integers n and k with n odd and k ≥ 2
if n is odd then
    pick a_1, ..., a_k, each a_i independently and uniformly at random from [1, n−1]
    compute gcd(n, a_i) for all 1 ≤ i ≤ k
    {gcd(n, a_i) can be computed efficiently using Euclid’s algorithm}
    if there exists an a_i with gcd(n, a_i) ≠ 1 then output “composite” and halt
    compute a_i^((n−1)/2) (mod n) for all a_i with 1 ≤ i ≤ k
    {a_i^((n−1)/2) (mod n) can be computed efficiently by repeated squaring}
    if for some a_i, a_i^((n−1)/2) ≢ ±1 (mod n) then output “composite” and halt
    if for some a_i, a_i^((n−1)/2) ≡ −1 (mod n) then output “prime, with high confidence”
        and halt
    if for all a_i, a_i^((n−1)/2) ≡ 1 (mod n) then output “composite, with high confidence”
        and halt


3. Algorithm 2, Primality Test, produces correct output with probability at least
$1 - (\frac{1}{2})^k$. After log n trials of selecting a random integer less than n and testing, the
likelihood is very high, for reasonably large n, that a prime number will be obtained.
This follows from the prime number theorem and Markov’s inequality.
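
Algorithm 2 can be sketched in Python. This is a minimal sketch under stated assumptions: the function name `primality_test` and the optional `rng` parameter (added so that the random witnesses can be replaced in experiments) are hypothetical, and the returned strings simply mirror the outputs of the pseudocode. It uses `math.gcd` for Euclid's algorithm and the three-argument built-in `pow` for modular exponentiation by repeated squaring.

```python
import math
import random

def primality_test(n, k, rng=random):
    """Run Algorithm 2 on an odd integer n >= 3 with k random witnesses."""
    if n < 3 or n % 2 == 0:
        raise ValueError("n must be an odd integer >= 3")
    witnesses = [rng.randint(1, n - 1) for _ in range(k)]
    # A witness sharing a factor with n proves that n is composite.
    if any(math.gcd(n, a) != 1 for a in witnesses):
        return "composite"
    # a^((n-1)/2) mod n for every witness.
    powers = [pow(a, (n - 1) // 2, n) for a in witnesses]
    # For a prime n every such power is congruent to +1 or -1.
    if any(r != 1 and r != n - 1 for r in powers):
        return "composite"
    if any(r == n - 1 for r in powers):
        return "prime, with high confidence"
    return "composite, with high confidence"

print(primality_test(9, 10))
```

Both "with high confidence" answers can be wrong. For example, a prime n is misreported when every witness happens to satisfy a^((n−1)/2) ≡ 1 (mod n), which is why the number of witnesses k governs the error bound.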
INTRODUCTION


Information structures are groupings of related information into records and organiza- 
tion of the records into databases. The mathematical structure of a record is specified 
as an abstract datatype and represented concretely as a linkage of segments of computer 
memory. General chapter references are [AhHoU183], [Kn68], and [Kn73]. 


GLOSSARY 

abstract datatype (ADT): a mathematically specified datatype equipped with operations
that can be performed on its data objects.

adaptive bubblesort: a bubblesort that stops the first time a scan produces no transpositions.

ADT-constructor: any of the three operations string of, set of, or tuple of used to
build more complex ADTs from simpler ADTs.

alphabetic datatype: an elementary datatype whose domain is a finite set of symbols,
and whose only primary operation is a total ordering query.

ambivalent data structure: a structure that keeps track of several alternatives at
many of its vertices, even though a global examination of the structure would determine
which of these alternatives is optimal.

array data structure: an indexed sequence of cells ($a_j \mid j = d, \ldots, u$) of fixed size,
with consecutive indices. 

AVL tree: a binary search tree with the property that the two subtrees of each node 
differ by at most one in height. 

binary search: a recursive search method that proceeds by comparing the target key 
to the key in the middle of the list, in order to determine which half of the list could 
contain the target item, if it is present. 

binary search tree: a binary tree in which the key at each node is larger than all the 
keys in its left subtree, but smaller than all the keys in its right subtree. 

binary-search-tree structure: a binary-tree structure in which for every cell, all cells 
accessible through the left child have lower keys, and all cells accessible through the 
right child have higher keys. 

binary-tree structure: a tree structure such that each cell has two pointers. 

bubblesort: a sort that repeatedly scans an array from the highest index to the low-
est, on each iteration swapping every out-of-order pair of consecutive items that is 
encountered. 

cell (in a concrete data structure): a storage unit within the data structure that may 
contain data and pointers to other cells. 

certificate (for a property of a graph G): another graph that has the specified property 
if and only if the graph G has the property. 

chaining method (for hash tables): a hashing method that resolves collisions by 
placing all the records whose keys map to the same location in the main array into 
a linked list (chain), which is rooted at that location, but stored in the secondary 
array. 


circular linked list: a set of cells, each with two pointers, one designated as its 
forward pointer and the other as its backward pointer , plus a header with one or 
more pointers to current cells, such that these conditions hold: 

• the sequence of cells formed by following the forward pointers, starting from any
cell, traverses the entire set and returns to the starting cell;

• the sequence of cells formed by following the backward pointers, starting from
any cell, traverses the entire set and returns to the starting cell.

closed hash table: a hash table in which collisions are resolved without the use of 
secondary storage space, that is, by probing in the main array to find available 
locations. 

cluster (in a spanning tree): a set of vertices whose induced subgraph is connected. 

clustering property (of a probe function): the undesirable possibility that parts of 
the probe sequences generated for two different keys are identical. 

collision instance (of a hash function): a pair of different keys for which the value of
the hash function is the same. 

collision resolution (of a hashing process): a procedure within the hashing process 
used to define a sequence of alternative locations for storage of a record whose key 
collides with the key of an existing record in the table. 

comparison sort: a sorting method in which the final sorted order is based solely on 
comparisons between elements in the input sequence. 

concrete data structure: a mathematical model for storing the current value of a 
structured variable in computer memory. 

database: a set of records, stored in a computer. 

datatype: a set of objects, called the domain, and a set of mappings, called primary 
operations, from the domain to itself or to the domain of some other datatype. 

deheaping: removing the highest priority entry from a heap and patching the result
so that the heap properties are restored. 

dictionary: an abstract datatype whose domain is a set of keyed pairs, in which 
arbitrary pairs may be accessed directly. 

domain (of a datatype): the set of objects within that datatype. 

dyadic graph property: a property defined with respect to pairs of vertices. 

dynamic structure (for a database): an information structure for the database whose 
configuration may be changed, for instance, by the insertion or deletion of elements. 

dynamic update operation (on a graph): an operation that changes the graph and
keeps track of whether the graph has some designated property. 

edge-incidence table (for a graph): a dictionary whose keys are the vertices of a 
graph or digraph. The data component for each key vertex is a list of all the edges 
that are incident on that vertex. Each self-loop occurs twice in the list. 

elementary datatype: an alphabetic datatype or a numeric datatype, usually in- 
tended for direct representation in the hardware of a computer. 

endpoint table (for a graph): a dictionary whose keys are the edges. The data 
component for each key edge is the set of endpoints for that edge. If an edge is 
directed, then its endpoints are marked as head and tail. 

enheaping: placing a new entry into its correctly prioritized position in a heap. 
entry (in a database): a 2-tuple, whose first component is a key, and whose second
component is some data; also called a record. 

external sorting method: a method that uses external storage, such as hard disk or
tape, outside the main memory during the sorting process. 

far end (of a one-way linked list): the cell that contains a null pointer. 

Fibonacci heap: a modification of a heap, using the Fibonacci sequence, that per-
mits more efficient implementation of a priority queue than a heap based on a left- 
complete binary tree. 

FIFO property (of a database): the property that the item retrieved is always the 
item inserted the longest ago. FIFO means “first-in-first-out”. 

hat notation (in a postcondition of a primary operation specification): the value $\hat{X}$
of the variable X before the specified operation is executed.

fullness (of a closed hash table): the ratio of the number of records presently in the 
table to the size of the table. 

generic datatype: a specification in an ADT-template that means that there are no
restrictions whatsoever on that datatype. 

hash function (for storing records in a table): a function that maps each key to a 
location in the table. 

hash table: an array of locations for records (entries) in which each record is identified 
by a unique key, and in which a hash function is used to perform the table-access op- 
erations (of insertion, deletion, and search), possibly involving the use of a secondary 
array to resolve competition for locations. 

hashing: storage-retrieval in a large table in which the table location is computed 
from the key of each data entry. 

header (of a concrete data structure): a special memory unit (but not a cell) that 
contains current information about the entire configuration and pointers to some 
critical cells in the structure. 

heap: a concrete data structure that represents a priority tree as an array. 

heapsort: sorting a set of entries by first enheaping them all and then deheaping them 
all. 

incidence matrix (for a graph): a 0-1 matrix that specifies the incidence relation. 
The rows are indexed by the vertices and the columns by the edges. The entry in 
the row corresponding to vertex v and edge e is 1 if v is an endpoint of e, and 0 
otherwise. 

in-place realization (of a sorting method): a method that uses, beyond the space 
needed for one copy of each data entry, only a constant amount of additional space, 
regardless of the size of the list to be sorted. 

insertion sort: a sort that transforms an unsorted list into a sorted list by iteratively 
transferring the next item from the remaining items in the unsorted input list and 
inserting it into correct position in the sorted output list. 

internal sorting method: any method that keeps all the entries in the primary mem- 
ory of the computer during the process of rearrangement. 

key (in a database entry): a value from an ordered set, used to store and retrieve data. 

key domain: the ordered set from which values of keys are drawn. 
key entry (of a record in a table): a value from an ordered set (e.g., integer identifi-
cation codes or alphabetic strings) used to store records in the table. 

key randomization: a “preliminary” procedure within the hashing process for mapping
non-numeric keys (or keys with poor distribution) into (more uniformly) randomly
distributed integers in the domain of the hash function.

keyed pair: a 2-tuple whose first component, called a key, is used to locate the data 
in the second component. 

left-child (of a cell in a binary tree structure): the cell to which the first pointer points. 

left-complete binary tree: either a binary tree that is complete, or a balanced binary 
tree (§9.1.2) such that at depth one less than the maximum, the following hold: 

• all nodes with two children are to the left of all nodes with one or no children; 

• all nodes with no children are to the right of all nodes with one or two children; 

• there is at most one node with only one child, which must be a left-child. 

LIFO property (of a database): the property that the item retrieved is always the 
item most recently inserted. LIFO means “last-in-first-out”. 

linear search: the technique of scanning the entries of a list in sequence, until either 
some stopping condition occurs or the entire list has been scanned. 

mergesort: a sort that partitions an unsorted list into lists of length one and then 
iteratively merges them until a single sorted list is obtained. 

near end (of a one-way linked list): the cell that is pointed to by the header and by 
no other cell. 

nearly complete (property of a binary tree): the possible property that the binary 
tree is complete at every level except possibly at the bottom level. At the bottom 
level, all the missing leaves are to the right of all the present leaves. 

null pointer: a pointer that points to an artificial location, which serves as a signal to 
an algorithm to react somewhat differently than to a pointer to an actual location. 

numeric datatype: an elementary datatype whose domain is a set of numbers and 
whose primary operations are a total ordering query and the arithmetic operators + 
(addition), × (multiplication), and − (change of sign).

one-way linked list: a set of cells, each with one pointer, such that: 

• exactly one of these cells is pointed to by the header but by no cell; 

• exactly one cell contains a null pointer; 

• the sequence of cells formed by following the pointers, starting from the header,
traverses the entire set, ending with the cell containing the null pointer.

open hash table: a hash table that uses a secondary array to resolve collisions. 

ordered datatype: a datatype with an order relation such that any two elements can 
be compared. 

pivot (in a quicksort): an entry at which the sequence is split. 

plane graph: a planar graph, together with a particular imbedding in the plane. 

pointer (to a cell): a representation of that cell’s location in computer memory. 

postcondition (of a primary operation): a list of conditions that must hold after the 
operation is executed, if the precondition is satisfied when the operation commences. 

precondition (of a primary operation): a list of conditions that must hold immediately 
before the operation is executed, for the operation to execute as specified. 
primary key: the key component of highest precedence, when the key has more than
one component. 

primary operation (for a datatype): a basic operation that retrieves information 
from an object in the domain or modifies the object. 

priority queue: an abstract datatype whose domain is a set of records, in which only
the entry with the largest key is immediately accessible. 

priority tree: a nearly complete binary tree whose nodes are assigned data entries
from an ordered set of “priorities”, such that there is no node whose priority supersedes
the priority of its parent node.

probe function: a function used iteratively to calculate an alternative location in a 
closed hash table when the initial location calculated from the key or the previous 
probe location is already occupied. 

probe sequence (for a hash table location): the sequence of locations calculated by 
the probe function in its effort to find an unoccupied place in the table. 

query (to a datatype): a primary or secondary operation that changes nothing and 
returns a logical value, i.e., true or false.

queue: an abstract datatype that organizes the records into a sequence, such that 
records are inserted at one end (called the back of the queue) and extractions are 
made from the other end (called the front). 

quicksort: sorting by recursively partitioning a list around an entry called the pivot 
so that all smaller items precede the pivot and all larger items follow it. 

radix sort: a sort using iterative partitioning into queues and recombining by con- 
catenation, in which the partitioning is based on a digit in a numeral. 

random access list: an abstract datatype whose domain is a set of records such that 
the values of the key field range within an interval of integers a ≤ k ≤ b; this permits
implementations that execute primary operations faster than a general table. 

rank (of an element of a finite ordered set): the number of elements that it exceeds or 
equals. 

rank-counting sort: sorting by calculating the “rank” for each element, and then 
assigning each element to its correct position according to its rank. 

record: a 2-tuple, whose first component is a key, and whose second component is
some data; also called an entry. 

record in a table: a table entry containing a key and some data. 

right-child (of a cell in a binary-tree structure): the cell to which the second pointer 
points. 

root cell (of a tree structure): the cell to which the header points. 

scanning a database (or a portion of a database): examining every record in that 
database or portion. 

searching (a database): seeking either a target entry with a specific key or a target 
entry whose key has some specified property. 

secondary key: the key component of next highest precedence, when the key has 
more than one component. 

secondary operation (for a datatype): an operation constructed from primary 
operations and from previously defined secondary operations. 

selection sort: a sort that transforms an unsorted list into a sorted list by iteratively 
finding the item with smallest key from the remaining items in the unsorted input 
list and appending it to the end of the sorted output list. 

sequence of: an ADT-constructor that converts a datatype with domain D into a new 
datatype, whose domain is the set of all finite sequences of objects from domain D, 
and whose primary operations are some sequence operations. 

set of: an ADT-constructor that converts a datatype with domain D into a new 
datatype, whose domain is the set of all subsets of objects from domain D , and 
whose primary operations are set operations. 

shakersort: a bubblesort variation that alternates between bubbling upward and sink- 
ing downward on alternate scans. 

Shellsort: a sorting method that involves partitioning a list into sublists and insertion 
sorting each of the sublists. 

sinking sort: a “reverse bubblesort” that scans an array repeatedly from the lowest 
index to the highest, each time swapping every out-of-order pair of consecutive items 
that is encountered. 

size (of a cell in a data structure): the number of bytes of computer memory that the 
cell occupies. 

size (of a hash table): the number of locations in the main array in which the records 
are stored. (If chaining is used to resolve collision, the total number of records stored 
may exceed the size of the main array.) 

sorting algorithm: a method for arranging the entries of a database into a sequence 
that conforms to the order of their keys. 

sparse certificate: a strong certificate (for a property of a graph G) in which the 
number of edges is O(|V_G|). 

sparse sequence: a sequence in which nearly all the entries are zeros. 

stable certificate: a certificate produced by a stable function. 

stable (certificate) function: a function A that maps graphs to strong certificates such 
that: 

• A(G ∪ H) = A(A(G) ∪ H); 

• A(G − e) differs from A(G) by O(1) edges, where e is an edge in G. 

stack: an abstract datatype that organizes the records into a sequence, in which 
insertion and extraction are made at the same end (called the top of the stack). 

static structure (for a database): an information structure for the database whose 
configuration does not change during an algorithmic process. 

strong certificate (for a property P of a graph G): a certificate graph G′ for G with the 
same vertex set as G such that, for every graph H, the graph G ∪ H has property P 
if and only if G′ ∪ H has property P. 

table: a set of keyed pairs, in which arbitrary pairs may be accessed directly; used as 
the domain of a dictionary. 

target (of a database search): an entry whose key has been designated as the objective 
of the search. 

2-3 tree: a tree in which each non-leaf node has 2 or 3 children, and in which every 
path from the root to a leaf is of the same length. 

tree structure: a concrete data structure such that the header points to a single cell, 
and such that from that cell to each other cell, there is a single chain of pointers. 

k-tuple of: an ADT-constructor that converts a list of k datatypes into a new 
datatype, whose domain is the cartesian product of the domains of the datatypes in that 
list, and whose primary operations are the projection functions from a k-tuple to 
each of its coordinates. 

two-way incidence structure (for a graph): a pair consisting of an edge-incidence 
table and an endpoint table. 

two-way linked list: a set of cells, each with two pointers, one designated as its 
forward pointer and the other as its backward pointer, plus a header with a forward 
pointer and a backward pointer, such that these conditions hold: 

• considering only the forward pointers, it is a one-way linked list; 

• following the sequence of backward pointers yields the reverse of the sequence 
obtained by following the forward pointers. 

two-way sequential list: an ADT-template whose domain is strings, in which an 
entry is reached by applying the access operations forward and backward. Insertions 
are made before or after the current location. 

union-find datatype: an abstract datatype whose records are mutually disjoint sets, 
in which there is a primary operation to locate the set containing a specified target 
element and a primary operation of merging two sets. 


17.1 ABSTRACT DATATYPES 

Organizing numbers and symbols into various kinds of records is a principal activity of 
information engineering. The organizational structure of a record is called a datatype. 
Abstractly, a datatype is characterized by a formal description of its domain and of the 
intrinsic operations by which information is entered, modified, and retrieved. Providing 
the specification at this abstract level ensures that the datatype is independent of the 
underlying types of information elements stored within the structure, and independent 
also of the hardware and software used to implement this organization. [AhHoUl83], 
[Kn68] 


17.1.1 ABSTRACT SPECIFICATION OF RECORDS AND DATABASES 

Information engineering uses discrete mathematics as a source of models for various 
kinds of records and databases. The language of abstract mathematics is used to specify 
a complex structure in terms of its elements. Constructors and templates are used to 
create new kinds of data from old kinds. 

Definitions: 

A datatype consists of a set of objects, called the domain , and a set of mappings, called 
primary operations, from the domain to itself or to the domain of some other datatype. 

The domain of a datatype is its set of objects. 

A primary operation for a datatype is a basic operation that retrieves information 
from an object in the domain or modifies the object. 

A secondary operation on the domain of a datatype is an operation constructed from 
primary operations and previously defined secondary operations. 

A query is a primary or secondary operation on a datatype domain that preserves the 
values of all its arguments and returns a logical value, i.e., true or false. 

An alphabetic datatype is a datatype whose domain is a finite set of symbols. Its 
only primary operation is a total ordering query. 

A numeric datatype is a datatype whose domain is a set of numbers and whose 
primary operations are a total ordering query and the arithmetic operators + (addition), 
× (multiplication), and − (change of sign). 

An elementary datatype is an alphabetic datatype or a numeric datatype, usually 
intended for direct representation in the hardware of a computer. 

An abstract datatype ( ADT ) is a mathematically specified datatype equipped with 
operations that can be performed on its data objects. 

An ADT-constructor is a template that converts a datatype into a new ADT. 

The constructor sequence of transforms a datatype X-type with domain D into a new 
datatype “sequence of X-type” whose domain is the set Seq_D of all finite sequences of 
elements of D. The primary operations of the resulting datatype are: 

• header(s), which yields a singleton sequence whose only element is the first 
object in the sequence s (or the empty sequence, if the sequence s is empty); 

• trailer(s), which deletes the first entry of sequence s (or yields the empty 
sequence, if the sequence s is empty); 

• concat(s, t), which concatenates the two sequences; 

• first(s), which gives the value of the first entry of a non-empty sequence s; 

• append(s, d), which appends to sequence s ∈ Seq_D an entry d ∈ D; 

• nullseq( ), whose value is the null sequence Λ. 

The constructor set of converts a datatype X-type with domain D into a new datatype 
“set of X-type” whose domain is the set of all subsets of D. The primary operations 
are: 

• inclusion(S, T), a query whose value is true if S ⊆ T; 

• union(S, T), whose value is S ∪ T; 

• intersection(S, T), whose value is S ∩ T; 

• difference(S, T), whose value is S − T; 

• choose(S), whose value is an arbitrary element of a nonempty subset S; 

• singleton(d), which transforms an element of D into the singleton set whose 
only entry is d; 

• emptyset( ), whose value is the empty set ∅; 

• universe( ), whose value is the underlying domain D. 

The constructor k-tuple of converts a list of k datatypes 

X_1-type, X_2-type, . . . , X_k-type 

into a new datatype “k-tuple (X_1, . . . , X_k)” whose domain is the cartesian product 
D_1 × D_2 × · · · × D_k of the domains of the respective datatypes in that list. The primary 
operations are the projection functions and a tuple-building operation: 

• coord_j(s), which gives the value of the jth coordinate of the k-tuple s; 

• entuple(d_1, . . . , d_k), whose value is the k-tuple whose jth coordinate is the 
element d_j of domain D_j. 

An elementary ADT-constructor is any of the three operations sequence of, set of, 
or k-tuple of used to build more complex ADTs from simpler ADTs. 

The Iverson truth function assigns to a proposition p the integer value (p) such that 

\[ (p) = \begin{cases} 1 & \text{if } p \text{ is true;} \\ 0 & \text{otherwise.} \end{cases} \] 

A datatype specification uses a combination of elementary datatypes and ADT- 
constructors to specify the domain and the primary operations. It may also use the 
following mathematical notation: 

• ∅ denotes the empty set; 

• Λ denotes the empty sequence; 

• · denotes the operation of appending one element to a sequence; 

• ∘ denotes the sequence concatenation operation. 

Moreover, every primary operation is either a query or a procedure. 

Specifying a datatype as generic in an ADT-template means that any datatype can be 
used in that part of the template as a block in the construction of the new datatype. 

Specifying a datatype as ordered in an ADT-template means that any ordered datatype 
can be used in that part of the template as a block in the construction of the new 
datatype. 

The precondition of a primary operation is a list of conditions that must hold imme- 
diately before the operation is executed, for it to execute as described. 

The postcondition of a primary operation is a specification of conditions that must 
hold after the operation is executed, if the precondition is satisfied when the operation 
commences. 

The flat notation X♭ in a postcondition of a primary operation specification means 
the value of the variable X before that operation is executed. Unadorned X (without 
the ♭) means the value of X after the operation. 


Facts: 

1. The domain of an ADT is specified as a mathematical model, without saying how 
its elements are to be represented. 

2. Sometimes the domain of a datatype is specified by roster. Other times it is specified 
with the use of set-theoretic operations. 

3. A primary operation of an ADT is specified functionally. That is, its value on every 
element of the domain is declared, but the choice of an algorithm to be used in its 
implementation is omitted. 

4. A primary operation may be implemented so that it has direct access to the data 
representing the value of the computational variable to which it is applied. 

5. A primary operation can modify the information within a variable in its datatype 
or retrieve information from a variable. 

6. A secondary function is implemented through calls to the primary operations from 
which it is ultimately composed. 

7. There is no set of standard conventions for writing ADTs. 

8. Software designers frequently specify a particular concrete information structure 
(see §17.2), instead of writing an ADT. 

9. The advantage of writing an ADT, rather than a concrete datatype, is that it leaves 
the implementer room to find a new (and possibly improved) way to meet the require- 
ments of the task. 

10. In a datatype specification, a functional subprogram is represented by a procedure 
that produces a non-boolean value, and a variable to receive that value is specified as 
its last parameter. 

Examples: 

1. Elementary numeric datatypes include the integers and the reals. 

2. Elementary alphabetic datatypes include the ASCII set and the decimal digits. 

3. Complex_number is a datatype that represents complex numbers and their addition 
and multiplication. 

ADT complex_number: 

Domain 

    2-tuple (re: real, im: real) 

Primary Operations 

    sum (w: complex_number, z: complex_number) 
        Comment: add two complex numbers. 
        {pre: none} 
        {post: sum(w, z) = entuple(re(w) + re(z), im(w) + im(z))} 

    prod (w: complex_number, z: complex_number) 
        Comment: multiply two complex numbers. 
        {pre: none} 
        {post: prod(w, z) = entuple(re(w) · re(z) − im(w) · im(z), re(w) · im(z) + im(w) · re(z))} 
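
The two postconditions above can be sketched directly in Python. This is only an illustration under stated assumptions: the names ComplexNumber, csum, and cprod are hypothetical, and a NamedTuple stands in for the 2-tuple constructor with its coordinate functions re and im.

```python
from typing import NamedTuple


class ComplexNumber(NamedTuple):
    """Domain: 2-tuple (re: real, im: real). Hypothetical name for this sketch."""
    re: float
    im: float


def csum(w: ComplexNumber, z: ComplexNumber) -> ComplexNumber:
    # post: sum(w, z) = entuple(re(w) + re(z), im(w) + im(z))
    return ComplexNumber(w.re + z.re, w.im + z.im)


def cprod(w: ComplexNumber, z: ComplexNumber) -> ComplexNumber:
    # post: prod(w, z) = entuple(re(w)*re(z) - im(w)*im(z), re(w)*im(z) + im(w)*re(z))
    return ComplexNumber(w.re * z.re - w.im * z.im, w.re * z.im + w.im * z.re)
```

Neither function has a precondition, matching {pre: none} in the specification.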

4. Baseten_digit is a datatype that might be used in the construction of base-ten 
numerals representing arbitrarily large integers and their addition. 

ADT baseten_digit: 

Domain 

    {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}: integers 

Primary Operations 

    add_digits (x: baseten_digit, y: baseten_digit) 
        {pre: none} 
        {post: add_digits(x, y) = (x + y) mod 10} 

    addcarry (x: baseten_digit, y: baseten_digit) 
        {pre: none} 
        {post: addcarry(x, y) = (x + y ≥ 10)} 
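
As a hedged illustration of how such a digit datatype might be used for arbitrarily large integers, the sketch below adds two numerals digit by digit. It assumes that the carry is 1 exactly when the digit sum reaches 10, and that a numeral is stored as a Python list of digits with the least significant digit first; the function add_numerals is a hypothetical secondary operation, not part of the ADT.

```python
def add_digits(x: int, y: int) -> int:
    # Primary operation: the digit of the sum, (x + y) mod 10.
    return (x + y) % 10


def addcarry(x: int, y: int) -> int:
    # Primary operation: Iverson value of the proposition (x + y >= 10).
    return int(x + y >= 10)


def add_numerals(a: list[int], b: list[int]) -> list[int]:
    """Add two base-ten numerals given least significant digit first."""
    result = []
    carry = 0
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else 0
        y = b[i] if i < len(b) else 0
        partial = add_digits(x, y)
        result.append(add_digits(partial, carry))
        # At most one of these two additions can overflow, so carry stays 0 or 1.
        carry = addcarry(x, y) + addcarry(partial, carry)
    if carry:
        result.append(carry)
    return result
```

For instance, 58 + 67 is computed from the digit lists [8, 5] and [7, 6].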

5. The datatype alphastring represents sequences of lowercase English letters. 

ADT alphastring: 

Domain 

    sequence of {a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z} 

Primary Operations 

    none except from the constructor sequence of 

6. The union-find constructor transforms a datatype on domain set S into a datatype 
whose objects are of three kinds: elements of S , subsets of S, and partitions of S. There 
is a primary operation to merge two cells of a partition, and a primary operation to 
locate the cell of a partition that contains a specified target element. 
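
One common way to realize the two primary operations is a parent-pointer forest, sketched below. The representation, the class name UnionFind, and the method names find and merge are assumptions of this sketch; the text specifies only the behavior of locating and merging cells.

```python
class UnionFind:
    """Partition of a finite set, stored as a forest of parent pointers."""

    def __init__(self, elements):
        # Each element starts in its own singleton cell.
        self.parent = {x: x for x in elements}

    def find(self, x):
        """Locate the cell containing x; a cell is named by its representative."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def merge(self, x, y):
        """Merge the cells containing x and y."""
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[rx] = ry
```

Two elements lie in the same cell exactly when find returns the same representative for both.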


17.1.2 STACKS AND QUEUES 

Access to entries in the interior of a list is unnecessary much of the time. Restricting 
access to the first and last entries is a precaution to prevent mistakes. 

Definitions: 

A stack is an ADT whose domain is a sequence, one end of which is called the top, 
and the other the bottom. One primary operation, called pushing, appends a new entry 
to the top, and the other, called popping, removes the entry at the top and returns its 
value. No entry may be examined, added to the stack, or deleted from the stack except 
by an iterated composition of these operations. 

The top of a stack is the end of that stack that can be accessed directly. 

Pushing an entry onto a stack means appending it to the top of the stack. 

Popping an entry from a stack means deleting it from the top of the stack and 
possibly examining the data it contains. 

The LIFO property of a database is that the item retrieved is always the item most 
recently inserted. LIFO means “last-in-first-out”. 

A queue is an ADT whose domain is a sequence, one end of which is called the front, 
and the other the back. One primary operation, called enqueueing, appends a new entry 
to the back, and the other, called dequeueing, removes the entry at the front and returns 
its value. No entry may be examined, added to the queue, or deleted from the queue 
except by an iterated composition of these operations. 

The back of a queue is the end to which entries may be appended. 

The front of a queue is the end from which entries may be deleted and possibly 
examined. 

Enqueueing an entry into a queue means appending it to the back of the queue. 

Dequeueing an entry from a queue means deleting it from the front of the queue, 
and possibly examining the data it contains. 

The FIFO property of a database is that the item retrieved is always the item inserted 
the longest ago. FIFO means “first-in-first-out”. 

Facts: 

1. Abstract specifications of stacks and queues mention only the behavior of those 
datatypes and avoid all details of implementation. This permits a skillful implementer 
to innovate with efficient concrete structures (see §17.2) that meet the behavioral 
specification. 

2. Abstract specification of stacks and queues is consistent with the principles of 
object-oriented programming, in which details of implementation are hidden inside the data 
objects, so that the rest of the program perceives only the specified functional behavior. 

3. All stacks have the LIFO property. For any stack S and for any element b, after 
executing the sequence of instructions 

push(S, b), pop(S, x) 

the resulting value of the stack S is whatever it was before the operations, and x = b. 

4. A stack is most commonly implemented as a linked list (see §17.2.2). 

5. For a nonempty stack S, the sequence of instructions 

pop(S, x), y := x, push(S, x) 

changes the value of the variable y to that of the top entry on the stack S and 
restores S to its previous state. 

6. All queues have the FIFO property. Given an empty queue Q and two elements b_1 
and b_2, the sequence of operations 

enqueue(Q, b_1), enqueue(Q, b_2), dequeue(Q, x_1), dequeue(Q, x_2) 

yields x_1 = b_1, x_2 = b_2, and Q = Λ. 

7. A queue is most commonly implemented as a linked list (see §17.2.2). 

Examples: 

1. The following pseudocode specifies the ADT stack of D, where D is an arbitrary 
datatype. 

Domain 

    sequence of D: generic 

Primary Operations 

    create_stack (S: stack) 
        Comment: Initialize variable S as an empty stack. 
        {pre: none} 
        {post: S = Λ} 

    push (S: stack, x: element of D) 
        Comment: Put value of x at top of stack S. 
        {pre: none} 
        {post: S = x · S♭} 

    pop (S: stack, x: element of D) 
        Comment: Remove top item of stack S; return it as value of variable x. 
        {pre: S ≠ Λ} 
        {post: x · S = S♭} 

    query_empty_stack (S: stack) 
        Comment: Decide whether stack S is empty. 
        {pre: none} 
        {post: query_empty_stack = (S = Λ)} 
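
A minimal Python sketch of this specification follows, assuming a built-in list as the hidden representation with the top of the stack at the end of the list. The class and method names are illustrative only; pop returns the removed value instead of assigning it to a parameter x.

```python
class Stack:
    """ADT stack of D, with the top of the stack at the end of a Python list."""

    def __init__(self):
        # create_stack: the stack starts as the empty sequence.
        self._items = []

    def push(self, x):
        # post: the new stack is x followed by the old stack.
        self._items.append(x)

    def pop(self):
        # pre: the stack is nonempty.
        if not self._items:
            raise IndexError("pop violates the precondition: stack is empty")
        return self._items.pop()

    def is_empty(self):
        # query_empty_stack
        return not self._items
```

Checking the precondition at run time is a design choice of the sketch; the specification itself leaves the behavior undefined when the precondition fails.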

2. The following pseudocode specifies the ADT queue of D , where D is an arbitrary 
datatype. 

Domain 

    sequence of D: generic 

Primary Operations 

    create_queue (Q: queue) 
        Comment: Initialize Q as an empty queue. 
        {pre: none} 
        {post: Q = Λ} 

    enqueue (Q: queue, x: element of D) 
        Comment: Put x at the back of queue Q. 
        {pre: none} 
        {post: Q = Q♭ · x} 

    dequeue (Q: queue, x: element of D) 
        Comment: Delete front of Q; return as x. 
        {pre: Q ≠ Λ} 
        {post: Q♭ = x · Q} 

    query_empty_queue (Q: queue) 
        Comment: Decide whether queue Q is empty. 
        {pre: none} 
        {post: query_empty_queue = (Q = Λ)} 
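
The queue specification can be sketched in the same way. The version below assumes collections.deque from the Python standard library as the hidden representation, with the front of the queue at the left end; the class and method names are illustrative.

```python
from collections import deque


class Queue:
    """ADT queue of D, with the front at the left end of a deque."""

    def __init__(self):
        # create_queue: the queue starts as the empty sequence.
        self._items = deque()

    def enqueue(self, x):
        # post: the new queue is the old queue followed by x.
        self._items.append(x)

    def dequeue(self):
        # pre: the queue is nonempty.
        if not self._items:
            raise IndexError("dequeue violates the precondition: queue is empty")
        return self._items.popleft()

    def is_empty(self):
        # query_empty_queue
        return not self._items
```

Both append and popleft on a deque take constant time, so enqueue and dequeue do as well.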

3. The following figure illustrates the difference between stacking (last-in-first-out) and 
queueing (first-in-first-out). 

[Figure: box diagrams of the contents of a stack after each of the operations empty stack, 
push A, push B, push C, pop => C, pop => B, push D, and of a queue after each of the 
operations empty queue, enqueue A, enqueue B, enqueue C, dequeue => A, dequeue => B, 
enqueue D, ending with the queue C | D.] 

17.1.3 TWO-WAY SEQUENTIAL LISTS 

A two-way sequential list conceptualizes a linear list as having a current location, so 
that entries may be inserted or deleted only at the current location. 

Definitions: 

A two-way sequential list is a list with a designated location at which access is 
permitted. 

The current location of a two-way sequential list is the location at which access is 
permitted. 

The forepart of a two-way sequential list is the part preceding the current location, 
which is empty when the current location is at the start of the list. 

The aftpart of a two-way sequential list is the part following the current location, which 
is empty when the current location is at the finish of the list. 

Facts: 

1. A two-way sequential list does not maintain place-in-list numbers for the entries. 
Such an additional requirement would force the insert operation to renumber 
the part of the list following a newly inserted entry, which would slow performance. 

2. A two-way sequential list is easily implemented as a pair of stacks. 

Example: 

1. ADT seq_list of D 

Domain 

    2-tuple (fore: sequence of D, aft: sequence of D) 
    type D: generic 

Primary Operations 

    create_list (L: seq_list) 
        Comment: Initialize an empty list L. 
        {pre: none} 
        {post: fore(L) = Λ ∧ aft(L) = Λ} 

    reset_to_start (L: seq_list) 
        Comment: Reset to start of list. 
        {pre: none} 
        {post: fore(L) = Λ ∧ aft(L) = fore(L♭) ∘ aft(L♭)} 

    advance (L: seq_list) 
        Comment: Advance current position by one element. 
        {pre: aft(L) ≠ Λ} 
        {post: (∃x : D)[fore(L) = fore(L♭) · x ∧ aft(L♭) = x · aft(L)]} 

    query_atstart (L: seq_list) 
        {pre: none} 
        {post: query_atstart = (fore(L) = Λ)} 

    query_atfinish (L: seq_list) 
        {pre: none} 
        {post: query_atfinish = (aft(L) = Λ)} 

    insert (L: seq_list, x: element of D) 
        {pre: none} 
        {post: aft(L) = x · aft(L♭) ∧ fore(L) = fore(L♭)} 

    remove (L: seq_list, x: element of D) 
        {pre: aft(L) ≠ Λ} 
        {post: aft(L♭) = x · aft(L) ∧ fore(L) = fore(L♭)} 

    swap_right (L: seq_list, M: seq_list) 
        {pre: none} 
        {post: fore(L) = fore(L♭) ∧ fore(M) = fore(M♭) 
               ∧ aft(L) = aft(M♭) ∧ aft(M) = aft(L♭)} 
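
The fact that a two-way sequential list is easily implemented as a pair of stacks can be sketched as follows. The sketch assumes Python lists as the two stacks, with the top of each stack at the end of its list; the class name SeqList and the helper to_list are hypothetical, and swap_right is omitted for brevity.

```python
class SeqList:
    """Two-way sequential list stored as two stacks, fore and aft."""

    def __init__(self):
        self.fore = []  # top = the entry just before the current location
        self.aft = []   # top = the entry at the current location

    def reset_to_start(self):
        # Move the whole forepart onto the aftpart, preserving list order.
        while self.fore:
            self.aft.append(self.fore.pop())

    def advance(self):
        # pre: the aftpart is nonempty.
        if not self.aft:
            raise IndexError("advance requires a nonempty aftpart")
        self.fore.append(self.aft.pop())

    def at_start(self):
        return not self.fore

    def at_finish(self):
        return not self.aft

    def insert(self, x):
        # post: x becomes the first entry of the aftpart.
        self.aft.append(x)

    def remove(self):
        # pre: the aftpart is nonempty; returns the removed entry.
        if not self.aft:
            raise IndexError("remove requires a nonempty aftpart")
        return self.aft.pop()

    def to_list(self):
        # Hypothetical helper: the full list, forepart followed by aftpart.
        return self.fore + self.aft[::-1]
```

Every operation except reset_to_start touches only the tops of the two stacks, which is why no renumbering of entries is ever needed.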


17.1.4 DICTIONARIES AND RANDOM ACCESS LISTS 
Definitions: 

A keyed pair is a 2-tuple whose first entry, which is called a key, is from an ordered 
datatype and is used to access data in the second entry. 

A table is a set of keyed pairs such that no two keys are identical. 

A random access list is a table whose keys are consecutive integers. 

A dictionary is another name for a table. 

Facts: 

1. A static table (whose size does not change) can be implemented as an array. 

2. A dynamic table (which permits inserts and deletes) is often implemented as a binary 
search tree (see §17.2.4). 

3. Specifying a datatype as a dictionary means that its primary retrieval operation can 
execute in O(n) time. 

4. Specifying a datatype as a random access list means that its primary retrieval 
operation can execute in O(1) time. 

Examples: 

1. ADT table 

Domain 

    set of table_entry 
    type table_entry: 2-tuple (key: ordered, data: generic) 

Primary Operations 

    create_table (T: table) 
        {pre: none} 
        {post: T = ∅} 

    insert_entry (T: table, e: table_entry) 
        {pre: (∀e′ ∈ T)[key(e′) ≠ key(e)]} 
        {post: T = T♭ ∪ {e}} 

    remove_entry (T: table, e: table_entry) 
        {pre: e ∈ T} 
        {post: T = T♭ − {e}} 

    find_entry (T: table, k: key, found: boolean, e: table_entry) 
        {pre: none} 
        {post: ((∃e′ ∈ T)[(e′.key = k) ∧ (e = e′)] ∧ (found = true)) 
               ∨ (¬(∃e′ ∈ T)[e′.key = k] ∧ (found = false))} 
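
A short Python sketch of the table ADT is given below. It assumes a built-in dict as the representation, which is a hash table and therefore does not use the ordering on keys that the specification allows for; the class name Table is hypothetical, and find_entry returns the pair (found, e) in place of the two output parameters.

```python
class Table:
    """ADT table: a set of (key, data) pairs with pairwise distinct keys."""

    def __init__(self):
        # create_table: the table starts empty.
        self._entries = {}

    def insert_entry(self, key, data):
        # pre: no entry of the table already has this key.
        if key in self._entries:
            raise KeyError(f"insert violates the precondition: duplicate key {key!r}")
        self._entries[key] = data

    def remove_entry(self, key):
        # pre: an entry with this key is in the table.
        if key not in self._entries:
            raise KeyError(f"remove violates the precondition: no key {key!r}")
        del self._entries[key]

    def find_entry(self, key):
        # post: (True, entry) if some entry has this key, else (False, None).
        if key in self._entries:
            return True, (key, self._entries[key])
        return False, None
```

A random access list could reuse the same interface with integer keys drawn from a fixed subrange.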

2. ADT Random access list 

Domain 

set of table_entry 

type table_entry: 2-tuple (key: subrange of integers, data: generic) 

Primary Operations 

Exactly the same as for the ADT table. 


17.1.5 PRIORITY QUEUES 

A priority queue is an “unfair queue”, in which entries are not dequeued on a first- 
enqueued basis. Instead, each entry has a priority, and is dequeued on a highest priority 
basis. 

Definition: 

A priority queue is a set of keyed pairs, such that the key of the entry returned by a 
dequeue operation is not exceeded by the key of any other entry currently in the queue. 

Facts: 


1. A priority queue is usually implemented as a heap (see §17.2.5). 

2. Two different entries in a priority queue may have the same key. 

3. The operating system for a multi-user programming environment places computa- 
tional tasks into a priority queue. 

Example: 

1. ADT P_queue 

Domain 

    set of Pq_entry 
    type Pq_entry: 2-tuple (key: ordered, data: generic) 

Primary Operations 

    create_Pq (PQ: P_queue) 
        {pre: none} 
        {post: PQ = ∅} 

    enPqueue (PQ: P_queue, e: Pq_entry) 
        {pre: none} 
        {post: PQ = PQ♭ ∪ {e}} 

    dePqueue (PQ: P_queue, e: Pq_entry) 
        {pre: PQ ≠ ∅} 
        {post: (e ∈ PQ♭) ∧ (∀e′ ∈ PQ)[key(e) ≥ key(e′)] ∧ PQ = PQ♭ − {e}} 

    query_empty_Pqueue (PQ: P_queue) 
        {pre: none} 
        {post: query_empty_Pqueue = (PQ = ∅)} 
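
The following sketch realizes a priority queue with the heapq module of the Python standard library. It follows the definition above, so that the entry dequeued is one whose key is not exceeded by any other key. Its assumptions: keys are numbers (so that they can be negated, since heapq provides a min-heap), entries with equal keys are dequeued in order of insertion, and the class and method names are illustrative.

```python
import heapq
import itertools


class PriorityQueue:
    """Priority queue that dequeues an entry of largest key."""

    def __init__(self):
        self._heap = []
        # Insertion counter: breaks ties so that data objects are never compared.
        self._counter = itertools.count()

    def enqueue(self, key, data):
        # heapq is a min-heap, so the key is negated to bring the largest to the top.
        heapq.heappush(self._heap, (-key, next(self._counter), data))

    def dequeue(self):
        # pre: the priority queue is nonempty.
        if not self._heap:
            raise IndexError("dequeue violates the precondition: queue is empty")
        neg_key, _, data = heapq.heappop(self._heap)
        return -neg_key, data

    def is_empty(self):
        return not self._heap
```

Two different entries may carry the same key; the counter decides between them without affecting the postcondition.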


17.2 CONCRETE DATA STRUCTURES 

Concrete data structures configure computer memory into containers of related informa- 
tion. They are used to implement abstract datatypes. Contiguous stretches of memory 
are regarded as arrays, and noncontiguous portions are linked with pointers. 


17.2.1 MODELING COMPUTER STORAGE AND RETRIEVAL 

There are a few generic concepts common to nearly all concrete data structures. 

Definitions: 

A concrete data structure is a mathematical model for storing the current value of 
a structured variable in computer memory. 

A cell in a concrete data structure S is a unit within the data structure that may 
contain data and pointers to other cells. 

The header of a concrete data structure is a special unit that contains current informa- 
tion about the entire configuration and pointers to some critical cells in the structure. 
It is not a cell. 

An insert operation insert(S: structure, c: cell, loc: location) inserts a new cell c into 
structure S at location loc. 

A delete operation delete(S: structure, loc: location) deletes from a structure S the 
cell at location loc. 

A target predicate for a concrete data structure is a predicate that applies to the 
cells. 

A find operation find(S: structure, t: target, loc: location) searches a structure S for a 
cell that satisfies target predicate t. It returns false if there is no such cell. In addition 
to returning the boolean value true if there is such a cell, it also assigns to its location 
parameter loc the location of such a cell. 

A next operation next(S: structure, loc: location) returns the boolean value true if 
the structure S is nonempty, in which case it also assigns to its location parameter loc 
the location of whatever cell it regards as next; it returns false if S is empty. 

The size of a cell is the number of bytes of computer memory it occupies. 

A pointer to a cell is a representation of its location in computer memory. 

A null pointer is a pointer that points to an artificial location. Detecting a null pointer 
is a signal to an algorithm to react somewhat differently than to a pointer to an actual 
location. 

Facts: 

1. There may be several alternative suitable concrete data structures that can be used 
to implement a given abstract datatype. 

2. If the records of a database are all of the same fixed size, then the records themselves 
may be in the cells of a concrete data structure. 

3. If the size of records is variable, then the cells of the concrete data structure often 
contain pointers to the actual data, rather than the data itself. This permits faster 
execution of operations. 

4. The most common form of target predicate for a concrete data structure is an 
assertion that a key component of the cell matches some designated value. 


17.2.2 ARRAYS AND LINKED LISTS 
Definitions: 

An array is an indexed sequence of identically structured cells (a_j | j = d, . . . , u), with 
consecutive indices. 

An array is zero-based if its lowest index is zero. 

A one-way linked list is a set of cells, each with one pointer, such that: 

• exactly one of these cells is pointed to by the header but by no cell; 

• exactly one cell contains a null pointer; 

• the sequence of cells formed by following the pointers, starting from the header, 

traverses the entire set, ending with the cell containing the null pointer. 

The far end of a one-way linked list is the cell that contains a null pointer. 

The near end of a one-way linked list is the cell that is pointed to by the header 
and by no other cell. 

A two-way linked list is a set of cells, each with two pointers, one designated as its 
forward pointer and the other as its backward pointer, plus a header with a forward 
pointer and a backward pointer, such that: 

• considering only the forward pointers, it is a one-way linked list; 

• following the sequence of backward pointers yields the reverse of the sequence 
obtained by following the forward pointers. 

A sparse sequence is a sequence in which nearly all the entries are zeros. 

A circular linked list is a set of cells, each with two pointers, one designated as its 
forward pointer and the other as its backward pointer, plus a header with one or more 
pointers to current cells, such that: 

• the sequence of cells formed by following the forward pointers, starting from any 
cell, traverses the entire set and returns to the starting cell; 

• the sequence of cells formed by following the backward pointers, starting from 
any cell, traverses the entire set and returns to the starting cell. 


Facts: 

1. A random-access list (§17.1.4) can be implemented as an array so that a find 
operation executes in O(1) time. 

2. A stack (§17.1.2) can be implemented as a one-way linked list with its top at the 
near end, so that push and pop both execute in O(1) time. 

3. A queue (§17.1.2) can be implemented as a two-way linked list with its back at the 
near end of the forward list and its front at the far end, so that enqueue and dequeue 
both execute in O(1) time. 

4. A two-way sequential list (§17.1.3) can be implemented as a two-way linked list, or 
as a pair of one-way linked lists. 
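
The linked-list implementation of a stack can be sketched as follows. The sketch assumes that Python's None plays the role of the null pointer and that the header is a single attribute pointing to the near end; the names Cell and LinkedStack are hypothetical.

```python
class Cell:
    """A cell of a one-way linked list: some data and one pointer."""

    def __init__(self, data, next_cell=None):
        self.data = data
        self.next = next_cell  # None plays the role of the null pointer


class LinkedStack:
    """Stack whose top is the near end of a one-way linked list."""

    def __init__(self):
        self.head = None  # header pointer; None means the list is empty

    def push(self, x):
        # The new cell points to the old near end and becomes the near end: O(1).
        self.head = Cell(x, self.head)

    def pop(self):
        # Unlink the near-end cell and return its data: O(1).
        if self.head is None:
            raise IndexError("pop from empty stack")
        cell = self.head
        self.head = cell.next
        return cell.data

    def is_empty(self):
        return self.head is None
```

Neither operation traverses the list, which is what gives both of them constant running time.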


Examples: 

1. The following figure illustrates an array with cells a_d, . . . , a_u. 

[Figure: header root A and a row of cells a_d, a_{d+1}, . . . , a_{u−1}, a_u.] 

2. The following figure illustrates a one-way linked list, with cell a_d at the near end 
and cell a_u at the far end. 

[Figure: header root A and a chain of cells a_d, a_{d+1}, . . . , a_{u−1}, a_u.] 

3. The following figure illustrates a two-way linked list. 

[Figure: header root A and a chain of cells a_d, a_{d+1}, . . . linked in both directions.] 

Algorithm 1: BSTsearch(T, t). 

input: a binary-search tree T and a target key t 
output: if t ∈ T, the address of the vertex with key t, else the address where t 
could be inserted 

if root(T) = NULL then return address of root 
else if t < key(root) then return BSTsearch(leftsubtree(T), t) 
else if t = key(root) then return address of root 
else return BSTsearch(rightsubtree(T), t) 


4. Representing a sparse finite sequence by a linked list can save space. The cell given 
to each nonzero entry includes its position in the sequence and points to the cell with 
the next nonzero entry. 

5. A queue whose maximum length is bounded can be represented by a circular linked 
list with two header pointers, one to the back and one to the front of the queue. The 
number of cells equals the maximum queue length. This eliminates the need for “garbage 
collection”. 
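
The bounded queue of Example 5 might be sketched as below. The sketch makes several assumptions beyond the text: cells carry only a forward pointer, the header additionally stores a count so that a full queue can be told apart from an empty one, and the names Cell and BoundedQueue are hypothetical.

```python
class Cell:
    """A cell of the circular list: some data and a forward pointer."""

    def __init__(self):
        self.data = None
        self.next = None


class BoundedQueue:
    """Queue of bounded length stored in a fixed ring of cells."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be positive")
        cells = [Cell() for _ in range(capacity)]
        for i, cell in enumerate(cells):
            cell.next = cells[(i + 1) % capacity]  # the last cell links back to the first
        self.front = cells[0]  # cell holding the oldest entry
        self.back = cells[0]   # cell that receives the next entry
        self.count = 0
        self.capacity = capacity

    def enqueue(self, x):
        if self.count == self.capacity:
            raise OverflowError("queue is full")
        self.back.data = x
        self.back = self.back.next
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue is empty")
        x = self.front.data
        self.front.data = None
        self.front = self.front.next
        self.count -= 1
        return x
```

All cells are allocated once, when the queue is created, and are reused as the two header pointers move around the ring.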


17.2.3 BINARY SEARCH TREES 
Definitions: 

A tree structure is a concrete data structure such that the header points to a single 
cell, and such that from that cell to each other cell, there is a single chain of pointers. 

The root cell of a tree structure is the cell to which the header points. 

A binary tree structure is a tree structure such that each cell has two pointers. 

The left child of a cell in a binary tree structure is the cell to which the first pointer 
points. 

The right child of a cell in a binary tree structure is the cell to which the second 
pointer points. 

A binary search tree structure is a binary tree structure in which for every cell, all 
cells accessible through the left child have lower keys, and all cells accessible through 
the right child have higher keys. 

Facts: 

1. The ADT table is commonly implemented as a binary search tree structure. 

2. The average running time for the ADT table operations of insertion, deletion, and 
find is O(log n). The time may be worse if relatively few cells have two children. 

3. Using a 2-3 tree structure instead of a binary search tree structure for the ADT 
table operations reduces the worst case running time from O(n) to O(log n). 

4. Algorithm 1 can be used in the binary search tree operations of finding, inserting, 
and deleting. 

5. To find a target key t in the binary search tree T, first apply BSTsearch(T, t). If
the address of a null pointer is returned, then there is no node with key t. Otherwise,
the address returned is that of a node with key t.

6. To insert a node with target key t into the binary search tree T, first apply the 
algorithm BSTsearch(T, t). Then install the node at the location returned. 

7. To delete node t from the binary search tree T, first apply BSTsearch(T, t). Then 
replace node t either by the node with the largest key in the left subtree or by the node 
with the smallest key in the right subtree. 
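
The search and insertion facts above can be sketched in Python. This is only an illustration under stated assumptions: Python has no pointer addresses, so the hypothetical helper `bst_search` returns a `(parent, node)` pair in place of the address that Algorithm 1 returns, and the names `Node`, `bst_insert`, and `inorder` are not from the text.

```python
class Node:
    """One cell of a binary search tree structure (hypothetical helper class)."""
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def bst_search(root, t):
    """Return (parent, node). node holds key t, or is None at the spot
    where t could be inserted below parent."""
    parent, node = None, root
    while node is not None and node.key != t:
        parent = node
        node = node.left if t < node.key else node.right
    return parent, node

def bst_insert(root, t):
    """Insert key t (if absent) and return the possibly new root."""
    parent, node = bst_search(root, t)
    if node is not None:      # key already present, nothing to install
        return root
    new = Node(t)
    if parent is None:        # the tree was empty
        return new
    if t < parent.key:
        parent.left = new
    else:
        parent.right = new
    return root

def inorder(node):
    """Keys in increasing order; a quick check of the search-tree property."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)
```

An inorder traversal of the result lists the keys in sorted order, which is a convenient sanity check after a sequence of insertions.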

Examples: 

1. The following figure illustrates a binary search tree. 



2. Inserting 32 into the BST of Example 1 yields the following BST. 



3. Deleting node 10 from the BST of Example 1 would yield one of the following two 
BSTs. 




17.2.4 PRIORITY TREES AND HEAPS 
Definitions: 

A binary tree is left-complete if it is complete or if it is a balanced binary tree (§9.1.2) 
such that at depth one less than the maximum, the following conditions hold: 

• all nodes with two children are to the left of all nodes with one or no children; 

• all nodes with no children are to the right of all nodes with one or two children; 

• there is at most one node with only one child, which must be a left-child. 

Algorithm 2: PriorityTreeEnqueue(T, x).
input: a priority tree T and a new entry x

output: tree T with the new vertex x inserted so that it remains a priority tree

install entry x into the first vacant spot in the left-complete tree T
while x ≠ root(T) and priority(x) > priority(parent(x))
    swap x with parent(x)


A priority tree is a left-complete binary tree, with the following additional structure: 

• each vertex has an attribute called a key; 

• the values of the keys are drawn from a partially ordered set; 

• no vertex has a higher priority key than its parent. 

A heap is a representation of a priority tree as a zero-based array, such that each 
vertex is represented at the location in the array whose index equals its location in the 
breadth-first-search order of the tree. Thus: 

• index(root) = 0;

• index(leftchild(v)) = 2 × index(v) + 1;

• index(rightchild(v)) = 2 × index(v) + 2;

• index(parent(v)) = ⌊(index(v) − 1)/2⌋.

Enheaping an entry into a heap means placing it into a correctly prioritized position.
Trickle-up means enheaping by Algorithm 2.

Deheaping an entry from a heap means taking the root as the deheaped entry and 
patching its left subtree and its right subtree back into a single tree. 

Trickle-down means deheaping by Algorithm 3. 

A Fibonacci heap is a modification of a heap, using the Fibonacci sequence, that 
permits more efficient implementation of a priority queue than a heap based on a left- 
complete binary tree. 

Facts: 

1. Worst-case execution time of the priority tree enqueueing algorithm, Algorithm 2,
is in the class Θ(log n).

2. Worst-case execution time of the priority tree dequeueing algorithm, Algorithm 3,
is in the class Θ(log n).
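
As a rough illustration of the index arithmetic and of trickle-up and trickle-down, here is a minimal Python sketch of a heap stored in a zero-based list. The function names `enheap` and `deheap` are chosen for this example, and a larger key is assumed to mean a higher priority.

```python
def enheap(heap, x):
    """Append x in the first vacant spot, then trickle it up (cf. Algorithm 2)."""
    heap.append(x)
    i = len(heap) - 1
    while i > 0 and heap[i] > heap[(i - 1) // 2]:
        parent = (i - 1) // 2
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent

def deheap(heap):
    """Remove and return the root, then trickle the moved entry down (cf. Algorithm 3)."""
    top = heap[0]
    last = heap.pop()              # rightmost entry at the bottom level
    if heap:
        heap[0] = last
        i, n = 0, len(heap)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            largest = i
            if left < n and heap[left] > heap[largest]:
                largest = left
            if right < n and heap[right] > heap[largest]:
                largest = right
            if largest == i:       # priority property restored
                break
            heap[i], heap[largest] = heap[largest], heap[i]
            i = largest
    return top
```

Each loop follows a single root-to-leaf path, so the number of swaps is bounded by the height of the left-complete tree, in line with the logarithmic bounds stated in the Facts.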

Examples: 

1. This is a left-complete binary tree. 

Algorithm 3: PriorityTreeDequeue(T).
input: a priority tree T

output: tree T − root(T) with priority-tree shape restored

replace root(T) by rightmost entry y at bottom level of T
while y is not a leaf and [priority(y) < priority(leftchild(y)) or
        priority(y) < priority(rightchild(y))]
    if priority(leftchild(y)) > priority(rightchild(y))
        then swap y with leftchild(y)
        else swap y with rightchild(y)


2. The following is a priority tree of height 3. 



3. The following figure illustrates a priority tree insertion. It shows how 45 is inserted
into the priority tree of Example 2 in the correct location to maintain the left-complete
binary tree shape and then rises until the priority property is restored.



4. The following figure illustrates a priority tree deletion. It shows how the left- 
complete binary tree shape and priority property are restored after the root is removed 
from the priority tree of Example 2. 

5. The following heap corresponds to the priority tree of Example 2. Observe that the
keys occur in the array according to the breadth-first-search order of their vertices.

index    0   1   2   3   4   5   6   7   8   9
key     47  42  16  28  36  11  10   4  14  32


17.2.5 NETWORK INCIDENCE STRUCTURES 


Definitions: 

An incidence matrix for a graph is a 0-1 matrix that specifies the incidence relation. 
The rows are indexed by the vertices and the columns by the edges. The entry in the 
row corresponding to vertex v and edge e is 1 if v is an endpoint of e, and 0 otherwise. 

An endpoint table for a graph (§8.1) is a dictionary whose keys are the edges. The 
data component for each key edge is the set of endpoints for that edge. If an edge is 
directed, then its endpoints are marked as head and tail. 

An edge-incidence table is a dictionary whose keys are the vertices of a graph or 
digraph. The data component for each key vertex is a list of all the edges that are 
incident on that vertex. Each self-loop occurs twice in the list. 

A two-way incidence structure for a graph is a pair consisting of an edge-incidence 
table and an endpoint table. 

Facts: 

1. The time required to insert a new vertex v into a two-way incidence structure for a
graph with n vertices and m edges is in Θ(log n). By way of contrast, the time for an
incidence matrix is in Θ(n · m).

2. The time required to delete a vertex v from a two-way incidence structure for a
graph with n vertices and m edges is in Θ(log n + deg(v)). By way of contrast, the time
for an incidence matrix is in Θ(n · m).

3. The time required to insert a new edge e into a two-way incidence structure for a
graph with n vertices and m edges is in Θ(log m). By way of contrast, the time for an
incidence matrix is in Θ(m · n).

4 . An edge-incidence table can represent an imbedding of a graph on a surface as a 
rotation system (§8.8.3). 

Example:

1. The following graph corresponds to the network incidence structure given below.

[Figure: a graph on the vertices u, v, w, x, y with edges a through h]

EDGE-INCIDENCE TABLE

u. a b d
v. b c e f
w. a c g
x. d e h
y. f g h

ENDPOINT TABLE

a. u w
b. u v
c. v w
d. u x
e. v x
f. v y
g. w y
h. x y
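
The two tables of this example can be modeled with ordinary Python dictionaries. The sketch below is illustrative only: the names `endpoint_table`, `build_edge_incidence`, and `neighbors` are chosen here, and Python's hash-based `dict` gives expected constant-time lookups rather than the logarithmic bounds that the Facts state for a tree-based dictionary.

```python
# Endpoint table of the example: each edge maps to its pair of endpoints.
endpoint_table = {
    "a": ("u", "w"), "b": ("u", "v"), "c": ("v", "w"), "d": ("u", "x"),
    "e": ("v", "x"), "f": ("v", "y"), "g": ("w", "y"), "h": ("x", "y"),
}

def build_edge_incidence(endpoints):
    """Derive the edge-incidence table (vertex -> list of incident edges)."""
    table = {}
    for edge, (p, q) in endpoints.items():
        table.setdefault(p, []).append(edge)
        table.setdefault(q, []).append(edge)   # a self-loop (p == q) is listed twice
    return table

edge_incidence = build_edge_incidence(endpoint_table)

def neighbors(v):
    """Vertices adjacent to v, found by combining both tables."""
    result = []
    for edge in edge_incidence[v]:
        p, q = endpoint_table[edge]
        result.append(q if p == v else p)
    return result
```

Looking up a vertex in the edge-incidence table and then each incident edge in the endpoint table visits only deg(v) entries, which is the reason vertex deletion depends on deg(v) rather than on the size of the whole graph.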


17.3 SORTING AND SEARCHING 

Since commercial data processing involves frequent sorting and searching of large quanti- 
ties of data, efficient sorting and searching algorithms are of great practical importance. 
Sorting and searching strategies are also of fundamental theoretical importance, since 
sorting and searching steps occur in many algorithms. Table 1 compares the perfor- 
mance of some of the most common sorting methods. 


17.3.1 GENERIC CONCEPTS FOR SORTING AND SEARCHING 
Definitions: 

A database is a set of entries, stored in a computer as an information structure. 

An entry in a database is a 2-tuple whose first component is a key and whose second 
component is some data. 

A key in a database entry is a value from an ordered set, used to store and retrieve 
data. 

The key domain is the ordered set from which keys are drawn. 

The primary key is the component of highest precedence, when the key for database 
records has more than one component. 

The secondary key is the component of next highest precedence after the primary 
key. 

A record is another name for a database entry. 

Table 1 Comparison of sorting methods.

sorting method                 average time factors        comments
(grouped by type)

expanding a sorted subsequence
  selection sort               ≈ N²/2 comparisons
                               ≈ N exchanges
  insertion sort               ≈ N²/4 comparisons          linear if input file is
                               ∝ N² exchanges              “almost sorted”
  Shellsort                    < N^(3/2) comparisons       for “good” increments
                                                           1, 4, 13, 40, 121, ...

exchanging out-of-order pairs
  bubblesort                   ≈ N²/2 comparisons          one pass if input file
                               ∝ N² exchanges              is already sorted
  sinking sort                 ≈ N²/2 comparisons          one pass if input file
                               ∝ N² exchanges              is already sorted
  shakersort                   ≈ N²/2 comparisons
                               ∝ N² exchanges
  heapsort                     < 2N lg N comparisons       always Θ(N log N)

divide-and-conquer
  mergesort                    ≈ N lg N comparisons        always Θ(N log N)
  quicksort                    ≈ 2N lg N comparisons       worst-case Θ(N²)

sorting by distribution
  rank counting                Θ(N)
  radix sort on k-digit key    ≈ N lg N comparisons


Sorting is the process of arranging a collection of database entries into a sequence that 
conforms to the order of their keys. 

Searching a database means using a systematic procedure to find an entry with a 
key designated as the objective of the search. 

Scanning a database (or a portion of a database) means examining every record in 
that database (or portion). 

The target of a database search is an entry whose key has been designated as the
objective of the search.

A comparison sort is any sorting method that uses only comparisons of keys. 

An internal sorting method keeps all the entries in the primary memory of the 
computer during the process of rearrangement. 

An external sorting method uses external storage outside the main memory during 
the sorting process. 

An in-place realization of a sorting method uses, beyond the space needed for one
copy of each data entry, only a constant amount of additional space, regardless of the
size of the list to be sorted.

A dynamic structure for a database is an information structure whose configuration 
may change during an algorithmic process, for instance, by the insertion or deletion of 
elements. 

A static structure for a database is a data structure whose configuration does not 
change during an algorithmic process. 


Facts: 

1. Several different general strategies for sorting are given in the following subsections. 
Each leads to more than one method for sorting. 

2. Some elementary sorting methods take Θ(n²) time. Most practical comparison
sorting methods require Θ(n log n) time.

3. The worst case running time of any comparison sort is at least Ω(n log n).

Examples: 

1. Selection sort (§17.3.2), insertion sort (§17.3.2), Shellsort (§17.3.2), bubblesort
(§17.3.3), heapsort (§17.3.3), and quicksort (§17.3.4) are all internal comparison sorts.

2. Mergesort (§17.3.4) is a comparison sort that may be either internal or external. 

3. Database model for a telephone directory: Each entry has as key the name of a
person and as data that person's telephone number. The target of a search is the entry
for a person whose number one wishes to call. Names of persons form an ordered key
domain under a modified lexicographic (“alphabetic”) ordering, in which it is understood
that a family name (a “last name” in European-based cultures) has higher precedence
than a given name.

4 . Database model for a reverse telephone directory: In a reverse telephone directory 
entry the key is a telephone number and the data is the name of the person with that 
number. This permits the telephone company to retrieve the name of the person who 
has a particular phone number, for instance, if someone inquires why some particular 
telephone number occurs on a long-distance phone bill. 

5 . Database model for credit-card information: In a credit-card database, the key to 
each entry is a credit-card number, and the data include the name of the cardholder, 
the maximum credit limit, and the present balance. 


17.3.2 SORTING BY EXPANDING A SORTED SUBSEQUENCE 

One general strategy for sorting iteratively expands a sorted subsequence, most often 
implemented as an array or a linked list, until the expanded subsequence ultimately 
contains all the entries of the database. 

Definitions: 

A selection sort iteratively transforms an unsorted input sequence into a sorted output 
sequence. At each iteration, it selects the item with smallest key from the remaining 
items in the unsorted input sequence and appends that item at the end of the sorted 
output sequence. 

An insertion sort iteratively transforms an unsorted input sequence into a sorted
output sequence. At each iteration, it takes the first remaining item from the unsorted
input subsequence and inserts it into its proper position in the sorted output sequence.

A Shellsort of an unsorted sequence a_1, ..., a_n is based on a list of increments of
decreasing size: h_1 > h_2 > ⋯ > h_t = 1. On the kth iteration, the sequence is partitioned
into h_k subsequences, such that for j = 1, ..., h_k, the jth subsequence is

    (a_{j + r·h_k} | 0 ≤ r ≤ (n − j)/h_k)

and each of these h_k subsequences is sorted by an insertion sort.

Facts: 

1. Selection sorts and insertion sorts both have time-complexity Θ(n²) in the worst
case.

2. The time-complexity of a selection sort is independent of the order of the input
sequence, since finding the smallest remaining item requires scanning all the remaining
items.

3. The running time of an insertion sort can be significantly reduced for “almost sorted”
sequences, with time Θ(n) as the limiting case.

4. Optimizing the running time of a Shellsort involves some very difficult mathematical 
problems, many of which have not yet been solved. In particular, it is not known which 
choice of increments yields the best result. 

5. It is known that Shellsort increments should not be multiples of each other, if the 
objective is to achieve fast execution. 

6. Evidence supporting the efficiency of the Shellsort increment list ..., 40, 13, 4, 1
according to the rule h_{i−1} = 3h_i + 1 is given by Knuth [Kn73]. The increment list
..., 15, 7, 3, 1 satisfying the rule h_{i−1} = 2h_i + 1 is also recommended.

7. Shellsort is a refinement of a straight insertion sort. The motivation for its design 
in 1959 by D. L. Shell is based on the observation that an insertion sort works very fast 
for “almost sorted” sequences. 

8. Shellsort is guaranteed to produce a sorted list, because on the last pass, it applies 
an insertion sort to the whole sequence. 

9. An in-place realization of the strategy of expanding a sorted subsequence conceptually
partitions the array into a sorted subsequence at the front of the array A[1..n]
and an unsorted subsequence of remaining items at the back. Initially, the sorted
subsequence is the empty sequence and the unsorted subsequence is the whole list. At each
step of the iteration, the sorted front part expands by one item and the unsorted back
part contracts by one item.

Algorithms: 

1. Algorithm 1 is an in-place realization of a selection sort. 

2. Algorithm 2 is an in-place realization of an insertion sort. 
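
For readers who prefer executable code, the two in-place sorts can be sketched in Python as follows. The sketch assumes zero-based lists, so the index ranges are shifted by one relative to the pseudocode's A[1..n]; the function names are chosen for this example.

```python
def selection_sort(a):
    """In-place selection sort: grow a sorted prefix by selecting the minimum."""
    n = len(a)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):       # scan all remaining items
            if a[j] < a[min_index]:
                min_index = j
        a[i], a[min_index] = a[min_index], a[i]
    return a

def insertion_sort(a):
    """In-place insertion sort: insert the next item into the sorted prefix."""
    for i in range(1, len(a)):
        next_key = a[i]
        j = i - 1
        while j >= 0 and a[j] > next_key:
            a[j + 1] = a[j]             # shift a larger item one place right
            j -= 1
        a[j + 1] = next_key
    return a
```

The inner loop of `selection_sort` always runs to the end of the list, while the inner loop of `insertion_sort` stops as soon as it meets a smaller key, which is why only the latter speeds up on almost sorted input.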

Examples: 

In the following examples of single-list implementations of SelectionSort and
InsertionSort, the symbol “|” separates the sorted subsequence at the front from the remaining
unsorted subsequence at the back. The arrows “←” and “→” indicate how far the
index j moves during an iteration.

Algorithm 1: SelectionSort of array A[1..n].

for i := 1 to n − 1 do
    minindex := i; minkey := A[i]
    for j := i + 1 to n do
        if A[j] < minkey then minindex := j; minkey := A[j] end-if
    swap A[i] with A[minindex]


Algorithm 2: InsertionSort of array A[1..n].

for i := 2 to n do
    nextkey := A[i]; j := i − 1
    while j > 0 and A[j] > nextkey do A[j + 1] := A[j]; j := j − 1 end-while
    A[j + 1] := nextkey


1. On the sequence 15, 8, 10, 6, 13, 17, SelectionSort would progress as follows:

    | 15 8 10 6 13 17        minkey = 6
    6 | 8 10 15 13 17        minkey = 8
    6 8 | 10 15 13 17        minkey = 10
    6 8 10 | 15 13 17        minkey = 13
    6 8 10 13 | 15 17        minkey = 15
    6 8 10 13 15 17 |


2. On the sequence 15, 8, 10, 6, 13, 17, InsertionSort would progress as follows:

    15 | 8 10 6 13 17        one shift
    8 15 | 10 6 13 17        one shift
    8 10 15 | 6 13 17        three shifts
    6 8 10 15 | 13 17        one shift
    6 8 10 13 15 | 17        no shift
    6 8 10 13 15 17 |


3. If n = 13 and h_3 = 4, then on the third iteration, Shellsort would insertion sort the
following subsequences:

    a_1, a_5, a_9, a_13
    a_2, a_6, a_10
    a_3, a_7, a_11
    a_4, a_8, a_12
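
A compact Python sketch of Shellsort with the increments ..., 40, 13, 4, 1 is given below. It assumes zero-based lists, and the helper name `shell_increments` and the rule for choosing the largest increment (the largest value of the form 3h + 1 below the list length) are choices made for this illustration, not prescriptions from the text.

```python
def shell_increments(n):
    """Increments 1, 4, 13, 40, ... (h := 3h + 1) below n, largest first."""
    incs = [1]
    while 3 * incs[-1] + 1 < n:
        incs.append(3 * incs[-1] + 1)
    return incs[::-1]

def shellsort(a):
    """Insertion-sort every h-spaced subsequence, for decreasing increments h."""
    for h in shell_increments(len(a)):
        for i in range(h, len(a)):
            key = a[i]
            j = i - h
            while j >= 0 and a[j] > key:
                a[j + h] = a[j]         # shift within the h-spaced subsequence
                j -= h
            a[j + h] = key
    return a
```

The final pass uses h = 1 and is therefore a plain insertion sort, which is what guarantees a sorted result; the earlier passes only serve to leave the sequence almost sorted.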


17.3.3 SORTING BY EXCHANGING OUT-OF-ORDER PAIRS 

A standard measure of the totality of disorder of a sequence of n items is the number
of pairs (a_i, a_j) such that i < j but a_i > a_j. Thus, the disorder ranges from 0 (i.e.,
totally ordered) to C(n, 2) = n(n − 1)/2 (i.e., in reverse order). The strategy of exchange
sorts is to swap out-of-order pairs until all pairs are in order.
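
The disorder measure can be computed directly from its definition. The following Python sketch (the function name `disorder` is chosen for this illustration) counts the out-of-order pairs by brute force in quadratic time, which is adequate for small examples.

```python
def disorder(seq):
    """Number of pairs (i, j) with i < j but seq[i] > seq[j]."""
    n = len(seq)
    return sum(1 for i in range(n) for j in range(i + 1, n) if seq[i] > seq[j])
```

For instance, the list 15, 8, 10, 6, 17, 13 has disorder 7, and a reversed list of six distinct keys has disorder C(6, 2) = 15. Swapping an adjacent out-of-order pair lowers the disorder by exactly one.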

Definitions:

A bubblesort scans an array A[1..n] repeatedly from the highest index to lower indices,
each time swapping every out-of-order pair of consecutive items that is encountered.

A sinking sort scans an array A[1..n] repeatedly from the lowest index to higher indices,
each time swapping every out-of-order pair of consecutive items that is encountered.

A shakersort scans an array A[1..n] repeatedly, and alternates between bubbling
upward and sinking downward on alternate scans.

A bubblesort, sinking sort, or shakersort is adaptive if it stops the first time a scan
produces no transpositions.

Heapsort sorts a sequence of entries by iteratively enheaping them all into a heap
(§17.2.4) and then iteratively deheaping them all. The order in which they deheap is
sorted.

Facts: 

1. The name “bubblesort” suggests imagery in which lighter items (i.e., earlier in the
prescribed order of the key domain) bubble to the top of the list.

2. The name “sinking sort” suggests that heavier items sink to the bottom.

3. The name “shakersort” suggests a salt shaker that is turned upside down.

4. Since each swap during an exchange sort reduces the total disorder, it follows that
each scan brings the list closer to perfect order. By transitivity of the order relation,
it follows that if every consecutive pair in a sequence is in the correct order, then the
entire sequence is in order.

5. After the first pass of a bubblesort from bottom to top, the smallest element is
certain to be in its correct final position at the beginning of the list. After the second
pass, the second smallest element must be in its correct position, and so on.

6. Bubblesort has worst-case time complexity Θ(n²).

7. For “almost sorted” sequences, an adaptive bubblesort can run much faster than
Θ(n²) time.

8. The priority property implies that the root of a priority tree is assigned the data
entry with first precedence.

9. Whereas a sequence of length n has C(n, 2) = n(n − 1)/2 pairs that might be out of
order, a binary tree of n elements has at most n log n pairs that could be out of order,
if one compares only those pairs such that one node is an ancestor of the other.

10. Heapsort improves upon the idea of bubblesort because it bubbles only along tree
paths between a bottom node and the root, instead of along the much longer path in a
linear sequence from a last item to the first.

11. Heapsort runs in O(n log n) time.

12. Heapsort was invented by J. W. J. Williams in 1964.

Algorithms: 

1. Algorithm 3 is an adaptive version of bubblesort. 

2. Algorithm 4 is a heapsort algorithm. 
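
The two algorithms can be sketched in Python as follows, assuming zero-based lists. The bubblesort follows Algorithm 3 closely. The heapsort is a common in-place variant that swaps the root to the end of the list instead of copying into a second array B as Algorithm 4 does; the names `sift_down` and `heap_sort` are chosen for this illustration.

```python
def bubble_sort(a):
    """Adaptive bubblesort: scan from the back, stop after a pass with no swap."""
    first = 0
    exchange = True
    while exchange:
        exchange = False
        for i in range(len(a) - 1, first, -1):
            if a[i] < a[i - 1]:
                a[i], a[i - 1] = a[i - 1], a[i]
                exchange = True
        first += 1                     # a[:first] is now in final position
    return a

def sift_down(a, i, n):
    """Restore the priority property below index i within a[:n]."""
    while True:
        k = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[k]:
                k = child
        if k == i:
            return
        a[i], a[k] = a[k], a[i]
        i = k

def heap_sort(a):
    """In-place heapsort: build a heap, then repeatedly deheap the maximum."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # buildheap
        sift_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]        # move current maximum to its place
        sift_down(a, 0, end)
    return a
```

On an already sorted list `bubble_sort` makes a single pass, whereas `heap_sort` does the same amount of work regardless of the input order.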

Algorithm 3: BubbleSort of array A[1..n].

first := 1; last := n; exchange := true
while exchange do
    first := first + 1; exchange := false
    for i := last to first with step −1 do
        if A[i] < A[i − 1] then {swap A[i] and A[i − 1]; exchange := true}


Algorithm 4: HeapSort of array A[0..n] into array B[0..n].

procedure heapify(i)
    if (A[i] is not a leaf) and (a child of A[i] is larger than A[i]) then
        let A[k] be the larger child of A[i]
        swap A[i] and A[k]
        heapify(k)

procedure buildheap
    for i := ⌊n/2⌋ downto 0 do
        heapify(i)

main program heapsort
    buildheap
    for i := 0 to n do
        deheap root of A and transfer its value to B[n − i]


Examples: 

1. When canceled checks are returned to the payer by a bank, they may be in nearly 
sorted order, since the payees are likely to deposit checks quite soon after they arrive. 
Thus, they arrive for collection in an order rather close to the order in which they are 
written. A shakersort might work quite quickly on such a distribution. 


2. Starting with the unsorted list L = 15, 8, 10, 6, 17, 13, bubblesort would produce the
following sequence of lists.

    initial list:          15   8  10   6  17  13
    after one pass:         6  15   8  10  13  17
    after two passes:       6   8  15  10  13  17
    after three passes:     6   8  10  15  13  17
    after four passes:      6   8  10  13  15  17


3. Starting with the unsorted list L = 15, 8, 10, 6, 17, 13, sinking sort would produce
the following sequence of lists.

    initial list:          15   8  10   6  17  13
    after one pass:         8  10   6  15  13  17
    after two passes:       8   6  10  13  15  17
    after three passes:     6   8  10  13  15  17

4. Starting with the unsorted list L = 15, 8, 10, 6, 17, 13, shakersort would produce the
following sequence of lists:

    initial list:          15   8  10   6  17  13
    after one pass:         6  15   8  10  13  17
    after two passes:       6   8  10  13  15  17


17.3.4 SORTING BY DIVIDE-AND-CONQUER 


The strategy of a divide-and-conquer sort is to partition the given sequence into smaller 
subsequences, to sort the subsequences recursively, and finally to merge the sorted 
subsequences into a single sorted sequence. 

Definitions: 

A top-down mergesort splits the input sequence into two equal (or nearly equal) 
sized subsequences, recursively mergesorts the two subsequences, and finally merges the 
two sorted subsequences into a single sorted sequence. 

A bottom-up mergesort initially regards each entry in its input sequence as a list of
length one. It merges consecutive pairs of these lists into lists of length two. Then it
merges the lists of length two into lists of length four. Ultimately, all the initial items
are merged into a single list.

A quicksort selects an element x (called the pivot) in the input list and splits the input
list into two subsequences S_1 and S_2 such that every element in S_1 is no larger than x
and every element in S_2 is no smaller than x. Next it recursively sorts S_1 and S_2. Then
it concatenates the two sorted subsequences into a single sorted sequence.

The pivot in a quicksort iteration is the element x at which the sequence is split. 

Facts: 

1. A top-down mergesort is usually implemented as an internal sort. 

2. A bottom-up mergesort is a common form of external sort. 

3. An outstanding merit of quicksort is that it can be performed quickly within a single
array.

4. Quicksort was first described by C. A. R. Hoare in 1962.

5. The running time of a mergesort is Θ(n log n).

6. In the worst case, a quicksort takes time Ω(n²).

7. Choosing the quicksort pivot at random tends to avoid worst case behavior.

8. The average running time for a quicksort is O(n log n).

9. External sorting is used to process very large files, much too large to fit into the
primary memory of any computer.

10. The emphasis in devising good external sorting algorithms is on decreasing the
number of times the data are accessed, because the time required to transfer data back
and forth between the primary memory and the tape usually far outweighs the time
required to perform comparisons on data in the primary memory.

Algorithm 5: Merge two sequences.

procedure merge(A[1..m], B[1..n], C[ ])
    {merge two sorted sequences A and B into a single sorted sequence C}
    i_A := 1; i_B := 1; i_C := 1
    while i_A ≤ m and i_B ≤ n do
        if A[i_A] < B[i_B] then {C[i_C] := A[i_A]; i_A := i_A + 1}
        else {C[i_C] := B[i_B]; i_B := i_B + 1}
        i_C := i_C + 1
    if i_A > m then move the remaining elements in B to C
    else move the remaining elements in A to C


Algorithm 6: MergeSort S.

procedure mergesort(S)
    if length(S) ≤ 1 then return else
        split S into two (equal or nearly equal)-sized subsequences S_1 and S_2
        mergesort S_1
        mergesort S_2
        merge(S_1, S_2)


Algorithm 7: External MergeSort sequence S of length n.

for i := 1 to ⌈log n⌉
    for j := 1 to ⌈n/2^(i+1)⌉
        merge next sublist from input A with next sublist from input B,
            writing merged sublist onto output tape C
        merge next sublist from input A with next sublist from input B,
            writing merged sublist onto output tape D
    reset output tape C as input tape A and vice versa
    reset output tape D as input tape B and vice versa


11. Formal algorithms and more detailed discussions of external sorting can be found
in [Kn73].

12. In an external sort, the number of times each element is read from or written
to the external memory is log(n/m) + 1, where m is the available internal memory size.
Improvements on the construction of runs as well as on the merging process are possible
(see [Kn73]).

Algorithms: 

1. Algorithm 5 merges two sorted sequences into a single sorted sequence. 

2. Algorithm 6 mergesorts a sequence internally. 

Algorithm 8: Quicksort.

procedure split(x, S)
    for each element y in S do
        if x > y then put y in S_1 else put y in S_2

main program
    if length(S) ≤ 1 then return else
        choose an arbitrary element x in sequence S
        split(x, S) into S_1 and S_2
        recursively sort S_1 and S_2
        concatenate the two sorted subsequences


3. In a typical external mergesort such as Algorithm 7, there are two input tapes and
two output tapes. The entries are initially arranged onto the two input tapes, with half
the entries on each tape, and regarded as sublists of length one. A sublist from the first
input tape is merged with a sublist from the second input tape and written as a sublist
of doubled length onto the first output tape. Then the next sublist from the first input
tape is merged with the next sublist from the second input tape and written as a sublist
of doubled length onto the second output tape. The alternating process is iterated until
the sublists from the input tapes have all been merged into sublists of doubled length
onto the two output tapes. Then the two output tapes become input tapes to another
iteration of the merging process. This continues until all the original entries are in a
single list.

4. The generic quicksort algorithm Quicksort (Algorithm 8) does not specify how to 
select a pivot. 
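
The internal divide-and-conquer sorts can be sketched in Python as below. Several choices here are assumptions of the illustration rather than part of the algorithms as stated: the functions return new lists instead of sorting in place, `merge` takes from the first list on ties so that the sort is stable, and `quick_sort` picks its pivot at random and sets aside the keys equal to the pivot so that every recursive call is on a strictly shorter list.

```python
import random

def merge(a, b):
    """Merge two sorted lists into one sorted list (cf. Algorithm 5)."""
    c = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            c.append(a[i])
            i += 1
        else:
            c.append(b[j])
            j += 1
    c.extend(a[i:])        # at most one of these two slices is nonempty
    c.extend(b[j:])
    return c

def merge_sort(s):
    """Top-down mergesort (cf. Algorithm 6)."""
    if len(s) <= 1:
        return list(s)
    mid = len(s) // 2
    return merge(merge_sort(s[:mid]), merge_sort(s[mid:]))

def quick_sort(s):
    """Quicksort with a randomly chosen pivot (cf. Algorithm 8)."""
    if len(s) <= 1:
        return list(s)
    x = random.choice(s)
    smaller = [y for y in s if y < x]
    equal = [y for y in s if y == x]
    larger = [y for y in s if y > x]
    return quick_sort(smaller) + equal + quick_sort(larger)
```

A production quicksort would usually partition within a single array, which is the merit noted in the Facts; the list-based version above trades that space efficiency for brevity.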

Example: 

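1. The following illustrates MergeSort on the sequence S = 21, 6, 8, 11, 10, 17, 15, 13.

               21 6 8 11 10 17 15 13
    (split)    21 6 8 11 | 10 17 15 13
    (split)    21 6 | 8 11 | 10 17 | 15 13
    (split)    21 | 6 | 8 | 11 | 10 | 17 | 15 | 13
    (merge)    6 21 | 8 11 | 10 17 | 13 15
    (merge)    6 8 11 21 | 10 13 15 17
    (merge)    6 8 10 11 13 15 17 21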

Algorithm 9: RankCountingSort of array A[1..n] into array B[1..n].

{pre: max(A[i]) ≤ cn}

for i := 1 to cn do C[i] := 0
for j := 1 to n do C[A[j]] := 1
for i := 2 to cn do C[i] := C[i] + C[i − 1]
for j := 1 to n do B[C[A[j]]] := A[j]


17.3.5 SORTING BY DISTRIBUTION 

Prior knowledge of the distribution of the elements of the input sequence sometimes
permits sorting algorithms to break the lower bound of Ω(n log n) for running time of
comparison sorts.

Definitions: 

The rank of an element of a finite ordered set is the number of elements that it exceeds 
or equals. 

A rank counting sort calculates the “rank” for each element, and then assigns the 
elements directly to their correct position according to their rank. 

In a base-ten radix sort, the keys are base-ten integer numerals with at most k digits.
Each entry is appended to one of ten queues Q_0, ..., Q_9, according to the value of its
least significant digit, after which the list Q_0 ∘ ⋯ ∘ Q_9 is formed by concatenation. The
concatenated list is then similarly separated into ten queues, according to the values of
the next least significant digit. This process is iterated up to the most significant digit.

A radix sort is a sort like the base-ten radix sort, using an arbitrary radix, not
necessarily ten.

Facts: 

1. A rank counting sort gives favorable results when the input keys are n different
positive integers, all bounded by cn for some constant c.

2. The running time of a rank counting sort is in Θ(n). The RankCountingSort
(Algorithm 9) can be modified so that it sorts in linear time even when the input elements
are not all distinct. [CoLeRi90]

3. It can be proved that a radix sort correctly sorts the input. [Kn73]

4. The running time of RadixSort is bounded by O(kn), where k is the maximum
number of digits in a key. When k is a constant independent of n, RadixSort sorts in
linear time. Note, however, that if the input consists of n distinct numbers and the base
of the numbers is fixed, then k is of order Ω(log n).

Algorithms: 

1. In the rank counting sort, Algorithm 9, the array A contains n input elements, the
array B is the output array, and the array C is an auxiliary array of size cn used for
counting. Step 3 causes count C[A[j]] to be the rank of entry A[j].

2. The base-ten radix sort, Algorithm 10, starts with an input list A whose keys have 
at most k digits. 
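
A direct Python transcription of the rank counting sort might look as follows. It keeps the precondition that the keys are distinct positive integers not exceeding a known bound; the parameter name `bound` (playing the role of cn) is chosen for this sketch, and index 0 of the auxiliary list is simply left unused so that keys can index it directly.

```python
def rank_counting_sort(a, bound):
    """Sort distinct positive integers, each at most bound, by their rank."""
    count = [0] * (bound + 1)
    for key in a:
        count[key] = 1                  # mark the keys that are present
    for i in range(2, bound + 1):
        count[i] += count[i - 1]        # count[key] becomes the rank of key
    b = [None] * len(a)
    for key in a:
        b[count[key] - 1] = key         # rank r goes to zero-based position r - 1
    return b
```

No two keys are ever compared with each other, which is how the method sidesteps the lower bound for comparison sorts.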

Algorithm 10: RadixSort of array A[1..n].

for d := 1 to k do
    for i := 0 to 9 do make Q_i an empty queue
    for j := 1 to n do
        let h be the dth least significant digit of A[j]
        append A[j] to queue Q_h
    A := Q_0 ∘ ⋯ ∘ Q_9 (concatenation)
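
The base-ten radix sort can be sketched in Python with ordinary lists standing in for the ten queues. The sketch assumes nonnegative integer keys with at most k decimal digits; the function name is chosen for this illustration.

```python
def radix_sort(a, k):
    """Base-ten radix sort, least significant digit first."""
    for d in range(k):                      # d = 0 is the least significant digit
        queues = [[] for _ in range(10)]
        for key in a:
            digit = (key // 10 ** d) % 10
            queues[digit].append(key)       # appending keeps the pass stable
        a = [key for q in queues for key in q]   # concatenate Q_0 ... Q_9
    return a
```

Correctness depends on each pass being stable: keys that agree in the current digit stay in the order established by the earlier, less significant digits.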


17.3.6 SEARCHING 


Definitions: 

Searching a database means seeking either a target entry with a specific key or a 
target entry whose key has some specified property. 

Linear search is the technique of scanning the entries of a list in sequence, either until 
some stopping condition occurs or the entire list has been scanned. 

Binary search is a recursive technique for seeking a specific target entry in a list. The 
target key is compared to the key in the middle of the list, in order to determine which 
half of the list could contain the target item, if it is present. 

Hashing is storage-retrieval in a large table in which the table location is computed 
from the key of each data entry. (§17.4) 

A binary search tree is a binary tree in which each node has an attribute called its
key, and the keys are elements of an ordered datatype (e.g., the integers or alphabetic
strings). Moreover, at each node v the key is larger than all the keys in its left subtree,
but smaller than all the keys in its right subtree.

A 2-3 tree is a tree in which each non-leaf node has 2 or 3 children, and in which every 
path from the root to a leaf is of the same length. 

An AVL tree is a binary search tree with the property that the two subtrees of each 
node differ by at most 1 in height. 

Facts: 

1. Some common database search objectives are for a specified target entry, for the
maximum entry, for the minimum entry, or for the kth smallest entry.

2. The performance of a dynamic database structure that permits insertions and
deletions is measured by the time needed for insertions and deletions, as well as the time
needed for searching.

3. A binary search runs in average time O(log n) to search for a specified element x in
a sorted list of n elements.

4. In the worst case of searching by comparison-based algorithms for a specified target
element in a sorted list of length n, Ω(log n) comparisons are necessary.

5. A randomly constructed n-node binary search tree has expected height O(log n); the
expected depth of a node is about 2 ln n ≈ 1.39 lg n.

6. An AVL tree of n nodes has depth O(logn). 

Algorithm 11: BinarySearch(A, L, U, x). 

{Look for x in A[L..U]. Report its position if found, else report 0.} 

if L = U then 
    if x = A[L] then return L else return 0 
else M := ⌊(L + U)/2⌋ 
    if x > A[M] then return BinarySearch(A, M + 1, U, x) 
    else return BinarySearch(A, L, M, x) 


Algorithm 12: 23TSearch(x, r). 

case 1: r is a leaf 
    if r is labeled with x then return “yes” else return “no” 
case 2: r is not a leaf 
    if x ≤ L[r] then return 23TSearch(x, leftchild(r)) 
    else if x ≤ M[r] then return 23TSearch(x, midchild(r)) 
    else if r has a right child then return 23TSearch(x, rightchild(r)) 
    else return “no” 


7. Insertion and deletion on an AVL tree, with patching if needed so that the result is 
an AVL tree, can be performed in O(logn) worst-case running time. [Kn73] 

8. AVL trees are named for their inventors, G. M. Adelson-Velskii and Y. M. Landis. 

9. A 2-3 tree for a set S of entries can be constructed by assigning the entries to the 
leaves of the tree in order of increasing key from left to right. Each non-leaf node v 
is labeled with two elements L[v] and M[v], which are the largest keys in the subtrees 
rooted at its left child and middle child, respectively. 

10. The operations of searching, finding a maximum or minimum, inserting a new 
entry, and deleting an entry all execute within O(logn) time. 


Algorithms: 

1. To search for a specified target key x in a sorted list A[1..n], a call to the recursive 
algorithm BinarySearch(A, 1, n, x) in Algorithm 11 can be used. Its technique is to 
compare the target to the middle entry of the list and to decide thereby in which half 
the target might occur; then that half remains as the active portion of the list for the 
next iteration of the search step, while the other half becomes inactive. 

2. Find the maximum [minimum] in an unsorted list: Scan the list from start to finish 
and keep track of the largest [smallest] seen so far. 

3. Finding the maximum in a binary search tree: Start at the root and follow the 
right child pointers until some node has no right child. That node must contain the 
maximum. 

4. Finding the minimum in a binary search tree: Start at the root and follow the left 
child pointers until some node has no left child. That node must contain the minimum. 

5. Searching for a target entry x in a 2-3 tree: Start at the root, and use the keys at 
non-leaf nodes to locate the correct leaf, as described by Algorithm 12. 


© 2000 by CRC Press LLC 





Algorithm 13: Finding-The-kth-Smallest. 

divide the n input elements into ⌈n/5⌉ groups of five elements 
find the median for each of the ⌈n/5⌉ groups 
recursively find the median m* of these group medians 
partition the input into two sets S_1 and S_2 such that each element in S_1 is no 
    larger than m* and each element in S_2 is no smaller than m* 
if S_1 has ≥ k elements then recursively find the kth smallest element in S_1 
else recursively find the (k − |S_1|)th smallest element in S_2 


6. Finding the maximum in a 2-3 tree : Starting from the root, follow the right-child 
pointers to the rightmost leaf, which contains the maximum entry. 

7. Finding the minimum in a 2-3 tree: Starting from the root, follow the left-child 
pointers to the leftmost leaf, which contains the minimum entry. 

8. To insert a new entry x into a 2-3 tree: First locate the non-leaf node v whose 
child x “should” be. If v is a 2-child node, then simply install x as a third child of v. 
If v already has three children, then let v keep as children the two smallest of the set 
comprising its three children and x. A new non-leaf node u becomes the parent of the 
largest member of this set. Now recursively insert node u as a new child to the parent 
of node v. If the process eventually makes the root of the tree a 4-child node, then the 
last step is to create a new root with two new children, each of which has two of the 
four children of the former root. Note that the labels of some non-leaf nodes may be 
updated in this process. 

9. To delete an entry x from a 2-3 tree: Essentially, reverse the manner by which an 
element is inserted. First find the leaf v containing x. If the parent p of v has three 
children, then the leaf v is simply deleted. If p has only two children v and v', then 
select an adjacent sibling p' of p. If p' has only two children, then make v' a child of p', 
and recursively delete the node p from the tree. If p' has three children, then make an 
appropriate child of p' into a new child of p and delete the node v (note that now both p 
and p' have two children). Again the process may progress recursively to the root of 
the tree, such that it is necessary to delete one of the only two children of the root. In 
this case, delete the root and make the remaining child of the root into a new root of 
the tree. Labels of some non-leaf nodes may need to be updated in this process. 

10. Searching in a random list: Finding the kth smallest element, for an arbitrary k, 
in a random list can also be done in linear time. The algorithm, Algorithm 13, is based 
on the method of “Prune and Search”. That is, the process first prunes away in linear 
time a constant factor of the elements in the input, then recursively searches the rest. A 
careful analysis shows that each of the two sets S_1 and S_2 contains at most 7n/10 elements. 
Therefore, if T(n) is the running time for Algorithm 13 on input of n elements, it follows 
that T(n) ≤ T(n/5) + T(7n/10) + O(n). Since n/5 + 7n/10 = 9n/10 < n, this relation 
gives T(n) = O(n). 
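
The prune-and-search idea of Algorithm 13 can be sketched in Python as follows. This is a minimal illustration rather than the Handbook's exact procedure: the function name `select` is hypothetical, and the sketch uses a three-way partition (smaller, equal, larger) instead of the two sets of Algorithm 13 so that repeated keys cannot stall the recursion.

```python
def select(items, k):
    """Return the k-th smallest element of a non-empty list (k = 1 is the minimum)."""
    if len(items) <= 5:
        return sorted(items)[k - 1]

    # Split into groups of at most five and take the median of each group.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]

    # Recursively find the median of the group medians; it serves as the pivot.
    pivot = select(medians, (len(medians) + 1) // 2)

    # Three-way partition around the pivot (a variation on the two sets of Algorithm 13).
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]

    if k <= len(smaller):
        return select(smaller, k)
    if k <= len(smaller) + len(equal):
        return pivot
    return select(larger, k - len(smaller) - len(equal))


data = [47, 5, 169, 22, 81, 9, 144, 13, 64, 36, 8, 100, 16, 121, 25, 49]
eleventh = select(data, 11)
```

Because the pivot always lands in `equal`, both recursive calls receive strictly fewer elements than the input, so the recursion terminates.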

Example: 

1. To search for the target 64 in the following 16-element list 
5, 8, 9, 13, 16, 22, 25, 36, 47, 49, 64, 81, 100, 121, 144, 169 
first split it into these two 8-element sublists, 

5, 8, 9, 13, 16, 22, 25, 36 47, 49, 64, 81, 100, 121, 144, 169 

and then compare 64 to the largest item in the first list. Since 64 > 36, it follows that 
64, if present in the original list, would have to be in the second sublist. Next split the 
active sublist further into these two 4-element sublists 
47, 49, 64, 81        100, 121, 144, 169 

and then compare 64 to the largest item in the first new sublist. Since 64 ≤ 81, it 
follows that 64, if present, would have to be in the first sublist. Therefore, resplit the 
active sublist further into these two 2-element sublists 
47, 49        64, 81 

and then compare target 64 to the largest item in the first new sublist. Since 64 > 49, 
it follows that 64, if present, would have to be in the second sublist. Therefore, resplit 
the active sublist further into these two 1-element sublists 
64        81 

and then compare 64 to the largest item in the first new sublist. Since 64 ≤ 64, it 
follows that 64, if present, would have to be in the first sublist. Since that sublist has 
only one element, namely 64, the target 64 is compared to that one element. Since they 
are a match, the target has been located as the 11th item in the original list. 
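
The search traced in this example can be reproduced with a short Python sketch that follows the recursive scheme of Algorithm 11. The function name `binary_search` is illustrative. Positions are 1-based and 0 means "not found", as in the pseudocode; the guard for an empty range is an addition that Algorithm 11 does not contain.

```python
def binary_search(a, lo, hi, x):
    """Search the sorted list a for x among 1-based positions lo..hi.

    Returns the 1-based position of x, or 0 if x is absent.
    """
    if lo > hi:          # empty range (extra guard, not part of Algorithm 11)
        return 0
    if lo == hi:
        return lo if a[lo - 1] == x else 0
    mid = (lo + hi) // 2
    if x > a[mid - 1]:   # x can only be in the upper half
        return binary_search(a, mid + 1, hi, x)
    return binary_search(a, lo, mid, x)


data = [5, 8, 9, 13, 16, 22, 25, 36, 47, 49, 64, 81, 100, 121, 144, 169]
position = binary_search(data, 1, len(data), 64)
missing = binary_search(data, 1, len(data), 50)
```

Each call compares the target with the largest item of the lower half, exactly as in the four splits of the worked example.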


17.4 HASHING 

Hashing, also known as “address calculation” and as “scatter storage” , is a mathematical 
approach to organizing records within a table. The objective is to reduce the amount 
of time needed to find a record with a given key. Hashing is best suited for “dynamic” 
tables, that is, for databases whose use involves interspersed lookups and insertions. 
Dynamic dictionaries (such as spelling checkers) and compiler-generated symbol tables 
are examples of applications where hashing may be useful. 


17.4.1 HASH FUNCTIONS 

Hashing is an approach to placing records into a table and retrieving them when needed, 
in which the location for a record is calculated by an algorithm called a hash function. 

Definitions: 

A record is a pair of the form (k:key, d:data), in which the second component is data 
and the first component is a key used to store it in a table and to retrieve it subsequently 
from that table. 

A key domain is an ordered set, usually the integers, whose members serve as keys for 
the records of a table. No two different records have the same key. 

A hash table is an array in which the location for storing and retrieving a record is 
calculated from the key of that record. 

A hash function h is a function that maps a key domain to an interval of integers 
[0..m−1]. The intent is that a record with key k is to be stored in or retrieved from the 
location h(k) in the table. 

A collision occurs when a hash function h assigns the same table location to two 
different keys k_1 ≠ k_2, i.e., when h(k_1) = h(k_2). 

Collision resolution is the process of finding an alternative location for a new record 
whose key collides with that of a record already in the table. 

The fullness of a hash table T is the ratio α(T) = n/m of the number n of records in the 
table to the capacity m of the table. 

Facts: 

1. Hashing is often used when the set of keys (of the records in the database) is not 
a consecutive sequence of integers or easily convertible to a consecutive sequence of 
integers. 

2. Keys that are non-numeric (or do not have a good random distribution) can be trans- 
formed into integers by using or modifying their binary representation. The resulting 
integers are called virtual keys. 

3. It is desirable for a hash function to have the simplicity property, i.e., that it takes 
only a few simple operations to compute the hash function value. 

4. It is desirable for a hash function to have the uniformity property, i.e., that each 
possible location in the range 0, ..., m − 1 of the hash function h: K → [0..m−1] is 
generated with equal likelihood, that is, with probability 1/m. 

Examples: 

1. The division method h(k) = k mod m is a simple hash function that can be used 
when the keys of the records are integers that extend far beyond a feasible table size. 
The table size m must be chosen carefully to avoid a high incidence of collisions without 
wasting too much storage space. Selecting m to be a prime not close to a power of 2 is 
typically a good choice. 

2. The multiplication method is another simple hashing rule. First the key (in base 2) 
is multiplied by a constant value A such that 0 < A < 1, and then p = log_2 m bits 
are selected for the hash function value from somewhere in the fractional part of the 
resulting base-2 numeral, often required to be away from its low end. (This is similar 
to some methods for generating random numbers.) 

3. As a simplified example of a multiplicative hash function, consider table size m = 16, 
address size p = log_2(16) = 4 bits, and key size w = 6 bits. With fractional constant A = 
0.101011_2, use the first four bits of the fractional part. For instance, given key k = 011011_2, 
first calculate R = A · k = 010010.001001_2. Then take h(k) = 0010_2, the leading four bits 
of the fractional part. Knuth [Kn73] suggests A = 0.6180339887... as a good choice of 
a multiplier. 
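
The division and multiplication methods in these examples can be sketched in Python. The function names and the parameter `a_bits` are hypothetical: `a_bits` is assumed to be the integer whose w-bit binary expansion gives the fractional constant, so that A = a_bits / 2^w. Working with integers avoids floating-point rounding, and the sketch takes the first p bits of the fractional part, as in the 6-bit example.

```python
def division_hash(key, m):
    """Division method: h(k) = k mod m."""
    return key % m


def multiplication_hash(key, a_bits, w, p):
    """Multiplication method with A = a_bits / 2**w.

    Returns the first p bits of the w-bit fractional part of A * key.
    """
    product = a_bits * key           # equals A * key scaled up by 2**w
    fraction = product % (1 << w)    # the w fractional bits of A * key
    return fraction >> (w - p)       # keep the leading p of those bits


# Example 3: A = 0.101011 (base 2), k = 011011 (base 2), w = 6, p = 4.
slot = multiplication_hash(0b011011, 0b101011, w=6, p=4)

# Division method with the prime table size 1013.
bucket = division_hash(113548956, 1013)
```

Here `slot` is 0b0010, matching the fractional part 001001 of R = 010010.001001 in Example 3.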


17.4.2 COLLISION RESOLUTION 
Definitions: 

Collision resolution is the process of computing an alternative location for a colliding 
record. The two basic methods are chaining and rehashing. 

Chaining is a collision resolution method that involves auxiliary storage space for 
records outside the confines of the main array. Each slot in the main table can be used 
as the root of a linked list that contains all the records assigned to that location by the 
hash function. Each additional colliding record is inserted at the front of the linked list. 
When searching for a record, the list is traversed until the record is found, or until the 
end of the list is reached. 

The size of a chained hash table is the number of linked list headers (i.e., the size of 
the main array). Thus, a chained hash table said to be of size m may be used to store 
a database with more than m records. 
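
A chained hash table as defined above can be sketched in Python. The class name `ChainedHashTable` and its methods are illustrative, the division method k mod m is assumed as the hash function, and a `collections.deque` stands in for each linked list because it supports constant-time insertion at the front. The sketch assumes integer keys and that no key is inserted twice.

```python
from collections import deque


class ChainedHashTable:
    def __init__(self, m):
        self.m = m                                  # number of list headers
        self.slots = [deque() for _ in range(m)]    # one chain per table location
        self.n = 0                                  # number of stored records

    def insert(self, key, data):
        """Place the record at the front of its chain (assumes key is new)."""
        self.slots[key % self.m].appendleft((key, data))
        self.n += 1

    def find(self, key):
        """Traverse the chain until the key is found or the chain ends."""
        for k, d in self.slots[key % self.m]:
            if k == key:
                return d
        return None

    def fullness(self):
        """Ratio n/m; it may exceed 1 for a chained table."""
        return self.n / self.m


table = ChainedHashTable(4)
for key in range(10):
    table.insert(key, str(key))
```

With 10 records in a table of size 4, the fullness is 2.5, which illustrates that a chained table of size m can hold more than m records.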

Rehashing is a collision resolution method in which there is no auxiliary storage outside 
the main table, so that a colliding record must be stored elsewhere in the main table, 
that is, at a location other than that assigned by the hash function to its key k. A 
collision resolution function finds the substitute location. 

A collision resolution function under rehashing generates a probe sequence (h_0(k) = 
h(k), h_1(k), h_2(k), ..., h_{m−1}(k)). The new record is inserted into the first unoccupied 
probe location. When searching for a record, the successive probes are tried until the 
record is found or the probe finds an unoccupied location (i.e., an unsuccessful search). 

A probe sequence for key k is a sequence (h_0(k), h_1(k), h_2(k), ..., h_{m−1}(k)) of possible 
storage locations in the table T that runs without repetition through the entire set 
(0, 1, 2, ..., m − 1) of locations in the table, as possible places to store the record with 
key k. 

Clustering is a hashing phenomenon in which after two keys collide, their probe sequences 
continue to collide. 

Linear probing means trying to resolve a collision at location h(k) with a probe 
sequence of the form h_i(k) = (h_0(k) + ci) mod m. 

Quadratic probing means trying to resolve a collision at location h(k) by using a 
probe sequence of the form h_i(k) = (h_0(k) + c_1 i + c_2 i^2) mod m. 

A secondary hash function is a hash function used to generate a probe sequence, 
once a collision has occurred. 

Double hashing means using a primary hash function h and a secondary hash function 
h' to resolve collisions, so that h_i(k) = (h(k) + i h'(k)) mod m. 
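
The three probing schemes just defined can be compared with a small Python sketch. The function names are illustrative, and the secondary hash function `1 + k % 5` used in the demonstration is a hypothetical choice made only so that the step is never zero.

```python
def linear_probes(h0, c, m):
    """Linear probing: h_i = (h_0 + c*i) mod m for i = 0..m-1."""
    return [(h0 + c * i) % m for i in range(m)]


def quadratic_probes(h0, c1, c2, m):
    """Quadratic probing: h_i = (h_0 + c1*i + c2*i^2) mod m for i = 0..m-1."""
    return [(h0 + c1 * i + c2 * i * i) % m for i in range(m)]


def double_hash_probes(k, h, h2, m):
    """Double hashing: h_i = (h(k) + i*h2(k)) mod m for i = 0..m-1."""
    return [(h(k) + i * h2(k)) % m for i in range(m)]


m = 7
linear = linear_probes(3, 1, m)
quadratic = quadratic_probes(0, 1, 1, m)
double = double_hash_probes(12, lambda k: k % m, lambda k: 1 + k % 5, m)
```

In this run the linear and double-hashing sequences visit every location exactly once, which holds whenever the step is relatively prime to m. The quadratic sequence with c_1 = c_2 = 1 revisits locations, so quadratic parameters have to be chosen with care if the sequence is meant to cover the whole table.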

Facts: 

1. In designing a hash function, an objective is to keep the length of the probe sequences 
short, so that records are stored and retrieved quickly. 

2. Under chain hashing, inserting a record always requires O(1) time. 

3. Under chain hashing, the time to find a record is proportional to the length of the 
chain from its key, and the average length of a chain in the table equals the fullness α 
of the table. 

4. Under chain hashing, if the number of records in the table is proportional to the 
table capacity, then a find operation needs O(1) time, on average. 

5. Under chain hashing, a delete operation consists of a find with time proportional to 
table fullness, followed by removal from a linked list, which requires only O(1) time. 

6. Analysis of rehashing performance is based on the following assumptions: 

• uniform distribution of keys: each possible key in the key domain is equally likely 
to occur as the key of some record; 

• uniform distribution of initial hash locations (see §17.4.1); 

• uniform distribution of probe sequences: each possible probe sequence 
(h_0(k), h_1(k), h_2(k), ..., h_{m−1}(k)), 
regarded as a permutation on the set of all table locations, is equally likely. 

7. Under rehashing, the expected time to perform an insert operation is the same as 
the expected time for unsuccessful search, and is at most 1/(1 − α) probes, where α is 
the fullness of the table. 

8. Under rehashing, the expected time E(α) to perform a successful find operation is 
at most (1/α) ln(1/(1 − α)) + 1/α. For instance, E(0.5) = 3.386, and E(0.9) = 3.670. That means 
that if a table is 90% full, a record will be found, on average, with 3.67 probes. 

9. Under rehashing, the location of a deleted record needs to be marked as deleted so 
that subsequent probe sequences do not terminate prematurely. Moreover, the running 
time of a delete operation is the same as for a successful find operation. (It also causes 
the measure of fullness for searches to be different from that used for insertions, because 
a new record can be inserted into the location marked as deleted). However, in most 
applications that use hashing, records are never deleted. 
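
The probe-count bounds for rehashing can be evaluated numerically. The sketch below assumes the standard uniform-hashing bounds: 1/(1 − α) for an unsuccessful search or an insertion, and (1/α) ln(1/(1 − α)) + 1/α for a successful search. The second expression reproduces the values E(0.5) = 3.386 and E(0.9) = 3.670 quoted in Fact 8. The function names are illustrative.

```python
import math


def unsuccessful_bound(alpha):
    """Upper bound on expected probes for an unsuccessful search or an insert."""
    return 1.0 / (1.0 - alpha)


def successful_bound(alpha):
    """Upper bound E(alpha) on expected probes for a successful search."""
    return (1.0 / alpha) * math.log(1.0 / (1.0 - alpha)) + 1.0 / alpha


# Tabulate both bounds for a few fullness values.
rows = [(a, unsuccessful_bound(a), successful_bound(a)) for a in (0.25, 0.5, 0.75, 0.9)]
for alpha, miss, hit in rows:
    print(f"alpha={alpha:.2f}  unsuccessful<={miss:.3f}  successful<={hit:.3f}")
```

The two bounds diverge sharply as the table fills: at 90% fullness an insertion may need about 10 probes on average while a successful search still needs fewer than 4.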

Examples: 

1. The following example of linear probing uses prime modulus m = 1013 and prime 
multiplier c = 367. The keys are taken to be social security numbers. 


key k         h_0(k)    h_1(k) 
113548956     773 
146834522     172 
207639313     651 
359487245     896 
378545523     592 
435112760     896       250 
670149788     651 
721666437     172       539 
762456748     12 

2. Linear probing suffers from clustering. That is, if h_i(k_1) = h_j(k_2), then h_{i+p}(k_1) = 
h_{j+p}(k_2) for all p = 1, 2, .... All probe sequences follow the same (linear) pattern, from 
which it follows that long chains of filled locations will cause a large number of probes 
needed to insert a new record (and to locate it later). 

3. Quadratic probing suffers from clustering. That is, if h_0(k_1) = h_0(k_2), then h_i(k_1) = 
h_i(k_2) for i = 1, 2, .... 

4. Pairing the primary hash function h(k) = k mod p with the secondary hash function 
h'(k) = k div p, where p is a prime, yields the double hash function h_i(k) = (h_0(k) + 
i h'(k)) mod p. 
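
The linear probing table of Example 1 can be re-derived with a short Python sketch. It assumes that the initial location is h_0(k) = k mod m, which is consistent with every h_0 value listed in the table, and that keys are inserted in the order shown. The names `insert_all` and `placement` are illustrative.

```python
M = 1013   # prime modulus from Example 1
C = 367    # prime multiplier from Example 1

KEYS = [113548956, 146834522, 207639313, 359487245, 378545523,
        435112760, 670149788, 721666437, 762456748]


def insert_all(keys, m=M, c=C):
    """Insert keys by linear probing; return {key: (slot, probe_index)}."""
    table = [None] * m
    placement = {}
    for k in keys:
        h0 = k % m                       # assumed initial hash location
        for i in range(m):
            slot = (h0 + c * i) % m      # h_i(k) = (h_0(k) + c*i) mod m
            if table[slot] is None:
                table[slot] = k
                placement[k] = (slot, i)
                break
        else:
            raise RuntimeError("table is full")
    return placement


placement = insert_all(KEYS)
```

Under these assumptions the sketch reproduces the listed h_1 values 250 and 539. It also places key 670149788, whose initial location 651 is already occupied, at (651 + 367) mod 1013 = 5; that value is computed by the sketch and does not appear in the table as printed.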


17.5 DYNAMIC GRAPH ALGORITHMS 


Dynamic graph algorithms are algorithms that maintain information in regard to prop- 
erties of a (possibly edge-weighted) graph while the graph is changing. These algorithms 
are useful in a number of application areas, including communication networks, VLSI 
design, distributed computing, and graphics, where the underlying graphs are subject 
to dynamic changes. Efficient dynamic graph algorithms are also used as subroutines 
in algorithms that build and modify graphs as part of larger tasks, e.g., the algorithm 
for constructing Voronoi diagrams by building planar subdivisions. 

Notation: Throughout this section, n and m denote the number of vertices and the 
number of edges, respectively, of the graph that is being maintained and queried 
dynamically. 

17.5.1 DYNAMICALLY MAINTAINABLE PROPERTIES 


Definitions: 

A (dynamic) update operation is an operation on a graph that keeps track of whether 
the graph has a designated property. 

A query is a request for information about the designated property. 

Facts: 

1. The primitive update operations for most dynamic graph algorithms are edge inser- 
tions and deletions and, in the case of edge-weighted graphs, changes in edge weights. 

2. For most dynamic graph algorithms, insertion or deletion of an isolated vertex can 
be accomplished by an easy modification of a non-dynamic algorithm. 

3. The insertion or deletion of vertices together with their incident edges is usually 
harder and has to be done by iterating the associated edge update operation. 

4. There is a trade-off between the time required for update operations and the time 
required to respond to queries about the property being maintained. Thus, running 
times of the update operations depend strongly on the property being maintained. 

5. Nontrivial dynamic algorithms corresponding to several graph properties are known 
(see Examples). 

Examples: 

1. Connectivity: The permitted query is whether two vertices x and y are in the 
same component. Permitted updates are edge insertions, edge deletions, and isolated 
vertex insertions. Frederickson [Fr85] provides an algorithm for maintaining minimum 
spanning forests that can easily be adapted to this problem. Improvements in running 
times have been achieved by [EpEtal92] and [EpGaIt93]. 

2. Bipartiteness: Update operations are the same as for Connectivity (Example 1). A 
query simply asks whether a graph is bipartite. An algorithm is presented in [EpEtal92], 
with an improvement in [EpGaIt93]. 

3. Minimum spanning forests: The query is whether an edge is in a minimum spanning 
forest. The graph is weighted, and the update operations are increments and decrements 
of weights. (Edge insertion is accomplished by lowering the edge weight from ∞ and edge 
deletion by incrementing the edge weight to ∞.) [Fr85] contains the early result, with 
improvements by [EpGaIt93]. The plane and planar graph cases have been considered 
by [EpEtal92] and [EpEtal93]. 

4. Biconnectivity and 2-Edge Connectivity: Update operations are the same as for 
Connectivity (Example 1). Queries ask whether two given vertices lie in the same 
biconnected (resp., 2-edge connected) component. Efficient algorithms for maintaining 
biconnectivity are found in [EpEtal92], [EpGaIt93], [Ra93], and [HeRaSu94]. Efficient 
algorithms for maintaining 2-edge connectivity are found in [EpEtal92], [Fr91], and 
[Fr85]. Any algorithm for dynamically maintaining biconnectivity translates to an 
algorithm with the same time bounds for 2-edge connectivity [GaIt91]. 

5. Planarity: Update operations include edge insertions and deletions. Queries ask 
whether the graph is currently planar. Variants include queries that would test whether 
the addition of a particular edge would destroy the current imbedding. Algorithms are 
described in [EpEtal93] and [Ra93]. 

17.5.2 TECHNIQUES 


Definitions: 

A partially dynamic algorithm is usually an algorithm that handles only edge inser- 
tions and, for edge-weighted graphs, decrements in edge weights. Less commonly, this 
term can refer to an algorithm that handles only edge deletions or weight increments. 

A cluster in a spanning tree T for a graph G is a set of vertices such that the subgraph 
of T induced on these vertices is connected. 

An ambivalent data structure is a structure that, at many of its vertices, keeps track 
of several alternatives, despite the fact that a global examination of the structure would 
determine which of these alternatives is optimal. 

A certificate for property V and graph G is a graph G' such that G has property V if 
and only if G' has property V. 

A strong certificate for property V and graph G is a graph G', on the same vertex 
set as G, such that, for every graph H, G ∪ H has property V if and only if G' ∪ H has 
property V. 

A sparse certificate is a strong certificate in which the number of edges is O(n). 

A function A that maps graphs to strong certificates is stable if it satisfies: 

• A(G ∪ H) = A(A(G) ∪ H); 

• A(G − e) differs from A(G) by O(1) edges, where e is an edge in G. 

A stable certificate is one produced by a stable mapping. 

A plane graph is a planar graph, together with a particular imbedding in the plane. 

A compressed certificate for a property V of G, where G = (V, E) is a subgraph of a 
larger graph F and X ⊆ V separates G from F − G, is a small certificate G' = (V', E') 
with X ⊆ V' such that, for any graph H that is attached to G only at the vertices of X, 
H ∪ G has property V if and only if H ∪ G' does, and |V'| = O(|X|). 

A graph property V is dyadic if it is defined with respect to a pair of vertices (x, y). 

A graph C is a certificate of a dyadic property V for X in G if and only if, for 
any H with V(H) ∩ V(G) ⊆ X and every x and y in V(H), V is true for (x, y) in the 
graph G ∪ H if and only if it is true for (x, y) in the graph C ∪ H. 
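
The certificate definitions can be made concrete for the property "is connected". A spanning forest of G has at most n − 1 edges and joins exactly the same pairs of vertices as G, so adding any further edge set H yields a connected graph from the forest precisely when it does from G; in that sense the forest is a sparse (strong) certificate for connectivity. The Python sketch below checks this on a small graph. All function names are illustrative, and the forest construction is deliberately simple rather than efficient.

```python
def components(n, edges):
    """Label each vertex 0..n-1 with the smallest vertex of its component."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    label = [-1] * n
    for s in range(n):
        if label[s] != -1:
            continue
        label[s] = s
        stack = [s]
        while stack:                      # depth-first search from s
            u = stack.pop()
            for v in adj[u]:
                if label[v] == -1:
                    label[v] = s
                    stack.append(v)
    return label


def spanning_forest(n, edges):
    """Keep an edge only if it joins two different trees of the forest so far."""
    forest = []
    for u, v in edges:
        lab = components(n, forest)       # quadratic overall; fine for a sketch
        if lab[u] != lab[v]:
            forest.append((u, v))
    return forest


def is_connected(n, edges):
    return len(set(components(n, edges))) == 1


n = 6
G = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]   # two triangles
F = spanning_forest(n, G)                              # candidate certificate

# For each added edge set H, G + H and F + H agree on connectivity.
trials = [[], [(2, 3)], [(0, 2)], [(0, 5), (1, 4)]]
agreement = [is_connected(n, G + H) == is_connected(n, F + H) for H in trials]
```

The check covers only a handful of edge sets H on one graph; it illustrates the definition and is not a proof.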


Facts: 

1. Using the union-find data structure [Ta75], it is possible to maintain connectivity 
information in O(α(m, n)) amortized time per update or query, where α here denotes 
the inverse Ackermann function. 

2. For other graph properties, such as 2-edge connectivity and biconnectivity, a data 
structure called the link/condense tree [WeTa92] maintains information in O(α(m, n)) 
amortized time per update or query. 

3. The link/condense tree supports the operation of condensing an entire path in the 
tree into a single vertex. This is important in the applications considered, because the 
insertion of an edge may cause several biconnected components or 2-edge connected 
components to be combined into one. 

4. Link/condense trees are based on dynamic trees [SlTa83]. 

Algorithm 1: Frederickson: to maintain a minimum spanning tree. 

Preprocessing: 

find a minimum spanning tree T of the initial graph G 
maintain a dynamic tree of T, using [SlTa83] 
for z := n^(2/3), group the vertices of T into clusters whose sizes are between z 
and 3z − 2 {There will be Θ(n^(1/3)) clusters.} 
for each pair of clusters i, j, maintain the set of edges E_ij as a min-heap 

Updates: Decreases in tree edge weights do not change anything, and increases in 
non-tree weights can be handled by a suitable update of the appropriate min- 
heap. Handle decreases in non-tree edge weights by using the dynamic tree 
appropriately. If tree edge e increases in weight, remove it, thus partitioning 
the clusters into two sets. Find an edge of minimum cost between clusters on 
opposite sides of the partition. 


5. The dynamic tree data structure maintains a set of rooted trees. It supports the 
operations of linking the root of one tree as the child of a vertex in another tree, cutting 
a tree at a specified edge, and everting a tree to make a specified vertex the root in 
worst-case O(logn) time per operation. It also supports other operations based on keys 
stored at vertices, such as finding the minimum key on the path from a given vertex to 
the root in O(logn) time. 

6. To maintain minimum spanning trees in an edge- weighted, connected graph subject 
to changes in edge weights, Frederickson [Fr85] uses clustering and topology trees in 
Algorithm 1. 

7. Ambivalence may permit faster updates, possibly at the cost of slower queries. 

8. Frederickson [Fr91] presents an ambivalent data structure for spanning forests that 
builds upon the ideas of multilevel partitions and (2-dimensional) topology trees that 
he developed in [Fr85]. 

9. Let V be a property for which sparse certificates can be found in time f(n, m). 
Suppose that there is a data structure that can be built in time g(n, m) and permits 
static testing of property V in time q(n, m). Then there is a fully dynamic data structure 
for testing whether a graph has property V; update time for this structure is 
f(n, O(n)) O(log(m/n)) + g(n, O(n)), and query time is q(n, O(n)). This “basic sparsification 
technique” is used to dynamize static algorithms. To use it, one need only be able 
to compute sparse certificates efficiently. 

10. The sparsification method of [EpEtal92] is to partition the input graph into sparse 
subgraphs (with O(n) edges) and summarize the relevant information about each subgraph 
in an even sparser “certificate”. Certificates are merged in pairs, producing larger 
subgraphs that are themselves sparsified using the certificate technique. The result is 
a balanced binary tree in which each vertex is a sparse certificate. Each insertion or 
deletion of an edge in the input graph causes changes in log(m/n) tree vertices. Because 
these changes occur in graphs with O(n) edges, instead of the m edges in the input 
graph, time bounds for updates are reduced in most natural problems. 

11. Let V be a property for which stable sparse certificates can be maintained in time 
f(n, m) per update. Suppose that there is a fully dynamic algorithm for V with update 
time g(n, m) and query time q(n, m). Then this algorithm can be sped up; specifically, 
V can be maintained fully dynamically in time f(n, O(n)) O(log(m/n)) + g(n, O(n)) 
per update, with query time q(n, O(n)). Because this “stable sparsification technique” 
is used to speed up existing dynamic algorithms, it often yields better results than the 
basic sparsification technique described above. However, to use it, one must be able to 
maintain stable sparse certificates efficiently; this is a more stringent requirement than 
what is needed to apply basic sparsification. 

12. Eppstein, Galil, and Italiano [EpGaIt93] improve the sparsification technique to get 
rid of the log(m/n) factor in these bounds. They achieve this improvement by partitioning 
the edge set of the original graph more carefully. 

13. Dynamic algorithms restricted to plane graphs have been considered by several 
authors. Eppstein et al. [EpEtal93] introduce a variant of sparsification that permits the 
design of efficient dynamic algorithms for planar graphs in an imbedding-independent 
way, as long as the updates to the graph preserve planarity. Because these graphs are 
already sparse, Eppstein et al. design a separator-based sparsification technique. 

14. The fact that separator sizes are sublinear (O(√n) for planar graphs) allows 
the possibility of maintaining sublinear certificates. Eppstein et al. [EpEtal93] use a 
separator-based decomposition tree as the sparsification tree and show how to compute 
it in linear time and maintain it dynamically. They use it to show the following: For a 
property V for which compressed certificates can be built in time T(n), a data structure 
for testing V built in time P(n), and queries answered in time Q(n), a fully dynamic 
algorithm for maintaining V under planarity-preserving insertions and deletions takes 
amortized time P(O(n^(1/2))) + T(O(n^(1/2))) per update and Q(O(n^(1/2))) per query. 

15. A dyadic property V, for which compressed certificates can be built in time T(n), 
a data structure for testing V built in time P(n), and queries answered in time Q(n), 
can be maintained with updates taking T(O(n^(1/2))) amortized time and queries taking 
P(O(n^(1/2))) + Q(O(n^(1/2))) + T(O(n^(1/2))) worst-case time. 

16. In dealing with plane (as opposed to planar) graphs and allowing only updates that 
can be performed in a planarity-preserving manner on the existing imbedding, simpler 
techniques that rely on planar duality can be used [EpEtal92]. 

17. When maintaining minimum spanning trees under updates that change only edge 
weights, the most difficult operation to handle is an increase in the weight of an MST 
edge. However, in the dual graph this can be viewed as a decrease in the weight of a 
non-MST edge. This idea and the handling of edge insertions and deletions are addressed by 
the data structures of [GuSt85] and the edge-ordered tree data structure of [EpEtal92]. 
These data structures help maintain the subdivision and its dual in the face of general 
updates and also help perform required access operations efficiently. Edge-ordered trees 
are an adaptation of the dynamic trees of [SlTa83]. 

18. Knowledge of the imbedding allows one to use topology trees in more efficient 
ways. Specifically, Rauch [Ra94] partitions the non-tree edges into equivalence classes 
called bundles. In the cyclical ordering of edges emanating from a cluster, bundles are 
carefully chosen, consecutive subsets of edges. 
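
Fact 1 of this list mentions the union-find data structure for maintaining connectivity. A minimal Python sketch with union by rank and path compression is given below. The class and method names are illustrative, and the structure handles edge insertions only, so it is a partially dynamic solution in the sense defined in this subsection; edge deletions require the heavier techniques described in the later facts.

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))   # each vertex starts as its own root
        self.rank = [0] * n            # upper bound on the height of each tree

    def find(self, x):
        """Return the root of x's set, compressing the path along the way."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Record an inserted edge {x, y}; return False if already connected."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.rank[rx] < self.rank[ry]:
            rx, ry = ry, rx
        self.parent[ry] = rx           # attach the shallower tree under the deeper
        if self.rank[rx] == self.rank[ry]:
            self.rank[rx] += 1
        return True

    def connected(self, x, y):
        """Query: are x and y in the same component?"""
        return self.find(x) == self.find(y)


uf = UnionFind(5)
uf.union(0, 1)                    # insert edge {0, 1}
uf.union(3, 4)                    # insert edge {3, 4}
before = uf.connected(1, 3)       # components {0, 1} and {3, 4} are separate
uf.union(1, 3)                    # insert edge {1, 3}
after = uf.connected(0, 4)        # now joined into one component
```

A component query reduces to comparing two roots, which is why queries and insertions share the same amortized cost.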


17.5.3 APPLICATIONS 
Examples: 

1. Bipartiteness [EpGaItNi92], [EpGaIt93]: A graph that is not bipartite contains an 
odd cycle. The graph formed by adding the shortest edge inducing an odd cycle (if any) 
to the minimum spanning forest is a stable certificate of (non-)bipartiteness. Using the 
clustering techniques of [Fr85] and the improved sparsification techniques of [EpGaIt93], 
this certificate can be maintained in O(n^(1/2)) time per update. The query time in this 
example is O(1); one bit is used to indicate whether the operation is maintaining a 
certificate of bipartiteness or of non-bipartiteness. 

2. Minimum spanning forests [EpGaItNi92], [EpGaIt93], [Fr85]: In this example, the 
goal is not to maintain a data structure that supports efficient testing of a property, 
but rather to maintain the minimum spanning forest itself as edges are added to and 
deleted from the input graph. It is shown in [EpGaItNi92] how to define a canonical 
minimum spanning forest that serves as the analogue of a stable sparse certificate. 
Frederickson [Fr85] uses the topological approach to obtain a fully dynamic algorithm 
that maintains minimum spanning forests in time O(m^{1/2}) per update. Applying the 
improved stable sparsification technique with f(n,m) = g(n,m) = O(m^{1/2}) yields a 
fully dynamic minimum spanning forest algorithm with update time O(n^{1/2}). For plane 
graphs, [EpEtal92] show that both updates and queries can be performed in O(log n) 
time per operation; in planar graphs, [EpEtal93] show that O(log^2 n) per deletion and 
O(log n) per insertion are sufficient. 
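
To make the update problem concrete, the following Python sketch handles a single edge insertion in the classical way: if the new edge joins two trees it is added, and otherwise it replaces the heaviest edge on the tree path between its endpoints when that edge is heavier. This is a naive baseline that takes time linear in n per insertion and does not handle deletions; it is not the algorithm of [Fr85] or of [EpGaItNi92], and the function name and edge format are assumptions.

```python
def msf_insert_edge(n, tree_edges, new_edge):
    """Return the minimum spanning forest after inserting new_edge.

    tree_edges is a set of (weight, u, v) tuples forming a minimum spanning
    forest on vertices 0..n-1; new_edge has the same (hypothetical) format.
    """
    w, u, v = new_edge
    if u == v:
        return set(tree_edges)  # a self-loop never belongs to a forest

    # Adjacency lists of the current forest, remembering the edge tuples.
    adj = {x: [] for x in range(n)}
    for e in tree_edges:
        _, a, b = e
        adj[a].append((b, e))
        adj[b].append((a, e))

    # Depth-first search from u, recording how each vertex was reached.
    reached_by = {u: None}
    stack = [u]
    while stack:
        x = stack.pop()
        for y, e in adj[x]:
            if y not in reached_by:
                reached_by[y] = (x, e)
                stack.append(y)

    if v not in reached_by:
        return set(tree_edges) | {new_edge}  # the edge joins two trees

    # Walk back from v to u to collect the edges of the tree path.
    path = []
    x = v
    while reached_by[x] is not None:
        x, e = reached_by[x]
        path.append(e)

    heaviest = max(path)  # tuples compare by weight first
    if heaviest[0] > w:
        return (set(tree_edges) - {heaviest}) | {new_edge}
    return set(tree_edges)
```

Deletions are the hard case, since a replacement edge must be found among the non-tree edges; that is where the clustering and sparsification techniques cited above are needed.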

3. Connectivity [EpGaItNi92], [EpGaIt93], [Fr85]: Simple enhancements to the min- 
imum spanning forest algorithms in [Fr85] yield fully dynamic algorithms for the con- 
nectivity problem in which the update times are the same as they are for minimum 
spanning forests, and the query times are O(1). Thus, as in the previous example, 
applying improved stable sparsification with f(n,m) = g(n,m) = O(m^{1/2}) yields a 
fully dynamic connectivity algorithm with update time O(n^{1/2}) and query time O(1). 
Similarly, the planar and plane graph algorithms for minimum spanning trees can be 
generalized to work for minimum spanning forests and adapted to maintain connected 
components. 
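
The division of labor in this example, with all the work done at update time so that a query is a constant-time lookup, can be seen in the following naive Python sketch. It relabels every connected component after each update, so an update costs O(n + m) rather than the bounds quoted above; the class and method names are assumptions, not part of any cited algorithm.

```python
class NaiveDynamicConnectivity:
    """Fully dynamic connectivity with O(n + m) updates and O(1) queries."""

    def __init__(self, n):
        self.n = n
        self.adj = [set() for _ in range(n)]
        self.label = list(range(n))  # label[v] identifies the component of v

    def _relabel(self):
        # Recompute component labels from scratch by depth-first search.
        label = [-1] * self.n
        for s in range(self.n):
            if label[s] == -1:
                label[s] = s
                stack = [s]
                while stack:
                    x = stack.pop()
                    for y in self.adj[x]:
                        if label[y] == -1:
                            label[y] = s
                            stack.append(y)
        self.label = label

    def insert(self, u, v):
        self.adj[u].add(v)
        self.adj[v].add(u)
        self._relabel()

    def delete(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        self._relabel()

    def connected(self, u, v):
        # Constant time: compare the stored component labels.
        return self.label[u] == self.label[v]
```

For instance, after inserting edges (0, 1) and (1, 2) on four vertices, `connected(0, 2)` holds and `connected(0, 3)` does not; deleting (1, 2) separates vertex 2 again.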

4. Biconnectivity [EpGaItNi92], [EpGaIt93], [Ra93], [HeRaSu94]: Cheriyan, Kao, and 
Thurimella [ChKaTh93] show that C_2 = C_1 ∪ B_2 is a sparse certificate for 
biconnectivity, where C_1 is a breadth-first spanning forest of the input graph G, and 
B_2 is a breadth-first spanning forest of G − C_1. Eppstein et al. [EpGaItNi92] show 
that C_2 is in fact a strong certificate of biconnectivity. These strong certificates can 
be found in time O(m), using classical breadth-first search algorithms. Applying 
improved sparsification with f(n,m) = g(n,m) = O(m) yields a fully dynamic algorithm 
for maintaining the biconnected components of a graph that has update time O(n). 

The approach to biconnectivity in [Ra94] is to partition the graph G into clusters 
and decompose a query that asks whether vertices u and v lie in the same biconnected 
component into a query in the cluster of u, a query in the cluster of v, and a query 
between clusters. The 2-dimensional topology tree is adapted in a novel way, and the 
ambivalent data structures previously defined for connectivity and 2-edge connectivity 
are extended to test biconnectivity between clusters. To test biconnectivity within a 
cluster C, the entire subgraph induced by C and a compressed certificate of G − C are 
maintained. Using all these ingredients, [Ra94] obtains amortized O(m^{1/2}) time per 
update and O(1) worst-case time per query. 

Using clever data structures based on topology trees, bundles, and the idea of 
recipes, first introduced in this context in [HeRaSu94], the problem of fully dynamic 
biconnectivity for plane graphs can be solved in O(log^2 n) time per update and O(log n) 
time per query. 
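
The breadth-first certificate of [ChKaTh93] mentioned at the start of this example, a breadth-first spanning forest C_1 of G together with a breadth-first spanning forest B_2 of G − C_1, can be built statically as in the following Python sketch. The sketch assumes a simple undirected graph given as a list of vertex pairs; the function names are hypothetical, and no dynamic maintenance is attempted.

```python
from collections import deque


def bfs_forest(n, edges):
    """Return the edges of a breadth-first spanning forest on vertices 0..n-1."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = [False] * n
    forest = []
    for s in range(n):
        if seen[s]:
            continue
        seen[s] = True
        queue = deque([s])
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if not seen[y]:
                    seen[y] = True
                    forest.append((min(x, y), max(x, y)))
                    queue.append(y)
    return forest


def biconnectivity_certificate(n, edges):
    """Return C_1 followed by B_2, using at most 2(n - 1) edges in total."""
    # Normalize each edge to (small, large) and drop self-loops and duplicates.
    edge_set = {(min(u, v), max(u, v)) for u, v in edges if u != v}
    c1 = bfs_forest(n, sorted(edge_set))
    b2 = bfs_forest(n, sorted(edge_set - set(c1)))
    return c1 + b2
```

On the complete graph with four vertices, which has six edges, the sketch keeps five of them: three in C_1 and two in B_2.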

5. 2-edge connectivity [EpGaItNi92], [Fr91], [Fr85]: Thurimella [Th89] and Nagamochi 
and Ibaraki [NaIb92] show that the following structure U_2 is a certificate for 2-edge 
connectivity: U_2 = U_1 ∪ F_2, where U_1 is a spanning forest of G, and F_2 is a spanning 
forest of G − U_1. Eppstein et al. [EpGaItNi92] show that U_2 is in fact a stable, 
sparse certificate. Frederickson’s minimum spanning forest algorithm [Fr85] can be 
adapted to maintain U_2 in time f(n,m) = O(m^{1/2}). Frederickson’s ambivalent data 
structure technique [Fr91] can be used to test 2-edge connectivity with update time 
g(n,m) = O(m^{1/2}) and query time q(n,m) = O(log n). Here a “query” is a pair of 
vertices, and the answer is “yes” if they are in the same 2-edge connected component 
and “no” otherwise. Applying improved stable sparsification yields a fully dynamic 
algorithm with update time O(n^{1/2}) and query time O(log n). 
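
As a small static illustration of the certificate U_2 = U_1 ∪ F_2 (a spanning forest U_1 of G plus a spanning forest F_2 of G − U_1), the Python sketch below builds both forests with union-find and checks by brute force that the certificate gives the same answer as G to the question "is the graph 2-edge connected?". The function names are assumptions, the check is deliberately naive, and nothing here reflects the data structures of [Fr85] or [Fr91].

```python
def spanning_forest(n, edges):
    """Return a spanning forest of the graph as a list of edges (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest


def two_edge_certificate(n, edges):
    """Return U_1 followed by F_2, where F_2 spans the edges left after U_1."""
    u1 = spanning_forest(n, edges)
    rest = list(edges)
    for e in u1:
        rest.remove(e)  # removes one copy only, so parallel edges survive
    f2 = spanning_forest(n, rest)
    return u1 + f2


def is_two_edge_connected(n, edges):
    """Brute force: connected, and still connected after deleting any one edge."""
    def connected(es):
        return len(spanning_forest(n, es)) == n - 1

    return connected(edges) and all(
        connected(edges[:i] + edges[i + 1:]) for i in range(len(edges))
    )
```

On the complete graph with four vertices the certificate drops one of the six edges and remains 2-edge connected; on a path, which has bridges, both the graph and its certificate fail the test.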

6. Planarity [EpEtal93], [Ra93]: Eppstein et al. [EpEtal93] use the separator-based 
sparsification technique described above to obtain a fully dynamic planarity-testing 
algorithm for general graphs that answers queries of the form “is the graph currently 
planar?” and “would the insertion of this edge preserve planarity?”. Their algorithm 
requires amortized running time O(n^{1/2}) per update or query. Italiano, La Poutre, and 
Rauch [ItLaRa93] use topology trees, bundles, and recipes to obtain a fully dynamic 
algorithm on plane graphs that tests whether the insertion of a particular edge would 
destroy the given imbedding. Their algorithm requires time O(log^2 n) for updates and 
queries. 


17.5.4 RECENT RESULTS AND OPEN QUESTIONS 


Examples: 

1. Alberts and Henzinger [AlHe95] investigate dynamic algorithms on random graphs 
with n vertices and m_0 edges on which a sequence of k arbitrary update operations is 
performed. They obtain expected update times of O(k log n + Σ_{i=1}^{k} n/√(m_i)) for 
minimum spanning forest, connectivity, and bipartiteness and 
O(k log n + n log n Σ_{i=1}^{k} 1/√(m_i)) for 2-edge connectivity, where m_i denotes the 
number of edges in the graph after the i-th update. The data structures required for 
these algorithms use linear space, and the preprocessing times match those of the best 
algorithms for finding a minimum spanning forest. 

2. Fredman and Rauch [FrRa94] investigate lower bounds in the cell probe model of 
computation and obtain good results for k-edge connectivity, k-vertex connectivity, 
and planarity-testing of imbedded planar graphs. Both average-case analysis and lower 
bounds are important topics for future research on dynamic graph algorithms. 

3. Klein et al. [KlEtal94] give a fully dynamic algorithm for the all-pairs shortest path 
problem on planar graphs. If the sum of the absolute values of the edge-lengths is D, 
then the time per operation is O(n^{9/7} log D) (worst case for queries, edge deletion, 
and length changes, and amortized for edge insertion); the space requirement is O(n). 
Several types of partially dynamic algorithms for shortest paths appear in [AuEtal90], 
[EvGa85], [FrMaNa94], and [Ro85]. 

Although this is one of the most important problems in dynamic graph algorithms, 
less is known about shortest paths than about many other problems, and it remains 
an important topic for future study. 

4. In a recent breakthrough, Henzinger and King [HeKi95] obtained fully dynamic, 
randomized algorithms for connectivity, 2-edge connectivity, bipartiteness, cycle 
equivalence, and constant-weight minimum spanning trees that have polylogarithmic expected 
time per operation. 